
Advances in Intelligent Systems and Computing 1161

Álvaro Rocha · Hojjat Adeli ·
Luís Paulo Reis ·
Sandra Costanzo · Irena Orovic ·
Fernando Moreira
Editors

Trends and
Innovations in
Information
Systems and
Technologies
Volume 3
Advances in Intelligent Systems and Computing

Volume 1161

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences,
Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing,
Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering,
University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University,
Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas
at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao
Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology,
University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute
of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro,
Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management,
Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering,
The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications
on theory, applications, and design methods of Intelligent Systems and Intelligent
Computing. Virtually all disciplines such as engineering, natural sciences, computer
and information science, ICT, economics, business, e-commerce, environment,
healthcare, life science are covered. The list of topics spans all the areas of modern
intelligent systems and computing such as: computational intelligence, soft comput-
ing including neural networks, fuzzy systems, evolutionary computing and the fusion
of these paradigms, social intelligence, ambient intelligence, computational neuro-
science, artificial life, virtual worlds and society, cognitive science and systems,
Perception and Vision, DNA and immune based systems, self-organizing and
adaptive systems, e-Learning and teaching, human-centered and human-centric
computing, recommender systems, intelligent control, robotics and mechatronics
including human-machine teaming, knowledge-based paradigms, learning para-
digms, machine ethics, intelligent data analysis, knowledge management, intelligent
agents, intelligent decision making and support, intelligent network security, trust
management, interactive entertainment, Web intelligence and multimedia.
The publications within “Advances in Intelligent Systems and Computing” are
primarily proceedings of important conferences, symposia and congresses. They
cover significant recent developments in the field, both of a foundational and
applicable character. An important characteristic feature of the series is the short
publication time and world-wide distribution. This permits a rapid and broad
dissemination of research results.
** Indexing: The books of this series are submitted to ISI Proceedings,
EI-Compendex, DBLP, SCOPUS, Google Scholar and SpringerLink **

More information about this series at http://www.springer.com/series/11156


Álvaro Rocha · Hojjat Adeli ·
Luís Paulo Reis · Sandra Costanzo ·
Irena Orovic · Fernando Moreira

Editors

Trends and Innovations


in Information Systems
and Technologies
Volume 3

Editors
Álvaro Rocha, Departamento de Engenharia Informática, Universidade de Coimbra, Coimbra, Portugal
Hojjat Adeli, College of Engineering, The Ohio State University, Columbus, OH, USA
Luís Paulo Reis, FEUP, Universidade do Porto, Porto, Portugal
Sandra Costanzo, DIMES, Università della Calabria, Arcavacata, Italy
Irena Orovic, Faculty of Electrical Engineering, University of Montenegro, Podgorica, Montenegro
Fernando Moreira, Universidade Portucalense, Porto, Portugal

Advances in Intelligent Systems and Computing
ISSN 2194-5357 ISSN 2194-5365 (electronic)
ISBN 978-3-030-45696-2 ISBN 978-3-030-45697-9 (eBook)
https://doi.org/10.1007/978-3-030-45697-9
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are exempt from
the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the
authors or the editors give a warranty, express or implied, with respect to the material contained herein or
for any errors or omissions that may have been made. The publisher remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

This book contains a selection of papers accepted for presentation and discussion at
the 2020 World Conference on Information Systems and Technologies
(WorldCIST’20). This conference had the support of the IEEE Systems, Man, and
Cybernetics Society (IEEE SMC), Iberian Association for Information Systems and
Technologies/Associação Ibérica de Sistemas e Tecnologias de Informação
(AISTI), Global Institute for IT Management (GIIM), University of Montenegro,
Mediterranean University and Faculty for Business in Tourism of Budva. It took
place in Budva, Montenegro, during 7–10 April 2020.
The World Conference on Information Systems and Technologies (WorldCIST)
is a global forum for researchers and practitioners to present and discuss recent
results and innovations, current trends, professional experiences and challenges of
modern information systems and technologies research, technological development
and applications. One of its main aims is to strengthen the drive towards a holistic
symbiosis between academia, society and industry. WorldCIST’20 built on the
successes of WorldCIST’13 held at Olhão, Algarve, Portugal; WorldCIST’14 held
at Funchal, Madeira, Portugal; WorldCIST’15 held at São Miguel, Azores,
Portugal; WorldCIST’16 held at Recife, Pernambuco, Brazil; WorldCIST’17 held
at Porto Santo, Madeira, Portugal; WorldCIST’18 held at Naples, Italy; and
WorldCIST’19 which took place at La Toja, Spain.
The program committee of WorldCIST’20 was composed of a multidisciplinary
group of almost 300 experts intimately concerned with information systems and
technologies. They had the responsibility of evaluating, in a ‘blind review’
process, the papers received for each of the main themes proposed for the
conference: (A) Information and Knowledge Management;
(B) Organizational Models and Information Systems; (C) Software and Systems
Modelling; (D) Software Systems, Architectures, Applications and Tools;
(E) Multimedia Systems and Applications; (F) Computer Networks, Mobility and
Pervasive Systems; (G) Intelligent and Decision Support Systems; (H) Big Data
Analytics and Applications; (I) Human–Computer Interaction; (J) Ethics,
Computers and Security; (K) Health Informatics; (L) Information Technologies in

Education; (M) Information Technologies in Radiocommunications; and
(N) Technologies for Biomedical Applications.
The conference also included workshop sessions taking place in parallel with the
conference ones. Workshop sessions covered themes such as (i) Innovative
Technologies Applied to Rural; (ii) Network Modelling, Learning and Analysis;
(iii) Intelligent Systems and Machines; (iv) Healthcare Information Systems
Interoperability, Security and Efficiency; (v) Applied Statistics and Data Analysis
using Computer Science; (vi) Cybersecurity for Smart Cities Development;
(vii) Education through ICT; (viii) Unlocking the Artificial Intelligence Interplay
with Business Innovation; and (ix) Pervasive Information Systems.
WorldCIST’20 received about 400 contributions from 57 countries around the
world. The papers accepted for presentation and discussion at the conference are
published by Springer (this book) in three volumes and will be submitted for
indexing by ISI, EI-Compendex, SCOPUS, DBLP and/or Google Scholar, among
others. Extended versions of selected best papers will be published in special or
regular issues of relevant journals, mainly SCI/SSCI and Scopus/EI-Compendex
indexed journals.
We acknowledge all of those who contributed to the staging of WorldCIST’20
(authors, committees, workshop organizers and sponsors). We deeply appreciate
their involvement and support that were crucial for the success of WorldCIST’20.

April 2020

Álvaro Rocha
Hojjat Adeli
Luís Paulo Reis
Sandra Costanzo
Irena Orovic
Fernando Moreira
Organization

Conference
General Chair
Álvaro Rocha University of Coimbra, Portugal

Co-chairs
Hojjat Adeli The Ohio State University, USA
Luis Paulo Reis University of Porto, Portugal
Sandra Costanzo University of Calabria, Italy

Local Organizing Committee


Irena Orovic (Chair) University of Montenegro, Montenegro
Milos Dakovic University of Montenegro, Montenegro
Andjela Draganic University of Montenegro, Montenegro
Milos Brajovic University of Montenegro, Montenegro
Snezana Scepanovic Mediterranean University, Montenegro
Rade Ratkovic Faculty of Business and Tourism, Montenegro

Advisory Committee
Ana Maria Correia (Chair) University of Sheffield, UK
Benjamin Lev Drexel University, USA
Chatura Ranaweera Wilfrid Laurier University, Canada
Chris Kimble KEDGE Business School and MRM, UM2,
Montpellier, France
Erik Bohlin Chalmers University of Technology, Sweden
Eva Onaindia Polytechnical University of Valencia, Spain
Gintautas Dzemyda Vilnius University, Lithuania

Janusz Kacprzyk Polish Academy of Sciences, Poland
Jason Whalley Northumbria University, UK
João Tavares University of Porto, Portugal
Jon Hall The Open University, UK
Justin Zhang University of North Florida, USA
Karl Stroetmann Empirica Communication and Technology
Research, Germany
Kathleen Carley Carnegie Mellon University, USA
Keng Siau Missouri University of Science and Technology,
USA
Manlio Del Giudice University of Rome Link Campus, Italy
Michael Koenig Long Island University, USA
Miguel-Angel Sicilia University of Alcalá, Spain
Reza Langari Texas A&M University, USA
Vedat Verter McGill University, Canada
Vishanth Weerakkody Bradford University, UK
Wim Van Grembergen University of Antwerp, Belgium

Program Committee
Abdul Rauf RISE SICS, Sweden
Adnan Mahmood Waterford Institute of Technology, Ireland
Adriana Peña Pérez Negrón Universidad de Guadalajara, Mexico
Adriani Besimi South East European University, Macedonia
Agostinho Sousa Pinto Polytechnic of Porto, Portugal
Ahmed El Oualkadi Abdelmalek Essaadi University, Morocco
Ahmed Rafea American University in Cairo, Egypt
Alberto Freitas FMUP, University of Porto, Portugal
Aleksandra Labus University of Belgrade, Serbia
Alexandru Vulpe University Politehnica of Bucharest, Romania
Ali Idri ENSIAS, University Mohammed V, Morocco
Amélia Badica University of Craiova, Romania
Amélia Cristina Ferreira Silva Polytechnic of Porto, Portugal
Almir Souza Silva Neto IFMA, Brazil
Amit Shelef Sapir Academic College, Israel
Ana Isabel Martins University of Aveiro, Portugal
Ana Luis University of Coimbra, Portugal
Anabela Tereso University of Minho, Portugal
Anacleto Correia CINAV, Portugal
Anca Alexandra Purcarea University Politehnica of Bucharest, Romania
Andjela Draganic University of Montenegro, Montenegro
Aneta Polewko-Klim University of Białystok, Institute of Informatics,
Poland
Aneta Poniszewska-Maranda Lodz University of Technology, Poland
Angeles Quezada Instituto Tecnologico de Tijuana, Mexico
Anis Tissaoui University of Jendouba, Tunisia
Ankur Singh Bist KIET, India
Ann Svensson University West, Sweden
Antoni Oliver University of the Balearic Islands, Spain
Antonio Jiménez-Martín Universidad Politécnica de Madrid, Spain
Antonio Pereira Polytechnic of Leiria, Portugal
Armando Toda University of São Paulo, Brazil
Arslan Enikeev Kazan Federal University, Russia
Benedita Malheiro Polytechnic of Porto, ISEP, Portugal
Boris Shishkov ULSIT/IMI-BAS/IICREST, Bulgaria
Borja Bordel Universidad Politécnica de Madrid, Spain
Branko Perisic Faculty of Technical Sciences, Serbia
Bruno Veloso INESC TEC, Portugal
Carla Pinto Polytechnic of Porto, ISEP, Portugal
Carla Santos Pereira Universidade Portucalense, Portugal
Catarina Reis Polytechnic of Leiria, Portugal
Cengiz Acarturk Middle East Technical University, Turkey
Cesar Collazos Universidad del Cauca, Colombia
Christophe Feltus LIST, Luxembourg
Christophe Soares University Fernando Pessoa, Portugal
Christos Bouras University of Patras, Greece
Christos Chrysoulas London South Bank University, UK
Christos Troussas University of Piraeus, Greece
Ciro Martins University of Aveiro, Portugal
Claudio Sapateiro Polytechnic of Setúbal, Portugal
Costin Badica University of Craiova, Romania
Cristian García Bauza PLADEMA-UNICEN-CONICET, Argentina
Cristian Mateos ISISTAN-CONICET, UNICEN, Argentina
Daria Bylieva Peter the Great St. Petersburg Polytechnic
University, Russia
Dante Carrizo Universidad de Atacama, Chile
Dayana Spagnuelo Vrije Universiteit Amsterdam, Netherlands
Dušan Barać University of Belgrade, Serbia
Edita Butrime Lithuanian University of Health Sciences,
Lithuania
Edna Dias Canedo University of Brasilia, Brazil
Eduardo Santos Pontifical Catholic University of Paraná, Brazil
Egils Ginters Riga Technical University, Latvia
Ekaterina Isaeva Perm State University, Russia
Elena Mikhailova ITMO University, Russia
Eliana Leite University of Minho, Portugal
Erik Fernando Mendez Autonomous Regional University of the Andes,
Garcea Ecuador
Eriks Sneiders Stockholm University, Sweden
Esteban Castellanos ESPE, Ecuador
Faisal Musa Abbas Abubakar Tafawa Balewa University,
Bauchi, Nigeria
Fatima Azzahra Amazal Ibn Zohr University, Morocco
Fernando Almeida INESC TEC and University of Porto, Portugal
Fernando Bobillo University of Zaragoza, Spain
Fernando Molina-Granja National University of Chimborazo, Ecuador
Fernando Moreira Portucalense University, Portugal
Fernando Ribeiro Polytechnic Castelo Branco, Portugal
Filipe Caldeira Polytechnic of Viseu, Portugal
Filipe Portela University of Minho, Portugal
Filipe Sá Polytechnic of Viseu, Portugal
Filippo Neri University of Naples, Italy
Firat Bestepe Republic of Turkey Ministry of Development,
Turkey
Francesco Bianconi Università degli Studi di Perugia, Italy
Francisco García-Peñalvo University of Salamanca, Spain
Francisco Valverde Universidad Central del Ecuador, Ecuador
Galim Vakhitov Kazan Federal University, Russia
Gayo Diallo University of Bordeaux, France
George Suciu BEIA Consult International, Romania
Gheorghe Sebestyen Technical University of Cluj-Napoca, Romania
Ghani Albaali Princess Sumaya University for Technology,
Jordan
Gian Piero Zarri University Paris-Sorbonne, France
Giuseppe Di Massa University of Calabria, Italy
Gonçalo Paiva Dias University of Aveiro, Portugal
Goreti Marreiros ISEP/GECAD, Portugal
Graciela Lara López University of Guadalajara, Mexico
Habiba Drias University of Science and Technology Houari
Boumediene, Algeria
Hafed Zarzour University of Souk Ahras, Algeria
Hamid Alasadi Basra University, Iraq
Hatem Ben Sta University of Tunis at El Manar, Tunisia
Hector Fernando Gomez Alvarado Universidad Tecnica de Ambato, Ecuador
Hélder Gomes University of Aveiro, Portugal
Helia Guerra University of the Azores, Portugal
Henrique da Mota Silveira University of Campinas (UNICAMP), Brazil
Henrique S. Mamede University Aberta, Portugal
Hing Kai Chan University of Nottingham Ningbo China, China
Hugo Paredes INESC TEC and University of Trás-os-Montes
e Alto Douro, Portugal
Ibtissam Abnane Mohamed V University in Rabat, Morocco
Igor Aguilar Alonso Universidad Nacional Tecnológica de Lima Sur,
Peru
Imen Ben Said Université de Sfax, Tunisia
Inês Domingues University of Coimbra, Portugal
Isabel Lopes Polytechnic of Bragança, Portugal
Isabel Pedrosa Coimbra Business School ISCAC, Portugal
Isaías Martins University of Leon, Spain
Issam Moghrabi Gulf University for Science and Technology,
Kuwait
Ivan Dunđer University of Zagreb, Croatia
Ivan Lukovic University of Novi Sad, Serbia
Jaime Diaz University of La Frontera, Chile
Jan Kubicek Technical University of Ostrava, Czech Republic
Jean Robert Kala Kamdjoug Catholic University of Central Africa, Cameroon
Jesús Gallardo Casero University of Zaragoza, Spain
Jezreel Mejia CIMAT, Unidad Zacatecas, Mexico
Jikai Li The College of New Jersey, USA
Jinzhi Lu KTH Royal Institute of Technology, Sweden
Joao Carlos Silva IPCA, Portugal
João Manuel R. S. Tavares University of Porto, FEUP, Portugal
João Paulo Pereira Polytechnic of Bragança, Portugal
João Reis University of Aveiro, Portugal
João Reis University of Lisbon, Portugal
João Rodrigues University of the Algarve, Portugal
João Vidal Carvalho Polytechnic of Coimbra, Portugal
Joaquin Nicolas Ros University of Murcia, Spain
Jorge Barbosa Polytechnic of Coimbra, Portugal
Jorge Buele Technical University of Ambato, Ecuador
Jorge Esparteiro Garcia Polytechnic Institute of Viana do Castelo,
Portugal
Jorge Gomes University of Lisbon, Portugal
Jorge Oliveira e Sá University of Minho, Portugal
José Álvarez-García University of Extremadura, Spain
José Braga de Vasconcelos Universidade New Atlântica, Portugal
Jose Luis Herrero Agustin University of Extremadura, Spain
José Luís Reis ISMAI, Portugal
Jose Luis Sierra Complutense University of Madrid, Spain
Jose M. Parente de Oliveira Aeronautics Institute of Technology, Brazil
José Machado University of Minho, Portugal
José Paulo Lousado Polytechnic of Viseu, Portugal
Jose Torres University Fernando Pessoa, Portugal
José-Luís Pereira Universidade do Minho, Portugal
Juan M. Santos University of Vigo, Spain
Juan Manuel Carrillo de Gea University of Murcia, Spain
Juan Pablo Damato UNCPBA-CONICET, Argentina
Juncal Gutiérrez-Artacho University of Granada, Spain
Kalinka Kaloyanova Sofia University, Bulgaria
Kamel Rouibah Kuwait University, Kuwait
Khalid Benali LORIA University of Lorraine, France
Korhan Gunel Adnan Menderes University, Turkey
Krzysztof Wolk Polish-Japanese Academy of Information
Technology, Poland
Kuan Yew Wong Universiti Teknologi Malaysia (UTM), Malaysia
Laila Cheikhi University Mohammed V, Rabat, Morocco
Laura Varela-Candamio Universidade da Coruña, Spain
Laurentiu Boicescu E.T.T.I. U.P.B., Romania
Leonardo Botega University Centre Eurípides of Marília
(UNIVEM), Brazil
Leonid Leonidovich Khoroshko Moscow Aviation Institute (National Research
University), Russia
Lia-Anca Hangan Technical University of Cluj-Napoca, Romania
Lila Rao-Graham University of the West Indies, Jamaica
Łukasz Tomczyk Pedagogical University of Cracow, Poland
Luis Alvarez Sabucedo University of Vigo, Spain
Luis Cavique University Aberta, Portugal
Luis Gouveia University Fernando Pessoa, Portugal
Luis Mendes Gomes University of the Azores, Portugal
Luis Silva Rodrigues Polytechnic of Porto, Portugal
Luiz Rafael Andrade Tiradentes University, Brazil
Luz Sussy Bayona Oré Universidad Nacional Mayor de San Marcos,
Peru
Maksim Goman JKU, Austria
Manal el Bajta ENSIAS, Morocco
Manuel Antonio Fernández-Villacañas Marín Technical University of Madrid,
Spain
Manuel Silva Polytechnic of Porto and INESC TEC, Portugal
Manuel Tupia Pontifical Catholic University of Peru, Peru
Manuel Au-Yong-Oliveira University of Aveiro, Portugal
Marciele Bernardes University of Minho, Brazil
Marco Bernardo Polytechnic of Viseu, Portugal
Marco Ronchetti Università di Trento, Italy
Mareca María Pilar Universidad Politécnica de Madrid, Spain
Marek Kvet Zilinska Univerzita v Ziline, Slovakia
María de la Cruz del Río-Rama University of Vigo, Spain
Maria João Ferreira Universidade Portucalense, Portugal
Maria João Varanda Pereira Polytechnic of Bragança, Portugal
Maria José Angélico Polytechnic of Porto, Portugal
Maria José Sousa University of Coimbra, Portugal
María Teresa García-Álvarez University of A Coruna, Spain
Mariam Bachiri ENSIAS, Morocco
Marijana Despotovic-Zrakic Faculty of Organizational Sciences, Serbia
Mário Antunes Polytechnic of Leiria and CRACS INESC TEC,
Portugal
Marisa Maximiano Polytechnic Institute of Leiria, Portugal
Marisol Garcia-Valls Polytechnic University of Valencia, Spain
Maristela Holanda University of Brasilia, Brazil
Marius Vochin E.T.T.I. U.P.B., Romania
Marlene Goncalves da Silva Universidad Simón Bolívar, Venezuela
Maroi Agrebi University of Polytechnique Hauts-de-France,
France
Martin Henkel Stockholm University, Sweden
Martín López Nores University of Vigo, Spain
Martin Zelm INTEROP-VLab, Belgium
Mawloud Mosbah University 20 Août 1955 of Skikda, Algeria
Michal Adamczak Poznan School of Logistics, Poland
Michal Kvet University of Zilina, Slovakia
Miguel António Sovierzoski Federal University of Technology - Paraná,
Brazil
Mihai Lungu University of Craiova, Romania
Mircea Georgescu Al. I. Cuza University of Iasi, Romania
Mirna Muñoz Centro de Investigación en Matemáticas A.C.,
Mexico
Mohamed Hosni ENSIAS, Morocco
Monica Leba University of Petrosani, Romania
Mu-Song Chen Da-Yeh University, Taiwan
Natalia Grafeeva Saint Petersburg University, Russia
Natalia Miloslavskaya National Research Nuclear University MEPhI,
Russia
Naveed Ahmed University of Sharjah, United Arab Emirates
Neeraj Gupta KIET Group of Institutions Ghaziabad, India
Nelson Rocha University of Aveiro, Portugal
Nikolai Prokopyev Kazan Federal University, Russia
Niranjan S. K. JSS Science and Technology University, India
Noemi Emanuela Cazzaniga Politecnico di Milano, Italy
Noureddine Kerzazi Polytechnique Montréal, Canada
Nuno Melão Polytechnic of Viseu, Portugal
Nuno Octávio Fernandes Polytechnic of Castelo Branco, Portugal
Olimpiu Stoicuta University of Petrosani, Romania
Patricia Zachman Universidad Nacional del Chaco Austral,
Argentina
Patrick C.-H. Soh Multimedia University, Malaysia
Paula Alexandra Rego Polytechnic of Viana do Castelo and LIACC,
Portugal
Paulo Maio Polytechnic of Porto, ISEP, Portugal
Paulo Novais University of Minho, Portugal
Paulvanna Nayaki Marimuthu Kuwait University, Kuwait
Paweł Karczmarek The John Paul II Catholic University of Lublin,
Poland
Pedro Rangel Henriques University of Minho, Portugal
Pedro Sobral University Fernando Pessoa, Portugal
Pedro Sousa University of Minho, Portugal
Philipp Brune Neu-Ulm University of Applied Sciences,
Germany
Piotr Kulczycki Systems Research Institute, Polish Academy
of Sciences, Poland
Prabhat Mahanti University of New Brunswick, Canada
Rabia Azzi Bordeaux University, France
Radu-Emil Precup Politehnica University of Timisoara, Romania
Rafael Caldeirinha Polytechnic of Leiria, Portugal
Rafael M. Luque Baena University of Malaga, Spain
Rahim Rahmani University Stockholm, Sweden
Raian Ali Hamad Bin Khalifa University, Qatar
Ramayah T. Universiti Sains Malaysia, Malaysia
Ramiro Gonçalves University of Trás-os-Montes e Alto Douro
& INESC TEC, Portugal
Ramon Alcarria Universidad Politécnica de Madrid, Spain
Ramon Fabregat Gesa University of Girona, Spain
Renata Maria Maracho Federal University of Minas Gerais, Brazil
Reyes Juárez Ramírez Universidad Autonoma de Baja California,
Mexico
Rui Jose University of Minho, Portugal
Rui Pitarma Polytechnic Institute of Guarda, Portugal
Rui S. Moreira UFP & INESC TEC & LIACC, Portugal
Rustam Burnashev Kazan Federal University, Russia
Saeed Salah Al-Quds University, Palestine
Said Achchab Mohammed V University in Rabat, Morocco
Sajid Anwar Institute of Management Sciences Peshawar,
Pakistan
Sami Habib Kuwait University, Kuwait
Samuel Sepulveda University of La Frontera, Chile
Sanaz Kavianpour University of Technology, Malaysia
Sandra Patricia Cano Mazuera University of San Buenaventura Cali, Colombia
Savo Tomovic University of Montenegro, Montenegro
Sassi Sassi FSJEGJ, Tunisia
Seppo Sirkemaa University of Turku, Finland
Sergio Albiol-Pérez University of Zaragoza, Spain
Shahed Mohammadi Ayandegan University, Iran
Shahnawaz Talpur Mehran University of Engineering & Technology
Jamshoro, Pakistan
Silviu Vert Politehnica University of Timisoara, Romania
Simona Mirela Riurean University of Petrosani, Romania
Slawomir Zolkiewski Silesian University of Technology, Poland
Solange N. Alves-Souza University of São Paulo, Brazil
Solange Rito Lima University of Minho, Portugal
Sonia Sobral Portucalense University, Portugal
Sorin Zoican Polytechnic University of Bucharest, Romania
Souraya Hamida Batna 2 University, Algeria
Sümeyya Ilkin Kocaeli University, Turkey
Syed Nasirin Universiti Malaysia Sabah, Malaysia
Taoufik Rachad University Mohamed V, Morocco
Tatiana Antipova Institute of Certified Specialists, Russia
Teresa Guarda Universidad Estatal Península de Santa Elena,
Ecuador
Tero Kokkonen JAMK University of Applied Sciences, Finland
The Thanh Van HCMC University of Food Industry, Vietnam
Thomas Weber EPFL, Switzerland
Timothy Asiedu TIM Technology Services Ltd., Ghana
Tom Sander New College of Humanities, Germany
Tomaž Klobučar Jozef Stefan Institute, Slovenia
Toshihiko Kato University of Electro-Communications, Japan
Tzung-Pei Hong National University of Kaohsiung, Taiwan
Valentina Colla Scuola Superiore Sant’Anna, Italy
Veronica Segarra Faggioni Private Technical University of Loja, Ecuador
Victor Alves University of Minho, Portugal
Victor Georgiev Kazan Federal University, Russia
Victor Kaptelinin Umeå University, Sweden
Vincenza Carchiolo University of Catania, Italy
Vitalyi Igorevich Talanin Zaporozhye Institute of Economics
and Information Technologies, Ukraine
Wafa Mefteh Tunisia
Wolf Zimmermann Martin Luther University Halle-Wittenberg,
Germany
Yadira Quiñonez Autonomous University of Sinaloa, Mexico
Yair Wiseman Bar-Ilan University, Israel
Yuhua Li Cardiff University, UK
Yuwei Lin University of Roehampton, UK
Yves Rybarczyk Dalarna University, Sweden
Zorica Bogdanovic University of Belgrade, Serbia
Contents

Health Informatics
A Product and Service Concept Proposal to Improve the Monitoring
of Citizens’ Health in Society at Large . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Luís Fonseca, João Barroso, Miguel Araújo, Rui Frazão,
and Manuel Au-Yong-Oliveira
Artificial Neural Networks Interpretation Using LIME for Breast
Cancer Diagnosis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Hajar Hakkoum, Ali Idri, and Ibtissam Abnane
Energy Efficiency and Usability of Web-Based Personal
Health Records . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
José Alberto García-Berná, Sofia Ouhbi, José Luis Fernández-Alemán,
Juan Manuel Carrillo-de-Gea, and Joaquín Nicolás
A Complete Prenatal Solution for a Reproductive Health
Unit in Morocco . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Mariam Bachiri, Ali Idri, Taoufik Rachad, Hassan Alami,
and Leanne M. Redman
Machine Learning and Image Processing for Breast Cancer:
A Systematic Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Hasnae Zerouaoui, Ali Idri, and Khalid El Asnaoui
A Definition of a Coaching Plan to Guide Patients with Chronic
Obstructive Respiratory Diseases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Diogo Martinho, Ana Vieira, João Carneiro, Constantino Martins,
Ana Almeida, and Goreti Marreiros
Reviewing Data Analytics Techniques in Breast Cancer Treatment . . . . 65
Mahmoud Ezzat and Ali Idri

Enabling Smart Homes Through Health Informatics
and Internet of Things for Enhanced Living Environments . . . . . . . . . . 76
Gonçalo Marques and Rui Pitarma
MyContraception: An Evidence-Based Contraception mPHR
for Better Contraceptive Fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Manal Kharbouch, Ali Idri, Taoufiq Rachad, Hassan Alami,
Leanne Redman, and Youssef Stelate
Predictors of Acceptance and Rejection of Online Peer Support
Groups as a Digital Wellbeing Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
John McAlaney, Manal Aldhayan, Mohamed Basel Almourad,
Sainabou Cham, and Raian Ali
Assessing Daily Activities Using a PPG Sensor Embedded
in a Wristband-Type Activity Tracker . . . . . . . . . . . . . . . . . . . . . . . . . . 108
Alexandra Oliveira, Joyce Aguiar, Eliana Silva, Brígida Mónica Faria,
Helena Gonçalves, Luís Teófilo, Joaquim Gonçalves, Victor Carvalho,
Henrique Lopes Cardoso, and Luís Paulo Reis
Simulation of a Robotic Arm Controlled by an LCD Touch Screen
to Improve the Movements of Physically Disabled People . . . . . . . . . . . 120
Yadira Quiñonez, Oscar Zatarain, Carmen Lizarraga, Juan Peraza,
Rogelio Estrada, and Jezreel Mejía

Information Technologies in Education


Performance Indicator Based on Learning Routes: Second Round . . . . 137
Franklin Chamba, Susana Arias, Gustavo Alvarez, and Héctor Gómez
Evaluating the Acceptance of Blended-Learning Tools:
A Case Study Using SlideWiki Presentation Rooms . . . . . . . . . . . . . . . . 142
Anne Martin, Bianca Bergande, and Roy Meissner
Adaptivity: A Continual Adaptive Online Knowledge
Assessment System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
Miran Zlatović and Igor Balaban
The First Programming Language and Freshman Year
in Computer Science: Characterization and Tips for Better
Decision Making . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
Sónia Rolland Sobral
Design of a Network Learning System for the Usage
of Surgical Instruments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
Ting-Kai Hwang, Bih-Huang Jin, and Su-Chiu Wang
CS1 and CS2 Curriculum Recommendations: Learning
from the Past to Try not to Rediscover the Wheel Again . . . . . . . . . . . . 182
Sónia Rolland Sobral
On the Role of Python in Programming-Related Courses
for Computer Science and Engineering Academic Education . . . . . . . . . 192
Costin Bădică, Amelia Bădică, Mirjana Ivanović, Ionuţ Dorinel Murareţu,
Daniela Popescu, and Cristinel Ungureanu
Validating the Shared Understanding Construction in Computer
Supported Collaborative Work in a Problem-Solving Activity . . . . . . . . 203
Vanessa Agredo-Delgado, Pablo H. Ruiz, Alicia Mon, Cesar A. Collazos,
Fernando Moreira, and Habib M. Fardoun
Improving Synchrony in Small Group Asynchronous
Online Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
Samuli Laato and Mari Murtonen
Academic Dishonesty Prevention in E-learning University System . . . . . 225
Daria Bylieva, Victoria Lobatyuk, Sergei Tolpygin, and Anna Rubtsova
Curriculum for Digital Culture at ITMO University . . . . . . . . . . . . . . . 235
Elena Mikhailova, Anton Boitsev, Olga Egorova, Natalia Grafeeva,
Aleksei Romanov, and Dmitriy Volchek
ICT Impact in Orientation and University Tutoring According
to Students Opinion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
Antonio Pantoja Vallejo, Beatriz Berrios Aguayo,
and María Jesús Yolanda Colmenero Ruiz
Blockchain Security and Privacy in Education:
A Systematic Mapping Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
Attari Nabil, Khalid Nafil, and Fouad Mounir
The Development of Pre-service Teacher’s Reflection Skills
Through Video-Based Classroom Observation . . . . . . . . . . . . . . . . . . . . 263
Ana R. Luís
Formative Assessment and Digital Tools in a School Context . . . . . . . . 271
Sandra Paiva, Luís Paulo Reis, and Lia Raquel

Information Technologies in Radiocommunications


Compact Slotted Planar Inverted-F Antenna: Design Principle
and Preliminary Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
Sandra Costanzo and Adil Masoud Qureshi

Technologies for Biomedical Applications


Statistical Analysis to Control Foot Temperature
for Diabetic People . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
José Torreblanca González, Alfonso Martínez Nova, A. H. Encinas,
Jesús Martín-Vaquero, and A. Queiruga-Dios
Sensitive Mannequin for Practicing the Locomotor Apparatus
Recovery Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
Cosmin Strilețchi and Ionuț Dan Cădar

Pervasive Information Systems


Data Intelligence Using PDME for Predicting Cardiovascular
Predictive Failures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
Francisco Freitas, Rui Peixoto, Carlos Filipe Portela, and Manuel Santos
Design of a Microservices Chaining Gamification Framework . . . . . . . . 327
Ricardo Queirós
PWA and Pervasive Information System – A New Era . . . . . . . . . . . . . 334
Gisela Fernandes, Filipe Portela, and Manuel Filipe Santos

Inclusive Education through ICT


Young People Participation in the Digital Society:
A Case Study in Brazil . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
Everton Knihs and Alicia García-Holgado
Blockchain Technology to Support Smart Learning and Inclusion:
Pre-service Teachers and Software Developers Viewpoints . . . . . . . . . . 357
Solomon Sunday Oyelere, Umar Bin Qushem, Vladimir Costas Jauregui,
Özgür Yaşar Akyar, Łukasz Tomczyk, Gloria Sanchez, Darwin Munoz,
and Regina Motz
Digital Storytelling in Teacher Education for Inclusion . . . . . . . . . . . . . 367
Özgür Yaşar Akyar, Gıyasettin Demirhan, Solomon Sunday Oyelere,
Marcelo Flores, and Vladimir Costas Jauregui
In Search of Active Life Through Digital Storytelling: Inclusion
in Theory and Practice for the Physical Education Teachers . . . . . . . . . 377
Burcu Şimşek and Özgür Yaşar Akyar
Accessibility Recommendations for Open Educational Resources
for People with Learning Disabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
Valéria Farinazzo Martins, Cibelle Amato, Łukasz Tomczyk,
Solomon Sunday Oyelere, Maria Amelia Eliseo, and Ismar Frango Silveira
Digital Storytelling and Blockchain as Pedagogy and Technology
to Support the Development of an Inclusive Smart
Learning Ecosystem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397
Solomon Sunday Oyelere, Ismar Frango Silveira,
Valeria Farinazzo Martins, Maria Amelia Eliseo, Özgür Yaşar Akyar,
Vladimir Costas Jauregui, Bernardo Caussin, Regina Motz,
Jarkko Suhonen, and Łukasz Tomczyk
Aggregation Bias: A Proposal to Raise Awareness Regarding
Inclusion in Visual Analytics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
Andrea Vázquez-Ingelmo, Francisco J. García-Peñalvo,
and Roberto Therón
A Concrete Action Towards Inclusive Education:
An Implementation of Marrakesh Treaty . . . . . . . . . . . . . . . . . . . . . . . . 418
Virginia Rodés and Regina Motz

Intelligent Systems and Machines


Cloud Computing Customer Communication Center . . . . . . . . . . . . . . . 429
George Suciu, Romulus Chevereșan, Svetlana Segărceanu, Ioana Petre,
Andrei Scheianu, and Cristiana Istrate

International Workshop on Healthcare Information Systems
Interoperability, Security and Efficiency
A Study on CNN Architectures for Chest X-Rays Multiclass
Computer-Aided Diagnosis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
Ana Ramos and Victor Alves
A Thermodynamic Assessment of the Cyber Security Risk
in Healthcare Facilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
Filipe Fernandes, Victor Alves, Joana Machado, Filipe Miranda,
Dinis Vicente, Jorge Ribeiro, Henrique Vicente, and José Neves
How to Assess the Acceptance of an Electronic
Health Record System? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 466
Catarina Fernandes, Filipe Portela, Manuel Filipe Santos, José Machado,
and António Abelha
An Exploratory Study of a NoSQL Database for a Clinical
Data Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476
Francini Hak, Tiago Guimarães, António Abelha, and Manuel Santos
Clinical Decision Support Using Open Data . . . . . . . . . . . . . . . . . . . . . . 484
Francini Hak, Tiago Guimarães, António Abelha, and Manuel Santos
Spatial Normalization of MRI Brain Studies Using a U-Net
Based Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
Tiago Jesus, Ricardo Magalhães, and Victor Alves
Business Analytics for Social Healthcare Institution . . . . . . . . . . . . . . . . 503
Miguel Quintal, Tiago Guimarães, Antonio Abelha,
and Manuel Filipe Santos
Step Towards Monitoring Intelligent Agents in Healthcare
Information Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 510
Regina Sousa, Diana Ferreira, António Abelha, and José Machado
Network Modeling, Learning and Analysis


A Comparative Study of Representation Learning Techniques
for Dynamic Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 523
Carlos Ortega Vázquez, Sandra Mitrović, Jochen De Weerdt,
and Seppe vanden Broucke
Metadata Action Network Model for Cloud Based
Development Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 531
Mehmet N. Aydin, Ziya N. Perdahci, I. Safak,
and J. (Jos) van Hillegersberg
Clustering Foursquare Mobility Networks to Explore Urban Spaces . . . 544
Olivera Novović, Nastasija Grujić, Sanja Brdar, Miro Govedarica,
and Vladimir Crnojević

Innovative Technologies Applied to Rural Regions


The Influence of Digital Marketing Tools Perceived Usefulness
in a Rural Region Destination Image . . . . . . . . . . . . . . . . . . . . . . . . . . . 557
Filipa Jorge, Mário Sérgio Teixeira, and Ramiro Gonçalves
Ñawi Project: Visual Health for Improvement of Education
in High Andean Educational Communities in Perú . . . . . . . . . . . . . . . . 570
Xavi Canaleta, Eva Villegas, David Fonseca, Rafel Zaragoza,
Guillem Villa, David Badia, and Emiliano Labrador
Building Smart Rural Regions: Challenges and Opportunities . . . . . . . . 579
Carlos R. Cunha, João Pedro Gomes, Joana Fernandes,
and Elisabete Paulo Morais
The Power of Digitalization: The Netflix Story . . . . . . . . . . . . . . . . . . . . 590
Manuel Au-Yong-Oliveira, Miguel Marinheiro, and João A. Costa Tavares
An Online Sales System to Be Managed by People
with Mental Illness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 600
Alicia García-Holgado, Samuel Marcos-Pablos,
and Francisco J. García-Peñalvo

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613


Health Informatics

A Product and Service Concept Proposal to Improve
the Monitoring of Citizens’ Health in Society at Large

Luís Fonseca¹, João Barroso¹, Miguel Araújo¹, Rui Frazão¹,
and Manuel Au-Yong-Oliveira² (corresponding author)

¹ Department of Electronics, Telecommunications and Informatics,
University of Aveiro, Aveiro, Portugal
{luismiguel.fonseca,joao.barroso,mdaraujo,ruifilipefrazao}@ua.pt
² GOVCOPP, Department of Economics, Management,
Industrial Engineering and Tourism, University of Aveiro, Aveiro, Portugal
[email protected]

Abstract. Nowadays wearable devices are very popular, owing to the sharp
reduction in prices and the increase in functionality. Healthcare services have
benefited greatly from the emergence of these devices, since they can collect
vital signs and help healthcare professionals monitor patients easily. Medical
wellness, prevention, diagnosis, treatment and monitoring services are the main
focus of healthcare applications. Some companies have already invested in this
market, and we present some of them and their strategies.
Furthermore, we also conducted a group interview with Altice Labs in order to
better understand the critical points and challenges they encountered while
developing and maintaining their service. With the purpose of comprehending
users’ receptiveness to mHealth systems (mobile health systems which users
wear - wearables) and their opinion about sharing data, we also created a
questionnaire (which had 114 valid responses). Based on the research done we
propose a different approach. In our product and service concept solution, which
we share herein, we consider people of all ages to be targets for the
product/service and, beyond that, we consider the use of machine learning
techniques to extract knowledge from the information gathered. Finally, we
discuss the advantages and drawbacks of this kind of system, showing our
critical point of view.

Keywords: Healthcare · mHealth · Wearable technology · Biomedical
monitoring · Health tracking

1 Introduction

The healthcare system is intended to efficiently provide healthcare services so as to
meet the health needs and demands of individuals, families and the community. In the
last few years, technology and healthcare have been combined to improve the quality of
life of the population around the world. The interest in the mixing of these fields is to

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 3–14, 2020.
https://doi.org/10.1007/978-3-030-45697-9_1
understandably improve healthcare systems by taking advantage of the ubiquitous and
powerful mobile devices that everyone carries with them on a daily basis, which can
gather information during users’ daily activities and correlate it with their health status.
The term mHealth (mobile health) comes with the development and use of less
intrusive and more comfortable mobile devices, such as smartphones, wearables
incorporated in clothes, bracelets, necklaces, watches and many others, or sensors and
applications (apps) that track users’ physiological information. This concept is capable
of revolutionizing healthcare service delivery by preventing and diagnosing early-stage
medical conditions, leading to an increase in its quality and efficiency. The World
Health Organization (WHO) states that in high-income countries mHealth systems
mainly help reduce healthcare costs, whereas in developing countries they tend to
improve access to primary healthcare services [1].
With rising concern about healthcare and mHealth topics and the increase in the
amount of data gathered, computational techniques began to be used to extract
information and knowledge from the data and to improve the healthcare systems and
services provided to users. Since 2017, the Machine Learning for Healthcare annual
research meeting has aimed to bring together two usually insular disciplines: computer
science and medical research [2].
The contribution this paper offers is an analysis of some existing mHealth systems
and an understanding of the decisive factors, concerns and different approaches
relevant to future systems.
Research on wearable technology devices, on the market, and on organizations
working to improve healthcare systems and lifestyles shows that most products are
developed to operate in a specific field, such as physical activity tracking or biometrics
measurement, instead of cross-referencing data from other biometric markers or even
other individuals.
To clarify some concepts, and in order to understand the strategic point of view of
an mHealth services provider and how its product contributes to users’ lifestyles, a
group interview was performed with Altice Labs (in the telecom industry), an Altice
branch in Aveiro, Portugal, to talk about their solution, SmartAL [3]. During the
interview we discussed aspects such as the system’s features and the possible
application of machine learning techniques to the gathered data to predict medical
conditions. After that, a questionnaire was created and administered with the objective
of understanding people’s knowledge of monitoring devices and their opinions about
mHealth systems and the sharing of data.
Finally, the conclusions taken from the analyzed solutions are presented and a
newly proposed system is described with regards to the related work as well as to the
benefits and drawbacks it may bring. A positive impact on the community may well be
the result.
2 Literature Review

In this section we shall describe what has already been researched about wearable
devices and medical data processing, and we shall explain some mHealth systems,
implemented by enterprises or organizations, that are similar to our proposed system.
The literature review is the basis for the development of a possible solution to improve
the population’s quality of life while reducing healthcare costs.

2.1 Wearable Devices


Wearables are electronic technologies or devices that are incorporated into items of
clothing and personal accessories. This technology has seen significant growth in the
last few years [4].
Compared to smartphones, these devices have a great advantage: they are worn
close to the human body, so they can read biometric data more accurately and can
capture a greater diversity of biometric signs. During some daily activities, such as
sleeping, and during much physical exercise, people do not usually have their
smartphones close by, so smartphones cannot continuously read users’ signs, unlike
wearables, which remain in contact with the user during every task throughout
the day [5].
Wearables can currently read a great diversity of biometric signs, including heart
rate, motion (acceleration and direction), the Earth’s magnetic field, ambient light,
steps, distance, calories burned, active minutes, hourly activity and stationary time,
sleep, muscle fatigue, joint pressure, breathing, temperature and falls, among others,
and users can be warned if some irregular markers occur [6].
According to Berglund et al.’s research [7], most of this technology is applied in
sports/fitness and lifestyle/fashion. In this same article it is mentioned that technology
is incorporated mainly in watches, jewelry and shirts [7].
Over time this technology has become more popular, mainly for reasons of comfort,
as the devices have become more sophisticated and more accessible [8].
Some researchers believe that in the future doctors will prescribe a treatment that
includes both medicine and devices such as wearables [9].
However, despite the increased popularity of these devices, Kolasinska et al. report
that few people use smartwatches, although those willing to use sensors would prefer
them to be incorporated in smartwatches or clothes [10]. Examples of wearable
clothing are the “MagIC System”, a prototype t-shirt with sensors [11], and a smart bra
with a mechanism to detect early signs of breast cancer through thermodynamic
sensors [12].

2.2 Data Processing and Analytics


Data produced by monitoring devices can be stored locally, but it is normally saved
in the cloud so that it can be accessed easily [13]. This matters for applications in
which medical staff, trusted relatives and even the patient need access to the data.
These devices provide high-density data [14] (e.g., 10–500 times per second) which
can be processed using algorithms emerging in the machine learning field.
Machine learning is one of the many areas of artificial intelligence oriented towards
the study, processing and analysis of large amounts of data with the objective of
developing computational models capable of automatic learning [15]. These models are
able to detect relationships in the data, that would be difficult for humans to perceive.
Algorithms that make classification decisions depend heavily on the quantity of
data available for learning, so the data produced by mHealth devices is an opportunity.
It is possible to draw conclusions about individuals, such as the type of physical
activity, level of stress, or intensity of pain [14]. For physical activity classification,
k-nearest neighbors (KNN) and Bayesian techniques have been applied to data from
either a single accelerometer or multiple types of sensors; artificial neural networks
(ANN) and decision tree models recognize these activities by fusing data from
accelerometers and GPS [16]. As fall detection is a major concern for elderly people,
support vector machines (SVM) have been studied to detect those events, as well as for
gesture classification [16].
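As a concrete illustration of the KNN approach mentioned above, the sketch below classifies windows of synthetic 3-axis accelerometer samples using hand-rolled statistical features and a minimal nearest-neighbor vote. The window size, features and activity labels are our own illustrative assumptions, far simpler than what the cited studies use.

```python
import numpy as np

def extract_features(window):
    """Summarize a window of 3-axis accelerometer samples (n x 3)
    into per-axis mean and standard deviation (6 features)."""
    return np.concatenate([window.mean(axis=0), window.std(axis=0)])

def knn_predict(train_X, train_y, x, k=3):
    """Classify feature vector x by majority vote among the k
    nearest training examples (Euclidean distance)."""
    dists = np.linalg.norm(train_X - x, axis=1)
    nearest = train_y[np.argsort(dists)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

rng = np.random.default_rng(0)
# Synthetic training data: "rest" windows are low-variance noise,
# "walk" windows oscillate with much higher variance.
windows, labels = [], []
for _ in range(20):
    windows.append(rng.normal(0, 0.05, (50, 3)))           # rest
    labels.append("rest")
    t = np.linspace(0, 4 * np.pi, 50)
    walk = np.stack([np.sin(t), np.cos(t), 0.3 * np.sin(2 * t)], axis=1)
    windows.append(walk + rng.normal(0, 0.2, (50, 3)))     # walk
    labels.append("walk")

train_X = np.array([extract_features(w) for w in windows])
train_y = np.array(labels)

test_rest = extract_features(rng.normal(0, 0.05, (50, 3)))
print(knn_predict(train_X, train_y, test_rest))  # prints: rest
```

In a real system the features would come from sliding windows over the live sensor stream, and a library classifier (e.g. scikit-learn's) would replace the hand-written vote.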
By using different data sources, it is possible to improve the validity of the
estimates through data fusion. For example, accelerometer information can improve
the interpretation of raw electrocardiogram (ECG) data while a person is exercising,
because the ECG signal is strongly affected by movement. Improving the interpretation
of data by combining different signs can reduce incorrect clinical evaluations that lead
to false alarms. In practice, fusion can be really challenging because different data
sources can have different spatial and temporal resolutions [14].
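A minimal sketch of this fusion idea, under invented signals and thresholds: the variance of the acceleration magnitude serves as an activity proxy, and heart-rate alarms that would otherwise fire during exercise (when ECG-derived readings are unreliable) are suppressed. The limits below are illustrative, not clinical values.

```python
import numpy as np

def movement_level(accel_window):
    """Crude activity proxy: standard deviation of the
    acceleration magnitude over a window of (n x 3) samples."""
    mag = np.linalg.norm(accel_window, axis=1)
    return mag.std()

def fused_alarm(heart_rate, accel_window, hr_limit=120.0, move_limit=0.3):
    """Raise a high-heart-rate alarm only when the person appears
    to be at rest, since exercise legitimately elevates (and
    corrupts) the ECG-derived reading."""
    if movement_level(accel_window) > move_limit:
        return False  # moving: defer judgement instead of false-alarming
    return heart_rate > hr_limit

rng = np.random.default_rng(1)
rest = rng.normal(0, 0.05, (100, 3))                       # sensor noise only
exercise = rest + np.sin(np.linspace(0, 20, 100))[:, None]  # rhythmic motion

print(fused_alarm(130, rest))      # elevated HR at rest: True
print(fused_alarm(130, exercise))  # same HR while moving: False
```

A production system would of course not discard readings during movement, but route them through an exercise-aware model; the point here is only that one sensor's context changes how another sensor's signal is interpreted.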

2.3 Enterprises and Services


Nowadays it is possible to find solutions both for personal use and in an
organizational context. With the appearance and rising popularity of the mHealth
concept, some enterprises began to invest in services within this space, and in this
section we shall present some of them.
Systems such as smartwatches can inform users about their health status during
daily activities, for example by counting steps and monitoring sleep, among many
other features.
Apple has invested in the development of the Apple Watch, aimed at personal use.
The device offers the possibility to take an electrocardiogram (ECG) [17] and,
specifically for women, makes it easy to log information about the menstrual
cycle [18].
Garmin also developed a fitness tracker capable of preventing the risk of injury.
The tracker issues warnings when the risk of injury is high and gives an estimate of
how long the user should rest [19].
In existing markets it is possible to find a great diversity of these devices from the
cheapest, such as smartbands, to more expensive ones, such as powerful smartwatches.
In response to a specific use case, diabetes, Medtronic created MiniMed, a device
with the ability to automatically adjust basal insulin based on the patient’s CGM
(continuous glucose monitoring) readings. It also keeps a record of the last 90 days of
pump history and generates reports [20].
In an organizational context, solutions are starting to be implemented, for example
in hospitals, clinics and private entities. One example ran from 2009 to 2011 at
London’s Chelsea and Westminster Hospital, which invested in an e-Health pilot
project consisting of recording and storing patients’ activities on a platform so that
doctors can easily access and analyze users’ data and reach conclusions [13].
Altice developed a pilot project where the intended users are the elderly. Smart
Assisted Living (SmartAL) is a social and health support solution that includes the
telemonitoring of vital signs, such as weight, blood pressure, pulse rate, accessible via
TV with an interactive IPTV service, Android app or web browser. This system makes
it possible to configure threshold values of biometric data to emit alerts to healthcare
professionals, family and friends [3].
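The threshold-based alerting described for SmartAL could, in a very simplified form, look like the sketch below. The vital-sign names, bands and alert format are our own assumptions for illustration, not SmartAL's actual configuration.

```python
# Hypothetical per-user threshold alerting, in the spirit of the SmartAL
# feature described above; thresholds are invented example values.
THRESHOLDS = {
    "systolic_bp": (90, 140),   # mmHg
    "pulse_rate":  (50, 100),   # beats per minute
    "weight":      (45, 120),   # kg
}

def check_reading(vital, value, thresholds=THRESHOLDS):
    """Return an alert message if the reading leaves its configured
    [low, high] band, otherwise None."""
    low, high = thresholds[vital]
    if value < low:
        return f"ALERT: {vital}={value} below {low}"
    if value > high:
        return f"ALERT: {vital}={value} above {high}"
    return None

print(check_reading("pulse_rate", 112))  # out of band: prints the alert
print(check_reading("weight", 70))       # in band: prints None
```

In the real system such an alert would be fanned out to the configured healthcare professionals, family and friends rather than printed.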
On the one hand, most of these organizations take advantage of the increasing ease
of access to wearable devices, mostly for fitness and lifestyle purposes. On the other
hand, in many cases the integration of the elderly into this type of system has been a
common concern.

3 Methodology

Based on the literature review, the strategies of some organizations that develop
mHealth systems were analyzed, and a group interview with Altice Labs was scheduled
in order to gain more knowledge about this kind of system.
Before the group interview we created an interview script aimed at understanding
the main challenges the company encountered while developing the product, its
weaknesses, and whether the product has room to progress as technology improves. In
other words, we intended to do a high-level SWOT analysis of their Assisted Living
product. The group interview was performed on the 17th of October 2019, for around
thirty minutes, in Aveiro, using the interview script, and involved two employees of the
firm (both from the product development department).
Notes were taken on the topics discussed. Authorization for the use of the material
discussed was a topic of the small group interview (at the end). Some material was
deemed as not being able to be used by the research group. The interview led to
important conclusions on the product concept developed during this research study.
During the group interview, we discussed aspects such as their system’s features,
people’s receptivity to it and other issues duly uncovered. Further to that, other
approaches and scenarios where their system could be integrated were discussed with
the objective of contributing to a better solution.
Ideally, the system should be capable of answering all people’s needs.
We also administered a questionnaire with the purpose of gauging citizens’
knowledge of this kind of device, their opinion about sharing personal data and their
receptiveness to this kind of system. What we really wanted to understand was whether
the younger generation is receptive to using an mHealth product, thus expanding the
market and increasing the target audience. To reach a heterogeneous group of
participants in terms of age, we distributed the questionnaire at a high school to receive
feedback from younger people and in a nursing home to get responses from older
people, and we also visited a company to get responses from middle-aged people.
In the survey, 114 people agreed to participate, of whom 51.8% were male and
48.2% female. With regard to age, 14.9% were up to 17 years old, 40.4% were between
18 and 35, 28.9% were between 36 and 60, and the remaining 15.8% were over 60.
As concerns academic qualifications,
14.9% had attended primary school, 3.5% had attended the 2nd cycle of school, 2.6%
had attended the 3rd cycle of school, 24.6% had been to high school, 27.2% had a
licentiate degree, 26.3% had a Master’s degree and the remaining 0.9% had a PhD
degree. It was important to discriminate the age of the participants as well as their
academic qualifications in order to understand if they had an influence on their choices.
The questionnaire contained the following questions:
• Do you know any kind of device for measuring vital signs (example: bracelets and
smart watches, chest bands, etc.)?
• Would you be willing to use one of these devices to monitor your vital signs?
• Which biometric signs are you most interested in monitoring?
• Would you be willing to send your data to an outside entity and thereby benefit
from a closer monitoring of your health?
• Are you aware of the General Data Protection Regulation (GDPR)?
• Would you be willing to pay for a service that uses the information of your vital
signs, and thus enjoy a closer monitoring of your health?

4 Results

In this section, we shall present the conclusions obtained from the group interview with
Altice Labs as well as the results from the questionnaire.

4.1 SmartAL – Software Platform Developed by Altice Labs
for Monitoring the Elderly
As referred to above, SmartAL’s focus is the monitoring of elderly users’ vital
signs, raising alerts to family, friends and healthcare professionals. For that it is
necessary to define threshold values for each vital sign. Another functionality of the
system is to allow healthcare professionals and relatives to access user data remotely.
In this application each user is analyzed individually; there is no direct comparison
between the user and groups of similar users.
According to Altice Labs’ research, vital signs alone are not enough to predict
medical conditions; a precise diagnosis requires complementary exams, such as blood
analyses and sonographies, among others.
Another point mentioned was that medical staff are usually not in favour of giving a
computer system the possibility of diagnosis, as they believe that the human factor is
extremely important to infer the final decision based on data semantics.
One of the main points we gathered concerning their solution is that users’ vital
signs must be inserted manually on the SmartAL platform. We believe a significant
improvement would be the automatic collection of these signs using mHealth devices.

4.2 Survey
In this section we shall present the conclusions drawn from the questionnaire
referred to above (114 responses).
Considering the global results obtained, it was found that:
• Around 75% of the participants know about wearable devices;
• Although 25% do not know about wearable devices, only 16.7% would not use
them, i.e., 83.3% would accept using these devices for monitoring purposes;
• 70.2% would be willing to share their data;
• 72.8% of the participants know about laws on data protection and privacy (GDPR);
• Only 21.1% would pay for the service, and around 49% of all participants are
undecided;
• In general, people are most interested in measuring their heart rate (74.6%) and
their stress levels (64.9%).
Analyzing each age group, the following was observed:
• There is a tendency for the older generation to know less about monitoring devices
(see Fig. 1);
• Although the elderly do not know about these devices, they would be willing to use
them; in the younger age groups people would generally use them (85%–95%);
• Few young and elderly people know about the GDPR (less than 50%), whereas
people between 18 and 60 years of age are mostly aware of it (around 85%);
• In general, those who know about the GDPR would not be comfortable sharing
their data; on the contrary, those who do not know about data privacy regulations
would not mind sharing;
• No one under 17 years of age answered that they would pay for the service.
Participants from 18 to 60 years of age showed similar relative percentages: around
30% said they would not pay, 20% would, and the remaining 50% are undecided.
In the elderly group, only 20% are undecided and 50% would definitely be willing
to pay.
Fig. 1. Knowledge about wearable devices

5 Discussion of a New Possible Solution for the General
Population and for All Age Groups

In accordance with the questionnaire results, the literature review and the group
interview with Altice Labs, we present a possible solution.
According to the questionnaire results, it would be interesting to apply this kind of
system to the whole population, as all age groups, not only the elderly, would be
willing to use wearable devices for monitoring. Indeed, although most systems focus
mainly on elderly people, most of this group answered that they are not aware of this
kind of device.
It is crucial that users wear devices that continuously send their biomedical signs,
such as heart rate (pulse), stress levels and body temperature, among others, to a data
management platform hosted in the cloud. The use of wearable devices over long
periods produces a huge volume of data, which algorithms can use to detect long-term
patterns and notice when a pattern changes, raising alerts to healthcare entities.
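The pattern-change detection this proposal relies on could be prototyped, under strong simplifying assumptions, as a rolling-baseline z-score over daily summaries. The window length, z threshold and example series below are illustrative only; a real deployment would work on far richer data and involve clinical validation.

```python
import statistics

def detect_drift(daily_resting_hr, window=7, z_limit=3.0):
    """Flag days whose resting heart rate deviates strongly from the
    rolling baseline of the previous `window` days (simple z-score)."""
    alerts = []
    for i in range(window, len(daily_resting_hr)):
        baseline = daily_resting_hr[i - window:i]
        mean = statistics.mean(baseline)
        sd = statistics.stdev(baseline) or 1.0  # guard against zero spread
        z = (daily_resting_hr[i] - mean) / sd
        if abs(z) > z_limit:
            alerts.append((i, daily_resting_hr[i]))
    return alerts

# 14 days around 60 bpm, then a sustained jump a clinician should review.
series = [60, 61, 59, 60, 62, 61, 60, 59, 60, 61,
          60, 61, 59, 60, 75, 76, 74, 75, 76, 75]
print(detect_drift(series))  # prints: [(14, 75)]
```

Note that only the first day of the jump is flagged: the elevated values are quickly absorbed into the rolling baseline, which is exactly why the alert should reach a human who can judge whether the new level is a problem.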
To improve diagnostics and the quality of the data, all exams done by patients such
as blood collection, ultrasounds, among others, should be stored.
Hospital staff have the responsibility to analyze each received alert and make a
decision. If they consider there may be a problem, they should ask the patient to come
to the hospital for more precise exams and so verify whether something is, in fact,
wrong.
Briefly, the aim of this system is to transform data into information, and that
information into knowledge, which may serve as a decision support system for the
medical community.
The ecosystem explained before is illustrated in Fig. 2.

Fig. 2. Conceptual architecture of a healthcare system

Regarding the availability to pay for this kind of service, around 49% of all par-
ticipants are undecided, and 30% will not pay, so we believe it would be necessary to
have more concrete features that would potentially improve quality of life, for these
people to change their position.
This kind of system can have several drawbacks. As stated before, the collected
data is stored in a cloud platform, which normally replicates the data to protect it
better; this can lead to considerable expense, because a large infrastructure is needed to
store the data. Due to the sensitivity associated with this kind of data, it is not possible
to use existing cloud service providers, so it would be necessary to build one from
scratch.
The use of that data for other purposes is also a major concern because some
companies could take advantage of this information, for example by selling it or
customizing product advertisements. We further noticed that participants who know
about GDPR tend not to feel comfortable about sharing their data. Despite seeming
contradictory, this can be explained by their knowledge about related risks.
Despite the disadvantages, the information could also be relevant for research
purposes, provided that participants are anonymized.

6 Conclusion

In this paper, we started with a review of the literature on devices, data processing
and existing healthcare systems. We had the opportunity to perform a group interview
with Altice Labs, which gave us insights into the challenges standing in the way of
predicting medical conditions from biometric data alone. In their view, medical staff
would not accept full prediction by the system; they defend that these systems should
serve only for decision support. Furthermore, from the questionnaire we can conclude
that most people would be willing to use healthcare systems. Based on these results,
we discussed and proposed an approach that would be able to solve some of the
existing issues, although we also note that some disadvantages are not easily overcome.
In conclusion, we note that many solutions related to healthcare already exist, especially for the elderly. However, some problems remain, such as the elderly's unawareness of wearable devices and the services related to them.
According to our research, this solution could be applied to people of all ages, taking advantage of the popularity of wearables, which are cheaper and more robust than ever before. In general, people are receptive to this kind of system, but there are evident concerns about privacy.
In addition to the advantages for citizens, we believe that countries' health systems would reduce costs and improve the quality of health services. Furthermore, the scientific community may benefit from this solution by using the gathered data for research in various areas.

7 Contribution and Suggestions for Future Research

The solution set forth is an incremental contribution in terms of the innovation involved, when compared to what already exists in the market. This is because it adds an analytical component to the data on society in general (instead of being an individual solution), and the focus shifts to all people instead of only the elderly.
For the solution to be a radical innovation it would have to include, besides the monitoring of people, some type of reaction as a consequence of a detected malady. In short, if something bad were to happen to someone (for example, fainting or an epileptic seizure), he or she would be helped automatically, by a drone or by a system alerting the health services, which would then act accordingly. It would thus be interesting to study what types of automatic reactions are possible with existing technology, while also considering new systems to be put in place.
Regarding the Professional Performance Framework designed by the Medical Board of Australia, our solution aligns with the initiatives stated therein, such as guidance to support practitioners: regularly updated professional standards that support good medical practice, and collaborations to foster a culture of medicine focused on patient safety [21].

Acknowledgements. We would like to thank Telma Mota and Ricardo Machado from Altice
Labs for having agreed to be interviewed and for all the information provided during the group
interview. For the dissemination of the questionnaire, we would like to thank Patrícia Gonçalves
and Graça Ferraz, from the Recesinhos Social Center, for having helped in the gathering of data
from the elderly; and Ana Araújo for having helped in the gathering of data from the younger age
groups.
A Product and Service Concept Proposal 13

References
1. World Health Organization: mHealth: New horizons for health through mobile technologies. Observatory 3, 66–71 (2011). http://www.webcitation.org/63mBxLED9
2. Machine Learning for Healthcare. https://www.mlforhc.org/. Accessed 19 Oct 2019
3. AlticeLabs: SmartAL – Smart Assisted Living. http://www.alticelabs.com/site/smartal/. Accessed 16 Oct 2019
4. Wearable Devices: Wearable Technology and Wearable Devices: Everything You Need to Know. http://www.wearabledevices.com/what-is-a-wearable-device/. Accessed 17 Oct 2019
5. Hänsel, K.: Wearable sensing approaches for stress recognition in everyday life. In: Proceedings of the 2017 Workshop on MobiSys 2017 Ph.D. Forum - Ph.D. Forum 2017, pp. 1–2 (2017). http://dl.acm.org/citation.cfm?doid=3086467.3086470. Accessed 19 Oct 2019
6. Qiu, H., Wang, X., Xie, F.: A survey on smart wearables in the application of fitness. In: 2017 IEEE 15th International Conference on Dependable, Autonomic and Secure Computing, 15th International Conference on Pervasive Intelligence and Computing, 3rd International Conference on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech), pp. 303–307 (2017). http://ieeexplore.ieee.org/document/8328407/. Accessed 19 Oct 2019
7. Berglund, M.E., Duvall, J., Dunne, L.E.: A survey of the historical scope and current trends of wearable technology applications. In: Proceedings of the 2016 ACM International Symposium on Wearable Computers - ISWC 2016, pp. 40–43 (2016). http://dl.acm.org/citation.cfm?doid=2971763.2971796. Accessed 19 Oct 2019
8. Sultan, N.: Reflective thoughts on the potential and challenges of wearable technology for healthcare provision and medical education. Int. J. Inf. Manag. 35(5), 521–526 (2015). https://www.sciencedirect.com/science/article/pii/S0268401215000468. Accessed 19 Oct 2019
9. Fletcher, R.R., Poh, M.-Z., Eydgahi, H.: Wearable sensors: opportunities and challenges for low-cost health care. In: 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology, pp. 1763–1766 (2010). http://ieeexplore.ieee.org/document/5626734/. Accessed 19 Oct 2019
10. Kolasinska, A., Quadrio, G., Gaggi, O., Palazzi, C.E.: Technology and aging: users' preferences in wearable sensor networks. In: Proceedings of the 4th EAI International Conference on Smart Objects and Technologies for Social Good - Goodtechs 2018, pp. 77–81 (2018). http://dl.acm.org/citation.cfm?doid=3284869.3284884. Accessed 19 Oct 2019
11. Di Rienzo, M., Rizzo, F., Parati, G., Brambilla, G., Ferratini, M., Castiglioni, P.: MagIC system: a new textile-based wearable device for biological signal monitoring. Applicability in daily life and clinical setting. In: Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings, vol. 7, pp. 7167–7169 (2005)
12. Innovatemedtec: This smart bra can detect breast cancer much earlier than existing screening tests - innovatemedtec content library. https://innovatemedtec.com/content/smart-bra?fbclid=IwAR1IKHXd7ih9JnwVDQvL6wkUCpltayE-nsIHkv_9zADrYXhyI9bzKz56D5Q. Accessed 19 Oct 2019
13. Sultan, N.: Making use of cloud computing for healthcare provision: opportunities and challenges. Int. J. Inf. Manag. 34(2), 177–184 (2014). https://www.sciencedirect.com/science/article/pii/S0268401213001680. Accessed 19 Oct 2019
14. Kumar, S., et al.: Mobile health technology evaluation: the mHealth evidence workshop. Am. J. Prev. Med. 45(2), 228–236 (2013). https://www.sciencedirect.com/science/article/pii/S0749379713002778. Accessed 13 Oct 2019
15. Géron, A.: Hands-on machine learning with Scikit-Learn and TensorFlow: concepts, tools, and techniques to build intelligent systems. https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/. Accessed 15 Oct 2019
16. Qi, J., Yang, P., Newcombe, L., Peng, X., Yang, Y., Zhao, Z.: An overview of data fusion techniques for Internet of Things enabled physical activity recognition and measure. Inf. Fusion 55, 269–280 (2020). https://www.sciencedirect.com/science/article/pii/S1566253519302258?via%3Dihub. Accessed 05 Oct 2019
17. Apple: Take an ECG with the ECG app on Apple Watch Series 4 or later - Apple Support. https://support.apple.com/pt-pt/HT208955. Accessed 14 Oct 2019
18. Apple: Apple Watch Series 5 - Apple. https://www.apple.com/apple-watch-series-5/. Accessed 14 Oct 2019
19. IMTInnovation: IMT Innovation Digital Health Incubator Wearable Technology to Minimise Injury Risk. https://imtinnovation.com/2018/11/10/wearable-technology-to-minimise-injury-risk/. Accessed 19 Oct 2019
20. Medtronic: MiniMed 670G Insulin Pump System - Medtronic Diabetes. https://www.medtronicdiabetes.com/products/minimed-670g-insulin-pump-system. Accessed 19 Oct 2019
21. Medical Board of Australia: Building a professional performance framework. https://www.racgp.org.au/getmedia/d810a609-c344-4e97-b615-1f83b6e504eb/Medical-Board-Report-Building-a-professional-performance-framework.PDF.aspx. Accessed 04 Jan 2020
Artificial Neural Networks Interpretation
Using LIME for Breast Cancer Diagnosis

Hajar Hakkoum1, Ali Idri1,2(B), and Ibtissam Abnane1

1 Software Project Management Research Team, ENSIAS,
Mohammed V University, Rabat, Morocco
[email protected]
2 Complex Systems Engineering and Human Systems,
Mohammed VI Polytechnic University, Ben Guerir, Morocco

Abstract. Breast Cancer (BC) is the most common type of cancer among women. Thankfully, early detection and treatment improvements have helped decrease its number of deaths. Data Mining (DM) techniques, which discover hidden and potentially useful patterns from data, particularly for breast cancer diagnosis, are witnessing a new era in which the main objective is no longer replacing humans or merely assisting them in their tasks, but enhancing and augmenting their capabilities, and this is where interpretability comes into play. This paper aims to investigate the Local Interpretable Model-agnostic Explanations (LIME) technique to interpret a Multilayer Perceptron (MLP) trained on the Wisconsin Original data-set. The results show that LIME explanations are a sort of real-time interpretation that helps understand how the constructed neural network "thinks" and can thus increase trust and help oncologists, as the domain experts, learn new patterns.

Keywords: Interpretability · Breast Cancer · Diagnosis · LIME

1 Introduction

Breast cancer is a phenotypically diverse population of breast cancer cells [1]. It affects about 1 in 8 women worldwide and is considered the leading cause of cancer death among women between the ages of 40 and 59 [2]. Its causes are not yet fully known, although age, genetic risk, smoking, unhealthy diet, overweight, late menopause, and late age at first childbirth have been identified as risk factors [3].
Several Data Mining techniques, whether relying on artificial intelligence or statistics, have been used to help discover new patterns and high-level information from historical databases of patients who had Breast Cancer [3–6]. DM is thus a powerful tool to analyze and deal with BC challenges, capable of reducing the death rate caused by this disease [3].

c The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 15–24, 2020.
https://doi.org/10.1007/978-3-030-45697-9_2
16 H. Hakkoum et al.

A systematic mapping study by Idri et al. [3] of 403 articles on DM techniques in BC showed that most articles addressed the diagnosis task, with 78.63%, 7.63%, 9.16%, and 4.58% for classification, regression, clustering, and association respectively. They also showed that the use of black-box models such as Support Vector Machines or Neural Nets was very high, followed by Decision Trees, which are explainable models, and highlighted that the use of Neural Nets has decreased over time, very probably due to their inexplicable behavior [3].
Interpretability is one of the most common reasons limiting artificial neural networks, and black boxes in general, from being accepted and used in critical domains such as medicine. Indeed, healthcare offers more challenges to Machine Learning (ML) by being more demanding of interpretability [7]. Model interpretability is thus often chosen over accuracy. Understanding black-box models can help assist and augment the provision of better care while doctors remain integral to their role. It could also improve human performance, extract insights, and yield new knowledge about the disease, which may be used to generate hypotheses [7].
Skater, Oracle's unified framework for model interpretation [8], used LIME explanations to interpret a basic MLP, with 100 hidden nodes, trained on the Wisconsin (Diagnostic) Database, which has 30 attributes. This paper aims to apply and evaluate the local interpretability technique LIME on an MLP. The main contribution is the LIME interpretation of the best of two neural networks, a basic MLP and a deep MLP (four layers), trained on the Breast Cancer Wisconsin (Original) data-set, which has 9 attributes.
The rest of this paper is structured as follows: Sect. 2 presents some important
concepts related to this paper. Section 3 presents some related work dealing with
the use of interpretation techniques. Section 4 describes the database as well as
the performance measures used to select the best performing model. Section 5
presents the experimental design followed in this empirical evaluation. Section 6
discusses the obtained results. The threats to the validity of this paper are given
in Sect. 7. Section 8 presents conclusions and future work.

2 Background

This section presents an overview of the feed-forward neural networks that were constructed and evaluated in our experiments, as well as the interpretability techniques that were applied to their best variants.

2.1 Artificial Neural Networks: MLP

Artificial neural networks are a set of algorithms designed to mimic the behavior of the brain [9]. There are multiple types of neural networks; the most basic is the feed-forward network, where information travels in one direction from input to output. A popular example of this type is the MLP. It is composed of an input layer that receives the signal, an output layer that makes a decision, and, in between, an
ANN Interpretation Using LIME For BC Diagnosis 17

arbitrary number of hidden layers. Each layer is a set of neurons interconnected with the other layers by weights. The MLP represents a non-linear mapping between an input vector and an output vector [10].
In the forward pass, the information travels from the input layer through the hidden layers to the output layer, and the decision of the output layer is compared against the ground-truth labels. In the backward pass, partial derivatives of the error function with respect to the various weights and biases are propagated back through the MLP using backpropagation [11]. That differentiation gives us a gradient, a landscape of error, along which the parameters may be adjusted to move the MLP one step closer to the error minimum.
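To make the two passes concrete, here is a deliberately tiny sketch: one input, one hidden sigmoid neuron, one sigmoid output, and squared error. It is an illustration of forward propagation and one backpropagation update, not the network used in this paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    """Forward pass: input -> one hidden sigmoid neuron -> sigmoid output."""
    h = sigmoid(w1 * x)
    y = sigmoid(w2 * h)
    return h, y

def backward_step(x, target, w1, w2, lr=0.5):
    """One backpropagation step on the squared error E = (y - target)^2 / 2."""
    h, y = forward(x, w1, w2)
    # dE/dy chained through the sigmoid derivatives y(1-y) and h(1-h).
    delta_out = (y - target) * y * (1.0 - y)
    grad_w2 = delta_out * h
    delta_hidden = delta_out * w2 * h * (1.0 - h)
    grad_w1 = delta_hidden * x
    # Move one step down the error landscape.
    return w1 - lr * grad_w1, w2 - lr * grad_w2

w1, w2 = 0.3, -0.2
x, target = 1.0, 1.0
_, y_before = forward(x, w1, w2)
for _ in range(100):
    w1, w2 = backward_step(x, target, w1, w2)
_, y_after = forward(x, w1, w2)
```

After repeated steps the output moves closer to the target, which is exactly the "one step closer to the error minimum" behavior described above.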

2.2 Local Interpretability


Interpretability can be defined as the degree to which a human can understand the cause of a decision [12], or the degree to which a human can consistently predict the model's result [13]. The more faithful and intelligible a model's explanation is, the easier it is for someone to trust it and comprehend why certain decisions or predictions have been made [14]. For instance, doctors cannot blindly trust a black-box prediction without understanding how it produces its results.
There are two options to make ML interpretable [15]: using interpretable models (white boxes), such as linear models or decision trees, where data scientists face the accuracy-interpretability trade-off, or using interpretation tools. Such tools can explain either model behavior across all data instances or individual predictions. Two types of interpretability techniques can thus be identified: global (they consider all instances and give a statement about the global relationship of a feature with the predicted outcome) and local (they explain the conditional interaction between the predictions and the predictors/attributes for a single prediction). This study deals with local interpretation techniques, in particular LIME.
In 2016, Ribeiro et al. [14] explained the predictions of any classifier by learn-
ing an interpretable model (linear models) locally around a prediction. They
called this explanation system LIME, Local Interpretable Model-agnostic Expla-
nations, where model-agnostic refers to the system treating the model as a black
box and not needing any prior information about its architecture.
Concretely, LIME tests how the predictions change when variations of the data are fed to the model. It perturbs the data-set and gets the black-box predictions for these new points. On this new data-set, LIME then trains a weighted, interpretable model such as a linear classifier. The linear classifier is then the learned explanation, which is locally (but not globally) faithful; this kind of accuracy is also called local fidelity [14]. Mathematically, local surrogate models with an interpretability constraint can be expressed as follows:

explanation(x) = arg min_{g∈G} L(f, g, πx) + Ω(g)   (1)

The explanation model for an instance x is the model g (a linear regression) that minimizes the loss L (mean squared error), which measures how close the explanation is to the prediction of the original model f, while the model complexity Ω(g) is kept low (fewer features are preferred). G is the family of possible explanations, for example all possible linear regression models [15], and the proximity measure πx defines how large the neighborhood around instance x is that we consider for the explanation. In practice, LIME only optimizes the loss part, and the user has to determine the complexity by selecting the maximum number of features that the linear regression model may use.
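This perturb-then-fit procedure can be sketched in a few lines. The snippet below is a minimal LIME-style surrogate, not LIME's actual implementation: the black box, the Gaussian sampling, the kernel width sigma, and the gradient-descent fit are all illustrative assumptions.

```python
import math, random

random.seed(0)

def black_box(x):
    """Stand-in black box: a probability-like score from a nonlinear function."""
    return 1.0 / (1.0 + math.exp(-(3.0 * x[0] - 2.0 * x[1])))

def lime_explain(f, x, n_samples=500, sigma=0.75, lr=0.5, epochs=500):
    """Perturb x, weight the samples by proximity pi_x, then fit a linear
    surrogate g(z) = w . z + b to f by weighted least squares."""
    samples, targets, weights = [], [], []
    for _ in range(n_samples):
        z = [xi + random.gauss(0.0, 1.0) for xi in x]       # perturbation
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
        samples.append(z)
        targets.append(f(z))                                # query the black box
        weights.append(math.exp(-dist2 / sigma ** 2))       # proximity kernel
    # Fit the weighted linear surrogate (optimizes only the loss L, as in LIME).
    w, b = [0.0] * len(x), 0.0
    n = len(samples)
    for _ in range(epochs):
        gw, gb = [0.0] * len(x), 0.0
        for z, t, pi in zip(samples, targets, weights):
            err = b + sum(wi * zi for wi, zi in zip(w, z)) - t
            gb += pi * err
            for i, zi in enumerate(z):
                gw[i] += pi * err * zi
        b -= lr * gb / n
        w = [wi - lr * gi / n for wi, gi in zip(w, gw)]
    return w  # local feature effects around x

coeffs = lime_explain(black_box, [0.5, 0.5])
```

The surrogate recovers the local behavior of the black box: a positive weight for the feature the score increases with, and a negative weight for the one it decreases with.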
Although the explanation of a single prediction provides some understanding of the model, it is not sufficient to evaluate and assess trust in the model as a whole. Therefore, Ribeiro et al. [14] proposed to explain a judiciously picked set of individual instances using SubmodularPick, an algorithm that selects a representative set to simulate a global understanding of the model. For example, if an explanation A relied on two features x1 and x2, there is no need to show the end-user another explanation that focuses on the same features x1 and x2.
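The selection idea can be sketched as greedy coverage maximization over a matrix of explanation weights. This follows the spirit of Ribeiro et al.'s formulation; the importance score and the matrix W below are illustrative assumptions.

```python
import math

def submodular_pick(W, budget):
    """Greedy sketch of SubmodularPick: W[i][j] is the absolute importance of
    feature j in the explanation of instance i. Pick instances that maximize
    coverage of globally important features, avoiding redundant explanations."""
    n, d = len(W), len(W[0])
    # Global importance of each feature: sqrt of its summed absolute weights.
    imp = [math.sqrt(sum(abs(W[i][j]) for i in range(n))) for j in range(d)]

    def coverage(picked):
        # A feature counts once if any picked explanation uses it.
        return sum(imp[j] for j in range(d)
                   if any(W[i][j] != 0 for i in picked))

    picked = []
    for _ in range(budget):
        best = max((i for i in range(n) if i not in picked),
                   key=lambda i: coverage(picked + [i]))
        picked.append(best)
    return picked

# Two explanations rely on features {0, 1}, one relies on feature {2}:
W = [[0.9, 0.8, 0.0],
     [0.8, 0.9, 0.0],
     [0.0, 0.0, 0.7]]
chosen = submodular_pick(W, budget=2)  # -> [0, 2]
```

Instance 1 is skipped because it covers the same features as instance 0, which is exactly the redundancy argument made above.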

3 Related Work
A variety of work has been done on ML model interpretation, since the ML community has noticed the importance of explaining which features a model did or did not take into account more than which parameter increased its accuracy [16]. Interpretability aims at increasing model trustworthiness so that a model can be used to make high-stakes decisions, especially in domains such as the medical one [17].
ML explainability is thus a major concern, since it has the power to break even the models with the highest accuracy. When it comes to using a deployed model to make decisions, end-users often ask the almighty question: "Why should I trust it?".
In 2002, Idri et al. [11] asked a slightly different question: "Can neural networks be easily interpreted in software cost estimation?". They used the method of [18] to map an MLP to a fuzzy rule-based system which expresses the information encoded in the architecture of the network and which can be easily interpreted, although they found the i-or operator connecting the rules inappropriate.
The LIME framework has been applied in different domains such as medicine and finance [16]. In particular, it was applied by the Skater team [8] in several fields, one of them being breast cancer diagnosis, where they used the Breast Cancer Wisconsin (Diagnostic) Database, available on the UCI repository, to train four models including an MLP. They discussed how sensitive each classifier is to the attributes, to show how interpretation techniques can help with model understanding for model selection.
Puri et al. [16] proposed an approach that can be thought of as an extension of LIME. It learns if-then rules that represent the global behavior of a model solving a classification problem. They validated their approach using different data-sets, particularly the Wisconsin Breast Cancer data-set, on which they trained a random forest classifier with an accuracy of 98%. After running their technique, they compared the model predictions to the predictions of the resulting if-then rules and computed a metric they introduced, Imitation@K, where K is the number of rules. Their approach generated rules able to imitate the model behavior for a large fraction of the data-set.

4 Database Description and Performance Criteria


In this section, the database used to train the models is described. A set of
performance measures is then defined to help select the best classifier.

4.1 WISCONSIN Database


In this work, the Breast Cancer Wisconsin (Original) data-set was used [19]. The data-set is available on the UCI repository and has 9 attributes whose values lie in the range 1–10. It has 458 benign and 241 malignant cases, which reveals an imbalance problem. The imbalanced distribution of classes constitutes a challenge for standard learning algorithms because they are biased towards the majority classes. Different resampling methods modify the original class distributions to tackle the imbalance issue; in particular, the Synthetic Minority Oversampling Technique (SMOTE) algorithm [20] was used in this study to balance the data-set. Also, the data-set had 16 missing values, which were removed before resampling.
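The core of SMOTE can be sketched in a few lines: each synthetic sample is a random interpolation between a minority sample and one of its k nearest minority neighbours. This is a minimal version with hypothetical 2-D points; the study presumably relied on a library implementation.

```python
import random

random.seed(42)

def smote(minority, n_new, k=3):
    """Minimal SMOTE sketch: generate n_new synthetic minority samples."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        x = random.choice(minority)
        # k nearest neighbours of x within the minority class (excluding x).
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: dist2(x, p))[:k]
        nn = random.choice(neighbours)
        gap = random.random()  # interpolation factor in [0, 1)
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nn)])
    return synthetic

minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [2.5, 2.5]]
new_points = smote(minority, n_new=5)
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside the region the minority already occupies.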

4.2 Performance
The best model is not necessarily the one with the highest accuracy. Therefore, multiple classification performance measures were used to select the best MLP architecture:
– Accuracy: The most intuitive performance measure; simply the ratio of correctly predicted observations to the total observations.
– Precision: The ratio of correctly predicted positive observations to the total predicted positive observations.
– Recall (Sensitivity): The ratio of correctly predicted positive observations to all observations in the actual class 'yes'.
– F1-Score: The weighted average of Precision and Recall. This score therefore takes both false positives and false negatives into account.

Accuracy = (TP + TN) / (TP + FP + FN + TN)   (2)

Precision = TP / (TP + FP)   (3)

Recall = TP / (TP + FN)   (4)

F1-Score = 2 * (Precision * Recall) / (Precision + Recall)   (5)
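The four measures follow directly from the confusion-matrix counts. The sketch below uses toy counts for illustration, not results from this study:

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Toy counts: 80 true positives, 10 false positives,
# 20 false negatives, 90 true negatives.
acc, prec, rec, f1 = classification_metrics(tp=80, fp=10, fn=20, tn=90)
# acc = 0.85, rec = 0.8; F1 sits between precision and recall.
```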

5 Experimental Design

This section presents the experimental process followed in this empirical evaluation. It consists of choosing the parameters for model construction, selecting the best performing model, and fixing the number of LIME explanations to discuss.

5.1 Models Construction

Two MLP architectures were adopted and, since our aim was interpretability and not performance, hyperparameters were chosen randomly. We constructed a basic MLP with 200 hidden nodes and a deep MLP with an additional hidden layer of 128 nodes. To make training faster, the non-linear activation function ReLU (Rectified Linear Unit) was used, since it makes training several times faster than other functions such as tanh [21]. For the output layer, we opted for a softmax activation function with two neurons, since we have two classes (Malignant/Benign).
The dropout technique was used twice in the deep MLP to avoid over-fitting: in the first hidden layer with a probability of 0.5 and in the second with a probability of 0.8. The neurons dropped out in this way do not contribute to the forward pass and do not participate in back-propagation. So every time an input is presented, the neural network samples a different architecture, but all these architectures share weights. This technique reduces complex co-adaptations of neurons, since a neuron cannot rely on the presence of particular other neurons [21].
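The mechanism can be sketched as follows. This is inverted dropout, a common implementation choice in which survivors are rescaled at training time; it is an illustration, not necessarily the exact variant used in this study.

```python
import random

random.seed(7)

def dropout(activations, keep_prob):
    """Inverted-dropout sketch: zero each activation with probability
    (1 - keep_prob) and rescale survivors so the expected value is preserved."""
    out = []
    for a in activations:
        if random.random() < keep_prob:
            out.append(a / keep_prob)  # rescale so E[out] == a
        else:
            out.append(0.0)            # dropped: no forward/backward contribution
    return out

layer = [0.5] * 1000
dropped = dropout(layer, keep_prob=0.5)
# Roughly half the activations are zeroed; the rescaled sum stays near 500.
```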
After training for 500 epochs (training cycles) with a batch size of 128, Accuracy (Eq. 2) and F1-Score (Eq. 5) served as voters to choose the best MLP, using the Borda count [22] voting system. When each individual of a group ranks m candidates, Borda has each assign 0 to its last-ranked candidate, 1 to the second-to-last-ranked, and so on up to m − 1 for its top-ranked candidate; then, for each candidate, those numbers are summed to determine the group ranking [23].
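A minimal sketch of this aggregation, with the two metrics as hypothetical voters over the two candidate models:

```python
def borda(rankings):
    """Borda count sketch: each voter ranks m candidates best-to-worst;
    a candidate receives m - 1 points for first place down to 0 for last."""
    candidates = rankings[0]
    m = len(candidates)
    scores = {c: 0 for c in candidates}
    for ranking in rankings:
        for place, c in enumerate(ranking):
            scores[c] += (m - 1) - place
    return max(scores, key=scores.get), scores

# Both Accuracy and F1-Score rank DMLP above MLP:
accuracy_rank = ["DMLP", "MLP"]
f1_rank = ["DMLP", "MLP"]
winner, scores = borda([accuracy_rank, f1_rank])
# winner == "DMLP" with 2 points against 0.
```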
K-fold cross-validation is a validation technique that ensures that the model has low bias and can work well on real unseen data. The data is divided into k subsets, where one of the k subsets is used as the test/validation set and the other (k − 1) subsets form the training set [24]. After k iterations, the error and performance metrics are computed by averaging over all k folds. As a general rule, and from empirical evidence, 5 or 10 is generally preferred for k [24]. For the experiments of the present study, 10-fold cross-validation was performed.
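The splitting scheme can be sketched with plain index arithmetic (a minimal version for illustration, not the library routine presumably used in the study):

```python
def kfold_indices(n, k):
    """Split indices 0..n-1 into k folds; fold i is the test set in round i
    and the remaining k - 1 folds form the training set."""
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        splits.append((train, test))
    return splits

# 20 samples, 10 folds: each round trains on 18 samples and tests on 2.
splits = kfold_indices(n=20, k=10)
```

Every sample appears in exactly one test fold, so the averaged metrics cover the whole data-set.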

5.2 Interpretability Techniques

As the purpose of cross-validation is model checking rather than model building, after selecting the best model to interpret in terms of Accuracy and F1-Score, the chosen model was fitted with a normal split, feeding the maximum amount of data to training and leaving 5% for testing.
This phase aimed to interpret the selected model locally and to determine which features it relies on most and how each feature affects the final decision. A set of four instances was chosen using the SubmodularPick algorithm provided with LIME, and LIME explanations were then computed for these four instances.

6 Results and Discussion

This section shows the results of the empirical evaluations of this study and discusses the local explanations provided by LIME.

6.1 Models Construction

After 10-fold cross-validation training with 500 epochs, two models were constructed:

– MLP: A basic three-layer MLP with 200 nodes/neurons with a ReLU activation function in the hidden layer, and two output nodes with a softmax activation function.
– Deep MLP (DMLP): Another hidden layer of 128 nodes was added to the basic MLP. The dropout technique was used for both hidden layers.

Table 1 shows the accuracy and F1-Score values of the two models MLP and
DMLP. Note that F1-Score is the weighted average of precision and recall.

Table 1. Models performances

Model Accuracy F1-Score


MLP 71.20% 60.24%
DMLP 92.95% 98.74%

The DMLP did better, which was expected, since the added layer helped perform more computations and recognize more useful patterns for distinguishing between the two classes [25]. Both Accuracy and F1-Score were voters for the two MLP candidates, and the Borda count voting system was used to select the best MLP architecture, which turned out to be the DMLP.

Fig. 1. LIME explanations of the first four instances by SubmodularPick

6.2 Interpretability Results

LIME was used for the local interpretability of the best MLP model. It focuses on training local surrogate (interpretable) models to explain individual predictions, so the decider can understand why the model predicted a certain class for a particular instance. A set of four instances was chosen using the SubmodularPick algorithm for the best model, and the plotted explanations were further interpreted to understand how the best model uses the features.
In Fig. 1, we notice in the upper two explanations how cellSizeUniformity, barNuclei, cellShapeUniformity, and normalNucleoil switched the prediction when they all increased, which shows that instances tend to be classified as benign more often when the uniformities are very low.
In the third explanation, barNuclei being in the 2.9 bucket and the uniformities in 3.6 had a huge impact on classifying the instance as malignant; although blandChromatin was higher than 6, which voted for the benignity of the instance, the model's decision was affected more by the first three features.
In the fourth explanation, it was again the uniformity features as well as barNuclei and normalNucleoil that affected the prediction the most, but differently: some "voted" for malignant, such as normalNucleoil and cellShapeUniformity, since the first was high and the second average, while the others, barNuclei and cellSizeUniformity, "voted" for benign since they were very low.

7 Threats to Validity
In this work, parameter tuning was ignored because we focused on interpretability; applying a tuning technique could be interesting and might give better results [5].
The medical domain is a very critical one, and using one data-set is not enough to select the best classifier nor to establish its trustworthiness. Moreover, to check the reliability of the explanations, using interpretability techniques other than LIME would also help in understanding the model.
The interpretability techniques were applied to an MLP. Constructing and interpreting other types of black-box models, such as Support Vector Machines, would help understand those techniques further and generalize their effectiveness in interpretation.

8 Conclusion and Future Work


The present study constructed and evaluated two MLPs, of which the deeper one did better, and explained four of its predictions using LIME. The LIME interpretability framework proved highly useful in understanding the model's behavior and increasing trustworthiness, which can help domain experts understand or discover hidden patterns. We noticed that the model focused mostly on the uniformity and barNuclei features to decide on the prediction.
Ongoing work will focus on examining other interpretability techniques, whether global or local, since it will be interesting to enhance the trustworthiness of models as well as the understanding of how those techniques work.
If a model is not highly interpretable, a domain might not be legally permitted to use its insights to make changes to processes. ML explainability is defying black-box non-transparency to gain both high accuracy and high interpretability.

References
1. Al-Hajj, M., Wicha, M.S., Benito-Hernandez, A., Morrison, S.J., Clarke, M.F.:
Prospective identification of tumorigenic breast cancer cells. Proc. Nat. Acad. Sci.
100(11), 6890 (2003). Correction to 100(7):3983
2. Solanki, K.: Application of data mining techniques in healthcare data, vol. 148,
no. 2, p. 1622 (2016)
3. Idri, A., Chlioui, I., El Ouassif, B.: A systematic map of data analytics in breast
cancer. In: ACM International Conference Proceeding Series. Association for Com-
puting Machinery (2018)
4. Hosni, M., Abnane, I., Idri, A., de Gea, J.M.C., Fernandez Aleman, J.L.: Review-
ing ensemble classification methods in breast cancer. Comput. Methods Programs
Biomed. 177, 89–112 (2019)
5. Idri, A., Hosni, M., Abnane, I., de Gea, J.M.C., Fernandez Aleman, J.L.: Impact
of parameter tuning on machine learning based breast cancer classification. In:
Advances in Intelligent Systems and Computing, vol. 932, pp. 115–125. Springer
(2019)
24 H. Hakkoum et al.

6. Chlioui, I., Idri, A., Abnane, I., de Gea, J.M.C., Fernandez Aleman, J.L.:. Breast
cancer classification with missing data imputation. In: Advances in Intelligent Sys-
tems and Computing, vol. 932, pp. 13–23. Springer (2019)
7. Aurangzeb, A.M., Eckert, C., Teredesai, A.: Interpretable machine learning in
healthcare. In: Proceedings of the 2018 ACM International Conference on Bioinfor-
matics, Computational Biology, and Health Informatics, BCB 2018, pp. 559–560.
ACM Press, New York (2018)
8. Oracle’s unified framework for Model Interpretation. https://github.com/oracle/
Skater
9. Thomas, A.: An introduction to neural networks for beginners. Technical report in
Adventures in Machine Learning (2017)
10. Gardner, M.W., Dorling, S.R.: Artificial neural networks (the multilayer percep-
tron) - a review of applications in the atmospheric sciences. Atmos. Environ. 32(14–
15), 2627–2636 (1998)
11. Idri, A., Khoshgoftaar, T., Abran, A.: Can neural networks be easily interpreted
in software cost estimation? In: 2002 IEEE World Congress on Computational
Intelligence. IEEE International Conference on Fuzzy Systems, FUZZ-IEEE 2002.
Proceedings (Cat. No.02CH37291), vol. 2, pp. 1162–1167. IEEE (2002)
12. Miller, T.: Explanation in artificial intelligence: insights from the social sciences.
Artif. Intell. J. 267, 1–38 (2017)
13. Kim, B., Khanna, R., Koyejo, O.: Examples are not enough, learn to criticize! Crit-
icism for interpretability. In: Advances in Neural Information Processing Systems
(NIPS 2016), vol. 29 (2016)
14. Ribeiro, M.T., Singh, S., Guestrin, C.: “Why should I trust you?” Explaining the
predictions of any classifier. In: Proceedings of the ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, 13–17 August 2016, pp.
1135–1144. Association for Computing Machinery (2016)
15. Molnar, C.: Interpretable Machine Learning. A Guide for Making Black Box Models
Explainable (2018). https://christophm.github.io/book/
16. Puri, N., Gupta, P., Agarwal, P., Verma, S., Krishnamurthy, B.: MAGIX: model
agnostic globally interpretable explanations (arXiv) (2017)
17. Lazzeri, F.: Automated and Interpretable Machine Learning - Microsoft Azure -
Medium (2019)
18. Benitez, J.M., Castro, J.L., Requena, I.: Are artificial neural networks black boxes?
IEEE Trans. Neural Netw. 8(5), 1156–1164 (1997)
19. https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(original)
20. Chawla, N., Bowyer, K., Hall, L., Kegelmeyer, W.: SMOTE: synthetic minority
over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
21. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep con-
volutional neural networks. In: Advances in Neural Information Processing Sys-
tems, vol. 25, no. 2 (2012)
22. de Borda, J.C.: Mémoire sur les élections au scrutin, Mémoire de l’Académie
Royale. Histoire de l’Académie des Sciences, Paris, pp. 657–665 (1781)
23. Risse, M.: Why the count de Borda cannot beat the Marquis de Condorcet. Soc.
Choice Welfare 25(1), 95–113 (2005)
24. Gupta, P.: Cross-Validation in Machine Learning - Towards Data Science (2017)
25. Reed, R., Marks II, R.J.: Neural Smithing: Supervised Learning in Feedforward
Artificial Neural Networks, p. 38 (1999)
Energy Efficiency and Usability
of Web-Based Personal Health Records

José Alberto García-Berná1(B), Sofia Ouhbi2, José Luis Fernández-Alemán1,
Juan Manuel Carrillo-de-Gea1, and Joaquín Nicolás1
1 Department of Informatics and Systems, Faculty of Computer Science,
University of Murcia, Murcia, Spain
{josealberto.garcia1,aleman,jmcdg1,jnr}@um.es
2 Department of Computer Science and Software Engineering, CIT,
United Arab Emirates University, Al Ain, UAE
[email protected]

Abstract. Usability is a critical aspect in the adoption of e-health
applications. However, its impact on energy consumption has not been
thoroughly studied in the e-health domain. The aim of this paper is to
investigate the relationship between energy efficiency and usability in the
context of personal health records (PHRs). A total of 5 web-based PHRs out
of 19 were selected for this study, and the energy consumption of these
PHRs was assessed when performing 20 tasks. The results showed that the
PHRs with the best usability practices consumed more energy than the
others. Based on the findings of this study, recommendations for
practitioners on how to trade off usability against energy efficiency were
proposed.

Keywords: Energy efficiency · Usability · Software sustainability ·
Green software · Personal health records · e-health

1 Introduction
Energy efficiency and energy consumption are considered fundamental
sustainability characteristics in architectural design [29]. Whilst energy
consumption is the amount of power used to operate a technology, energy
efficiency refers to the use of as little energy as possible in a particular
system. Software sustainability is attracting the attention of researchers
[3,27,28]. Sustainability in e-health
technologies is an important challenge for the healthcare industry [20]. Energy
efficiency will need to improve because of the large amount of health data
that will accumulate with advances in Information and Communications
Technology (ICT) [5].

This research is part of the BIZDEVOPS-GLOBAL-UMU (RTI2018-098309-B-C33)
project, and the Network of Excellence in Software Quality and Sustainability
(TIN2017-90689-REDT). Both projects are supported by the Spanish Ministry of
Science, Innovation and Universities and the European Regional Development
Fund (ERDF).

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 25–35, 2020.
https://doi.org/10.1007/978-3-030-45697-9_3

Personal health records (PHRs) are a promising solution to
the sustainability gap in public healthcare systems. The literature on PHRs
is growing rapidly even though this technology is still in its infancy [15].
PHRs are tools that allow people to store and manage their
health data [10]. PHRs are managed by the users themselves and have the poten-
tial to empower patients [11]. Providing preventive measures and self-treatment
instructions to people could reduce the demand for healthcare [22].
The usability of PHRs plays an important role in their acceptance among
users. Poor usability and functionality have been shown to result in low
utility, which affects enrollment rates among patients and clinicians alike
[6]. Patients have a positive attitude towards sharing medical information as
they consider it a way to promote a better healthcare service [14], but they are
usually concerned with the usability and privacy issues when accessing e-health
applications [25]. Usability helps healthcare organizations in the customization
of e-health interventions [17]. To this end, several approaches have been
used to evaluate usability in e-health applications, such as end-user surveys
[9] and the think-aloud protocol [30]. Usability requirements were proposed
in a reusable requirements catalogue for sustainable connected health
applications [21]. Although
some studies about energy efficiency and usability in e-health can be found, the
relationship of these two factors in PHRs has not been studied before to the
best of our knowledge. Motivated by this lack of research, this paper studies
the relationship between energy consumption and usability, measured over a
set of tasks performed in a group of web-based PHRs.
The remainder of this paper is organized as follows: Sect. 2 presents the mate-
rials and method used in this study, Sect. 3 presents the results, Sect. 4 presents
recommendations based on the results, and Sect. 5 concludes with principal find-
ings and future works.

2 Materials and Methods


2.1 PHRs Under Study

The PHRs were selected from a previous study [10]. The method proposed by the
Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA)
group [19] was employed to ensure accuracy and impartiality in the selection. The
search was performed at myPHR.com and in ACM Digital Library, IEEE Digital
Library, Medline and ScienceDirect. A web-based format was the inclusion
criterion (IC), with which a total of 19 PHRs were initially collected (see Fig. 1). A
refinement of the results was performed with the following exclusion criteria:
non-available PHRs (EC1), non-free PHRs (EC2), registration not possible
(EC3), malfunctioning (EC4), only available in the USA (EC5), and
low-popularity PHRs (EC6). EC6 was applied through the Alexa website
(alexa.com/siteinfo), an online ranking tool that measures traffic to web portals.

Fig. 1. PRISMA flow chart

From the 19 PHRs initially selected, those that met any EC were discarded. In
a first round, HealthyCircles, Telemedical, Dr. I-Net, MedsFile.com,
ZebraHealth, EMRySTICK and Dlife were dropped due to EC1, myMediConnect and
Juniper Health because of EC2, RememberItNow! due to EC3, WebMD HealthManager
due to EC4, and PatientPower due to EC5. In a second round, My Health Folders
and My Doclopedia were excluded under EC6: their Alexa ranking marks revealed
a very low popularity of these portals (the higher the mark, the less popular
a website is), and in some cases no Alexa mark was available at all, which
was also taken as an indication of low popularity. The PHRs finally selected
were HealthVet, PatientsLikeMe, HealthVault, Health Companion, and
NoMoreClipBoard. Together, these PHRs covered as many of the functionalities
provided by this type of tool as possible.
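The two-round screening above amounts to a simple filter over the candidate list. A minimal sketch in Python (the per-PHR criteria flags are transcribed from the rounds described in the text):

```python
# Candidate PHRs mapped to the exclusion criteria (ECs) they met,
# following the two rejection rounds described in the text.
candidates = {
    "HealthyCircles": {"EC1"}, "Telemedical": {"EC1"}, "Dr. I-Net": {"EC1"},
    "MedsFile.com": {"EC1"}, "ZebraHealth": {"EC1"}, "EMRySTICK": {"EC1"},
    "Dlife": {"EC1"}, "myMediConnect": {"EC2"}, "Juniper Health": {"EC2"},
    "RememberItNow!": {"EC3"}, "WebMD HealthManager": {"EC4"},
    "PatientPower": {"EC5"}, "My Health Folders": {"EC6"},
    "My Doclopedia": {"EC6"}, "HealthVet": set(), "PatientsLikeMe": set(),
    "HealthVault": set(), "Health Companion": set(), "NoMoreClipBoard": set(),
}

# Keep only the PHRs that met no exclusion criterion.
selected = sorted(name for name, ecs in candidates.items() if not ecs)
print(selected)
```

Running the filter yields exactly the five PHRs named above.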
The use of the PHRs was analyzed by carrying out a set of tasks that reflect
common needs identified across several usage profiles [4]. The
recommendations for the development of a PHR from the American
Health Information Management Association [2] were also taken into account to
propose the tasks to be performed in the PHRs. Table 1 depicts a list of the 20
PHR common tasks identified.

Table 1. Typical tasks identified in a PHR

TASK 01: Registration


TASK 02: System access
TASK 03: Add profile
TASK 04: View profile
TASK 05: Manage permissions to 3rd parties
TASK 06: Add family history
TASK 07: Add medication
TASK 08: Add new allergy
TASK 09: Add vaccine
TASK 10: Add disease
TASK 11: View medications
TASK 12: Print report
TASK 13: View glucose evolution
TASK 14: Search for information about conditions
TASK 15: Export health info
TASK 16: Schedule appointments and medication reminder
TASK 17: Send suggestion/contact
TASK 18: See privacy policy
TASK 19: Exit
TASK 20: Forgotten password

2.2 Power Measurements Procurement

The power consumed while performing the tasks was measured with the Energy
Efficient Tester (EET) [18]. This device is equipped with sensors capable of
measuring the instantaneous power consumption of the processor, hard disk
drive, graphics card and monitor, as well as the total power supplied to a host machine.
The experiment was carried out using the EET connected to a thin film
transistor-liquid crystal display (TFT-LCD) monitor (Philips 170S6FS) and a PC
equipped with a GigaByte GA-8I945P-G motherboard, an Intel Pentium D @
3.0 GHz processor, two 1 GB DDR2 @ 533 MHz RAM modules, a Samsung SP2004C
200 GB 7500 rpm hard disk drive, an Nvidia GeForce GTS 8600 graphics card,
and an Aopen Z350-08FC 350 W power supply.
Data were checked to ensure the absence of outliers. To this end, each task
was carried out five times in order to average the results and to smooth any
peak in the power-consumption readings that may have occurred. The operating
system installed on the PC was Microsoft Windows 7 Professional, which
allowed us to disable background processes and reduce the resources required
by the operating system.
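The paper does not publish its analysis scripts; the averaging and outlier check described above might be sketched as follows (the z-score threshold and the sample values are illustrative assumptions, not taken from the study):

```python
from statistics import mean, stdev

def average_runs(runs, z_thresh=3.0):
    """Average repeated power readings (watts), flagging outlier runs.

    runs: per-run mean power values for one task/sensor pair.
    A run is flagged as an outlier if it lies more than z_thresh
    standard deviations away from the mean of the other runs.
    """
    outliers = []
    for i, value in enumerate(runs):
        others = runs[:i] + runs[i + 1:]
        mu, sigma = mean(others), stdev(others)
        if sigma > 0 and abs(value - mu) > z_thresh * sigma:
            outliers.append(i)
    kept = [v for i, v in enumerate(runs) if i not in outliers]
    return mean(kept), outliers

# Five repetitions of one task for the monitor sensor (illustrative values, W)
readings = [57.8, 57.6, 57.7, 57.9, 57.5]
avg, flagged = average_runs(readings)
print(round(avg, 2), flagged)
```

With one clearly anomalous run (e.g. a consumption peak), the run is dropped and the remaining four readings are averaged.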

2.3 Usability Assessment


In this paper, usability is defined according to the ISO/IEC 25010:2011 standard as
“the degree to which a product or system can be used by specified users to achieve
specified goals with effectiveness, efficiency and satisfaction in a specified context
of use.” Usability is decomposed into six sub-characteristics [1]:
– Appropriateness recognizability: degree to which users can recognise whether
a product or system is appropriate for their needs.
– Learnability: degree to which a product or system can be used by specified
users to achieve specified goals of learning to use the product or system
with effectiveness, efficiency, freedom from risk and satisfaction in a
specified context of use.
– Operability: degree to which a product or system has attributes that make it
easy to operate and control.
– User error protection: degree to which a system protects users against making
errors.
– User interface aesthetics: degree to which a user interface enables pleasing
and satisfying interaction for the user.
– Accessibility: degree to which a product or system can be used by people with
the widest range of characteristics and capabilities to achieve a specified goal
in a specified context of use.
Because we are interested in energy efficiency and its relationship with usabil-
ity, we will consider only the following two sub-characteristics: user interface
aesthetics and operability. Other usability sub-characteristics, such as user error
protection, were not considered because the design of this study does
not allow us to investigate their impact on energy consumption thoroughly. For
Operability, the following quality factors will be analyzed in each web-based PHR
selected in this study: Understandable categorization of information, Appearance
consistency, Operational consistency, and Message clarity. For Interface aesthet-
ics, one quality attribute is assigned to this sub-characteristic by ISO/IEC 25023,
which is the appearance aesthetics of the user interface. This quality
attribute can be used to determine to what extent the user interface and
overall design are aesthetically pleasing in appearance.

3 Results
3.1 Energy Consumption of the Selected Web-Based PHRs
The average energy consumption for each PHR and each sensor was calculated
with the data available in this supplementary file (http://tiny.cc/9w72fz). In
Table 2 the cells with the largest values were coloured in red whereas the ones
with the smallest values in green. In NoMoreClipBoard the lowest energy con-
sumption appeared in three sensors (monitor, processor and PC), in HealtVault
in one sensor (hard disk) and in PatiensLikeMe in another one sensor (graphics
card). In contrast, Health Companion spent the highest amount of energy in
three of the sensors (graphics card, hard disk and monitor) and PatiensLikeMe
in two of them (processor and the whole PC).

Table 2. Average power consumption in Watts

Web-based PHR Graphics card Hard disk Monitor Processor PC


HealthVet 1.4070701 14.387069 59.249537 3.9319199 193.34284
HealthVault 1.4005134 14.373165 58.584966 3.3334926 184.47266
NoMoreClipboard 1.3844928 14.408445 57.699905 3.1624559 181.11312
PatientsLikeMe 1.3512662 14.383003 60.515900 5.2833229 228.16677
Health Companion 1.4109732 14.425882 62.705551 4.1794896 204.83057
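The per-sensor extremes discussed in this section can be recomputed directly from the values in Table 2, for example:

```python
# Average power consumption (W) transcribed from Table 2.
table2 = {
    "HealthVet":        {"graphics": 1.4070701, "disk": 14.387069, "monitor": 59.249537, "cpu": 3.9319199, "pc": 193.34284},
    "HealthVault":      {"graphics": 1.4005134, "disk": 14.373165, "monitor": 58.584966, "cpu": 3.3334926, "pc": 184.47266},
    "NoMoreClipBoard":  {"graphics": 1.3844928, "disk": 14.408445, "monitor": 57.699905, "cpu": 3.1624559, "pc": 181.11312},
    "PatientsLikeMe":   {"graphics": 1.3512662, "disk": 14.383003, "monitor": 60.515900, "cpu": 5.2833229, "pc": 228.16677},
    "Health Companion": {"graphics": 1.4109732, "disk": 14.425882, "monitor": 62.705551, "cpu": 4.1794896, "pc": 204.83057},
}

# Report which PHR consumed the least and the most per sensor.
for sensor in ("graphics", "disk", "monitor", "cpu", "pc"):
    lo = min(table2, key=lambda phr: table2[phr][sensor])
    hi = max(table2, key=lambda phr: table2[phr][sensor])
    print(f"{sensor:8s} min: {lo:16s} max: {hi}")
```

This reproduces the statements above: NoMoreClipBoard is lowest for monitor, processor and PC; HealthVault for the hard disk; PatientsLikeMe for the graphics card; and Health Companion is highest for graphics card, hard disk and monitor.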

3.2 Operability and Energy Efficiency


Only Health Companion presented a processing icon while pages were loading in
most cases. Although it is not an environmentally friendly feature, a loading
icon or image can improve the usability of a website.
Scrollbars were present in all the PHRs; this widget was necessary given the
resolution of the display. Processing icons were less common, but were found
in Health Companion when performing Task 15 (export health info). The lowest
energy consumption for the same components was measured in NoMoreClipBoard,
in which no loading icons appeared when performing this task. Scrollbars are
widely used, and the need for them is largely determined by the dimensions
and resolution of the display. Although they are not recommended for
efficient energy consumption, their availability simplifies a portal's usage [7,16].
NoMoreClipBoard showed all the health data at once in “Task 4: View pro-
file”. In contrast, PatientsLikeMe had a wall of updates from the community,
resembling a social network. This widget in PatientsLikeMe required refreshing
the web page, which made it a highly energy-demanding feature. In
NoMoreClipBoard all the medical information was accessible at once in “Task 11: View
Medications”, and thematic icons were present in this case. HealthVet was the
opposite with no thematic icons in Task 11.
Autocomplete made it possible to find an illness name that may be difficult
to remember at first. In the evaluation of “Task 14: Search for information
about conditions”, the extreme power values for the hard disk, the graphics
card and the processor were measured in HealthVet and PatientsLikeMe:
HealthVet showed the maximum power measurements and PatientsLikeMe the
minimum values. This could be explained by the fact that HealthVet opens a
new browser tab to complete the task, making it more complex.
Although NoMoreClipBoard, HealthVet, HealthVault and PatientsLikeMe had
efficient user interfaces, these PHRs should improve their energy consumption
by simplifying the interaction with the user and reducing the time required
to perform specific operations, as shown in Table 3.

3.3 Interface Aesthetics and Energy Efficiency


In terms of power consumption, the lowest CPU energy consumption appeared
in NoMoreClipBoard, whilst the highest power demands when carrying out

Table 3. Time required to complete the tasks in each PHR

Task HealthVet HealthVault NoMoreClipBoard PatientsLikeMe Health Companion
#1 1’17.45” 1’36.79” 1.28” 1’13.05” 2’03.99”
#2 19.33” 45.36” 21.47” 20.26” 14.77”
#3 1’30.24” 26.28” 5’32.05” 1’56.26” 35.11”
#4 8.61” 2.30” 5.92” 10.14” 8.48”
#5 Not available 40.72” 17.54” 8.87” 19.82”
#6 2’05.68” 33.54” 3’44.35” Not available 1’22.56”
#7 1’23.60” 49.66” 46.10” 50.63” 41.20”
#8 1’17.36” 53.51” 22.40” Not available 34.20”
#9 50.56” 1’00.17” 29.85” Not available 43.53”
# 10 47.13” 33” 25.32” 51.34” 42.62”
# 11 5.87” 4.61” 4.94” 5.30” 9.68”
# 12 18.85” 10.06” 22.54” Not available 23.36”
# 13 11.31” 6.76” 18.45” 9.43” 37.22”
# 14 19.38” Not available Not available 12.15” Not available
# 15 33.79” 15.33” 19.92” Not available 23.23”
# 16 51.29” 44.70” Not available Not available 28.16”
# 17 1’09.95” 1’59.10” 27.84” 22.77” 25.23”
# 18 5.95” 4.52” 6.34” 7.09” 4.30”
# 19 3” 10.62” 5.36” 2.69” 4.33”
# 20 Not available 47.80” Not available 20.71” Not available

the proposed tasks were generated in PatientsLikeMe, despite its good
human-interface design. Both PHRs had an important difference concerning the
user interface that could explain their different power needs.
Two main graphical user interface (GUI) factors impacted the display's energy
efficiency [26]: the color scheme and screen changes. The PHRs were grouped
according to GUI complexity: Health Companion's GUI was the simplest one;
HealthVet, HealthVault and PatientsLikeMe presented a middle complexity; and
NoMoreClipBoard had the most elaborate GUI. Bearing these groups in mind, the
greater the complexity, the greater the amount of energy spent by the
graphics card in “Task 6: Add family history”. This component spent 1.35 W in
Health Companion and 1.41 W in NoMoreClipBoard.
Significant differences in power consumed by the monitor were found between
NoMoreClipBoard and Health Companion. This could be explained by the fact
that both PHRs had a different color scheme. There were more dark areas in
Health Companion than in NoMoreClipBoard, resulting in a higher power demand
for Health Companion. Solid colours stood out in Health Companion, whereas
gradient colors were common in NoMoreClipBoard, defining the color scheme of
each PHR. Moreover, the GUI of NoMoreClipBoard was brighter than that of
Health Companion, which also shaped the power needs of the TFT monitor

in the experiment. Solid colors appeared in most cases in the GUIs of the
PHRs. NoMoreClipBoard was the only one with gradient colors, which affected
the switching characteristics between screenshots. The TFT-LCD employed in
this experiment consumed more power when dark tonalities were shown on the
monitor, which could be related to the textures in NoMoreClipBoard [26].
The biggest widgets were relevant in terms of power needs when performing the
tasks. The biggest widgets were shown in Tasks 7 to 10 in Health Companion,
and the energy spent by the monitor in this PHR stood out. The solid color
scheme and the dark tonalities of this PHR could explain its high energy
need. Big widgets also appeared in HealthVet; moreover, they were closer to
each other in the aforementioned tasks. This PHR revealed the lowest power
consumption for the hard disk drive.

4 Discussion

After analysing the results, several alternatives to be implemented in the
PHRs have been identified, which may reduce the time required to complete a
task and consequently reduce power consumption.

1. Automatic jumping. Whenever the number of characters in a form field is
known (e.g. phone numbers, dates, insurance numbers), automatic jumping of
the cursor to the next field could be implemented in the PHRs. HealthVet
had a form where the cursor moved to the next field when completing the
Social Security Number.
2. Macros. Grouping a set of actions into a macro produces a more efficient
user interface [23]. Nevertheless, no macros were found in any of the PHRs
studied, and this feature could improve the sustainability of these tools.
3. Autocompletion. Input caches are relevant when a small number of known
inputs occur very often. Previous experiments showed that autocompletion
becomes more energy-efficient when it saves the typing of at least three
additional letters [26]. PatientsLikeMe, HealthVet, NoMoreClipBoard and
Health Companion offered autocompletion by caching previous inputs to
reduce the input time. This feature was present for the reason of a
hospitalization, the name of a medical test, conditions, symptoms and
treatments in PatientsLikeMe, and for the name of the insurance company,
medical providers, medications, illnesses, medical procedures,
immunizations, allergies and conditions in NoMoreClipBoard. In HealthVet,
autocompletion was available when searching for information about
conditions, and in Health Companion when adding a condition, laboratory
data, medications and vaccines. HealthVault did not have autofilling. Note
that this feature can be useful to start becoming familiar with a
particular medical situation, especially with health terms that can be
difficult to remember at first.

4. Hick-Hyman Law. This law models the time that the human cognitive process
of decision-making takes [12,13]. People divide the number of options into
groups, eliminating around half of the remaining choices at each step, so
the law proposes a logarithmic relationship between reaction time and the
number of choices available. In addition, when users must consider each
option one at a time, the relationship between response time and the
number of choices has been found to be linear [8]. Therefore, as few
choices as possible should be available in a GUI to take advantage of the
Hick-Hyman Law. To this end, the most common functionality should be split
into a smaller menu [24]. HealthVault, HealthVet and PatientsLikeMe
followed this law. The navigation to find information in these PHRs was
divided into drop-down menus. In HealthVault there was a left column with
the main options of the PHR and a second-level menu to view the medical
information. In HealthVet and PatientsLikeMe there was a first-level menu
with the main options in the header of the web page; after clicking on
this menu, a left column appeared to retrieve the medical information.
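The logarithmic versus linear relationships invoked above can be made concrete with a small sketch; the coefficients below are illustrative assumptions, not values measured in the paper:

```python
import math

# Illustrative coefficients: A is a base reaction time and B the
# per-bit processing time, both in seconds (not taken from the study).
A, B = 0.2, 0.15

def hick_hyman(n):
    """Decision time when users can binary-split the options (log law)."""
    return A + B * math.log2(n + 1)

def linear_scan(n):
    """Decision time when each option must be considered one at a time."""
    return A + B * n

for n in (4, 8, 16):
    print(n, round(hick_hyman(n), 3), round(linear_scan(n), 3))
```

Under any positive coefficients, the logarithmic (Hick-Hyman) time grows far more slowly with the number of choices than the linear scan, which is why offering few, well-structured menu options pays off.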

5 Conclusions and Future Works

This paper investigated the relationship between usability and energy efficiency
of five web-based PHRs. The findings showed that achieving good usability
while consuming less energy is challenging, as it depends on factors related
to the hardware employed and the users' manipulation of the system. The
results allowed us to suggest recommendations about the energy efficiency of
PHRs. However, the
inclusion of only five PHRs might have impacted the results. Further studies with
more PHRs and more tasks to be performed should be conducted to confirm our
results. For future work, we intend to propose a reusable requirement catalogue
for usable and energy efficient e-health applications and use energy management
systems during the performance of tasks to validate our catalogue.

References
1. ISO/IEC 25010 standard. Systems and Software Engineering – Systems and Soft-
ware Quality Requirements and Evaluation (SQuaRE) – System and Software
Quality Models (2011)
2. AHIMA: American Health Information Management Association (2019). Accessed
Dec 2019. http://www.ahima.org/
3. Ahmad, R., Hussain, A., Baharom, F.: A systematic review on characteristic and
sub-characteristic for software development towards software sustainability. Envi-
ronment 20, 34 (2015)
4. Archer, N., Fevrier-Thomas, U., Lokker, C., McKibbon, K.A., Straus, S.E.: Per-
sonal health records: a scoping review. J. Am. Med. Inform. Assoc. 18(4), 515–522
(2011)
5. Bhatt, C., Dey, N., Ashour, A.S.: Internet of Things and Big Data Technologies
for Next Generation Healthcare, vol. 23. Springer, Cham (2017)

6. Bidargaddi, N., van Kasteren, Y., Musiat, P., Kidd, M.: Developing a third-party
analytics application using Australia’s national personal health records system:
case study. JMIR Med. Inform. 6(2), e28 (2018)
7. Breuninger, J., Popova-Dlugosch, S., Bengler, K.: The safest way to scroll a list: a
usability study comparing different ways of scrolling through lists on touch screen
devices. IFAC Proc. Vol. 46(15), 44–51 (2013)
8. Cockburn, A., Gutwin, C.: A predictive model of human performance with scrolling
and hierarchical lists. Hum.-Comput. Interact. 24(3), 273–314 (2009)
9. Farzandipour, M., Meidani, Z., Riazi, H., Sadeqi Jabali, M.: Task-specific usability
requirements of electronic medical records systems: lessons learned from a national
survey of end-users. Inform. Health Soc. Care 43(3), 280–299 (2018)
10. Fernández-Alemán, J.L., Seva-Llor, C.L., Toval, A., Ouhbi, S., Fernández-Luque,
L.: Free web-based personal health records: an analysis of functionality. J. Med.
Syst. 37(6), 9990 (2013)
11. Helmer, A., Lipprandt, M., Frenken, T., Eichelberg, M., Hein, A.: Empowering
patients through personal health records: a survey of existing third-party web-
based PHR products. Electron. J. Health Inform. 6(3), 26 (2011)
12. Hick, W.E.: On the rate of gain of information. Q. J. Exp. Psychol. 4(1), 11–26
(1952)
13. Hyman, R.: Stimulus information as a determinant of reaction time. J. Exp. Psy-
chol. 45(3), 188 (1953)
14. Karampela, M., Ouhbi, S., Isomursu, M.: Exploring users’ willingness to share
their health and personal data under the prism of the new GDPR: implications in
healthcare. In: 41st Annual International Conference of the IEEE Engineering in
Medicine and Biology Society (EMBC), pp. 6509–6512. IEEE (2019)
15. Kelly, M.M., Coller, R.J., Hoonakker, P.: Inpatient portals for hospitalized patients
and caregivers: a systematic review. J. Hosp. Med. 13(6), 405–412 (2018)
16. Leung, R., MacLean, K., Bertelsen, M.B., Saubhasik, M.: Evaluation of haptically
augmented touchscreen GUI elements under cognitive load. In: 9th International
Conference on Multimodal Interfaces, pp. 374–381. ACM (2007)
17. Lyerla, F., Durbin, C.R., Henderson, R.: Development of a nursing electronic med-
ical record usability protocol. CIN: Comput. Inform. Nurs. 36(8), 393–397 (2018)
18. Mancebo, J., Arriaga, H.O., Garcı́a, F., Moraga, M., Garcı́a-Rodrı́guez de Guzmán,
I., Calero, C.: EET: a device to support the measurement of software consumption.
In: 6th International Workshop on Green and Sustainable Software (GREENS),
pp. 16–22 (2018)
19. Moher, D., Liberati, A., Tetzlaff, J., Altman, D.G.: Preferred reporting items
for systematic reviews and meta-analyses: the PRISMA statement. Ann. Intern. Med.
151(4), 264–269 (2009)
20. Ouhbi, S.: Sustainability and internationalization requirements for connected
health services: method and applications. Proyecto de investigación (2018)
21. Ouhbi, S., Fernández-Alemán, J.L., Toval, A., Rivera Pozo, J., Idri, A.: Sustain-
ability requirements for connected health applications. J. Softw.: Evol. Process
30(7), e1922 (2018)
22. Rantanen, M.M., Koskinen, J.: PHR, we’ve had a problem here. In: IFIP Interna-
tional Conference on Human Choice and Computers, pp. 374–383. Springer (2018)
23. Savelyev, A., Brookes, E.: GenApp: extensible tool for rapid generation of web and
native GUI applications. Future Gener. Comput. Syst. 94, 929–936 (2017)
24. Sears, A., Shneiderman, B.: Split menus: effectively using selection frequency to
organize menus. ACM Trans. Comput.-Hum. Interact. (TOCHI) 1(1), 27–51 (1994)

25. Staccini, P., Lau, A.Y., et al.: Findings from 2017 on consumer health informatics
and education: health data access and sharing. Yearb. Med. Inform. 27(01), 163–
169 (2018)
26. Vallerio, K.S., Zhong, L., Jha, N.K.: Energy-efficient graphical user interface design.
IEEE Trans. Mob. Comput. 5(7), 846–859 (2006)
27. Venters, C., Lau, L., Griffiths, M., Holmes, V., Ward, R., Jay, C., Dibsdale, C., Xu,
J.: The blind men and the elephant: towards an empirical evaluation framework
for software sustainability. J. Open Res. Softw. 2(1), e8 (2014)
28. Venters, C.C., Capilla, R., Betz, S., Penzenstadler, B., Crick, T., Crouch, S., Naka-
gawa, E.Y., Becker, C., Carrillo, C.: Software sustainability: research and practice
from a software architecture viewpoint. J. Syst. Softw. 138, 174–188 (2018)
29. Villa, L., Cabezas, I., Lopez, M., Casas, O.: Towards a sustainable architectural
design by an adaptation of the architectural driven design method. In: International
Conference on Computational Science and Its Applications, pp. 71–86. Springer
(2016)
30. Yen, P.Y., Walker, D.M., Smith, J.M.G., Zhou, M.P., Menser, T.L., McAlearney,
A.S.: Usability evaluation of a commercial inpatient portal. Int. J. Med. Inform.
110, 10–18 (2018)
A Complete Prenatal Solution
for a Reproductive Health Unit in Morocco

Mariam Bachiri1, Ali Idri1,2(&), Taoufik Rachad1, Hassan Alami3,
and Leanne M. Redman4
1 Software Project Management Research Team, Department of Web
and Mobile Engineering, ENSIAS, Mohammed V University in Rabat,
Rabat, Morocco
[email protected]
2 CSEHS, Mohammed VI Polytechnic University, Ben Guerir, Morocco
3 Faculty of Medicine, Mohammed V University in Rabat, Rabat, Morocco
4 Pennington Biomedical Research Center, Baton Rouge, LA 70808, USA

Abstract. A prenatal mobile Personal Health Record (mPHR) and an
Electronic Health Record (EHR) are exploited, respectively, to allow
pregnant women and gynecologists or obstetricians to monitor the progress
of a pregnancy under the best conditions. To this end, a complete solution
consisting of a prenatal mPHR and an EHR was developed for the
“Les Orangers” maternity unit of the Avicenne University Hospital in Rabat.
The complete solution provides the main functionalities of a prenatal
service. Thereafter, the solution will be validated by conducting an
experiment to assess its quality and potential. Hence, a recruitment
process has been defined to establish the eligibility criteria for
enrolling participants (pregnant women and gynecologists), in addition to
planning the course of the experiment.

Keywords: Mobile personal health records · Electronic health records ·
Experiment · Prenatal · Pregnancy

1 Introduction

A pregnancy can encounter disorders or medical conditions that might influence its
progress. These can be related to the obstetrical history, medical
complications, lifestyle choices, nutrition, cardiac diseases, diabetes or hypertensive
disorders [1]. Such factors can produce health troubles before, during and after
delivery, which implies regular prenatal checkups with an obstetrician and gynecologist
in order to track the mother's and the baby's health [2]. During these checkups, health
data related to the pregnant woman and her baby are registered in their health records.
The classical form of these health records are paper-based health records. They allow
pregnant women access their health data, by meticulously following the progress of
their pregnancy [3]. Hence, pregnant women who can access their health records are
more informed and aware about potential risks while being pregnant, which can help

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 36–43, 2020.
https://doi.org/10.1007/978-3-030-45697-9_4

them take better decisions about their health [4]. However, paper-based health records remain inadequate: they are exposed to loss, they cannot easily be shared with healthcare providers, and it is difficult to locate each piece of information included in them.
Since pregnant women should also monitor their health away from the hospital, they need to communicate remotely with their healthcare providers when necessary [5]. Prenatal mobile Personal Health Records (mPHRs) are useful for these purposes. They are mobile applications available on the app stores, which can be installed on smartphones and allow pregnant women to consult, record and control their health data whenever required [6]. Prenatal mPHRs can be connected to Electronic Health Records (EHRs), which are implemented in hospitals to be used only by obstetricians, gynecologists or their assistants. EHRs can be either web or desktop applications accessible from computers.
Facilitating access to health data, to information about the progress of pregnancy and to communication with healthcare providers helps improve the health status of pregnant women and their infants, and therefore indirectly reduces mortality rates.
As part of a collaboration with the maternity “Les Orangers” in Rabat, a prenatal
mPHR and EHR were developed, based on the specifications that were extracted while
visiting the maternity several times. Afterwards, these applications will be evaluated in
the course of an experiment, which will be conducted among selected pregnant women
and obstetricians or gynecologists who will participate in this experiment.
The remainder of this paper is structured as follows: An overview of prenatal
mobile personal health records is introduced in Sect. 2. The developed solution is
presented in Sect. 3. The implementation of the solution is explained in Sect. 4. Section 5 describes the experiment design of the solution. Lastly, Sect. 6 provides conclusions and future work.

2 Prenatal Mobile Personal Health Records: An Overview

Prenatal mPHRs are mobile health applications that allow a pregnant woman to access, record and share her health data with healthcare professionals, in order to carry out accurate and consistent monitoring of her health and the baby’s health [7].
According to a previous study [6], these mobile apps generally include features
such as: Calendar and reminders for follow-ups and important appointments, infor-
mation about the progress of pregnancy regarding the mother and the baby’s health,
health habits to be followed during pregnancy, in addition to recorders and counters for
baby kicks and contractions. Furthermore, among the data that should be collected in
the prenatal mPHRs are the pregnant woman’s personal details, the physical body
information (e.g. weight, blood pressure, glucose, heart frequency), her medical
history (e.g. allergies or immunizations) and her obstetrical history (e.g. contraception
methods used, information about previous pregnancies and the status of her current
pregnancy) [7].

3 A New Prenatal Solution

This section presents the purpose of the new prenatal solution as well as its requirements specification.

3.1 Purpose
A prenatal mPHR was developed for pregnant women to follow up their pregnancy and stay in touch with their doctors, while having a view of their personal health records. This mobile application interacts with the EHR, a web application implemented for the healthcare providers (doctors and assistants). Hence, it facilitates the interaction between the pregnant woman and her doctor.
The doctor’s role is to fill out the health records of his patients during each consultation, in order to follow their health state through the EHR. Appointments, patients, doctors and their availability are also managed in the EHR.
As for pregnant women, they will be able to access and consult their personal health records through the prenatal mPHR. In addition, they can book appointments according to the availability of doctors, consult information about the progress of the pregnancy as regards the mother and the baby, and record contractions, baby kicks, and measured weight and blood pressure, which will be accessible to doctors via the EHR.

3.2 Requirements Specification


Based on the requirements catalog of prenatal mPHRs conceived in a previous study [8], and on scheduled visits to the maternity “Les Orangers” during which we attended real-time consultations and discussed with the medical staff, a set of requirements for both the prenatal mPHR and the EHR was identified.
Hence, the following functional requirements were implemented in the developed
prenatal mPHR:
• Displaying an up-to-date version of the health records received from the EHR.
• Recording the weight and blood pressure measurements that have been taken by the
pregnant woman, which will be available for the doctors through the EHR.
• Consulting the availability of doctors for consultations, received from the EHR, and
taking appointments.
• Entering measurements (Weight/Blood pressure: Lists and graphs by date) and
displaying these measurements to the doctor through the EHR.
• Authenticating by email and password. The password is assigned to each patient at
the creation of their personal health records, and can be changed later by the patient
if needed.
• Consulting information about the progress of pregnancy per week (1–40).
• Contractions counter.
• Baby kicks counter.
• The application can support other languages in the future.

As regards the non-functional requirements mentioned in the previously conceived catalog [8], the mPHR should meet the Operability, Performance efficiency, Reliability, Functional suitability, Sustainability and I18n requirements.
Moreover, the following functional requirements were implemented in the devel-
oped EHR:
• Creating a PHR for the pregnant woman.
• Editing the PHR.
• Managing the available days for consultations.
• Consulting the measurements of the weight and blood pressure that were recorded
by the patient from the mPHR.
• Managing patients by adding, editing, deleting and searching.
• Authenticating by email and password.
The creation of the PHR involves collecting the following data:
1. Personal information: Last name and first name, date of birth, blood group, height,
phone number, address, doctor’s name, family situation, age of marriage and
profession.
2. Medical History: Number of children, number of previous pregnancies, chronic diseases, genetic diseases, contraceptive method used, surgical antecedents, gynecological antecedents and obstetrical antecedents.
3. Current pregnancy: Expected date of delivery, weight before pregnancy and last
menstrual period date.
4. Measurements: Measurement date, weight and blood pressure.
5. Consultation: Consultation date, weight, blood pressure, uterine height, observations, treatment, pelvic exam, breast examination, conjunctive state, echography and blood analysis.

4 Implementation of the Prenatal Solution

4.1 The Prenatal mPHR


The prenatal mPHR is an Android mobile application developed in Java, one of the official languages for developing native Android applications [9]. Native applications have the best performance, are more secure, interactive and intuitive, and allow developers to access the full features of the device.
In order to use the prenatal mPHR, the user is asked first to define a PIN code to
secure the access to the application. Secondly, she is asked to sign in using her email
and the password that has been assigned to her at the creation of her PHR. Once signed
in, the user can access general and diverse weekly information about pregnancy, from
week 1 to week 40, either related to the mother or the baby. She can also consult her
PHR, which was filled in by the doctor or the assistant, via the EHR, including her
personal and medical information. Moreover, the user can visualize a calendar displaying the available days for consultations. Once an available day is selected, the user can choose a convenient time for the consultation and book an appointment.

Furthermore, the user can record baby kicks and contractions, using recorders that save the history of the records in the application. For instance, to record contractions, the user marks the start of a session and, every time a contraction occurs, clicks a button to record it; when the contractions stop, she ends the session and the records are automatically saved. Lastly, the user can enter her measured weight or blood pressure, with a given date and time, and then visualize the progress history of these variables as graphs or lists.
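The contraction-recording session described above (mark the start, record each contraction, end the session, history saved automatically) can be sketched as a small state model. The sketch below is illustrative only: the class and method names (ContractionRecorder, startSession, and so on) are our assumptions, not the actual Android implementation, which is written in Java.

```javascript
// Hedged sketch of the contraction-recording session logic described
// in the text; all names are hypothetical, not the app's real API.
class ContractionRecorder {
  constructor() {
    this.sessions = [];  // saved history of completed sessions
    this.current = null; // session in progress, if any
  }
  startSession(now = Date.now()) {
    // Marks the start of a recording session.
    this.current = { start: now, contractions: [] };
  }
  recordContraction(now = Date.now()) {
    // Each button press during a session records one contraction.
    if (!this.current) throw new Error("no session in progress");
    this.current.contractions.push(now);
  }
  endSession(now = Date.now()) {
    // Ends the session; the record is saved to the history automatically.
    if (!this.current) throw new Error("no session in progress");
    this.current.end = now;
    this.sessions.push(this.current);
    const finished = this.current;
    this.current = null;
    return finished;
  }
}
```

The same start/record/end pattern would apply to the baby-kicks recorder, with kicks in place of contractions.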
Figure 1 (a–j) in the Appendix demonstrates some screenshots of the developed prenatal mPHR. The Appendix is accessible via the following link: https://www.um.es/giisw/prenatal/Appendix.pdf.

4.2 The EHR


The EHR is a web application developed using HyperText Markup Language (HTML), Cascading Style Sheets (CSS) and JavaScript. Google Firebase [10] was used to provide a real-time database, to handle the authentication of doctors in the EHR, and to host the EHR.
Through the EHR, the doctors or assistants can manage the patients’ health records
by consulting, adding, editing and deleting them. Note that deleting a patient’s health
records leads to denying her access to the mPHR. Moreover, the list of doctors can be
managed as well, by adding a new doctor, editing and deleting a specific one. Note that
deleting a doctor leads to denying his access to the EHR. Furthermore, the set of consultations for each patient is managed, since the doctor can add, edit and delete consultations. Doctors can also specify the days on which they are available for consultations, or remove them. The doctors or assistants can consult, in real time, the list of
appointments taken by patients as well. In this list, the date, time, and name of the
patients are indicated. Lastly, the doctors or assistants can consult the weight and blood
pressure measurements recorded by the pregnant women in the mPHR.
Figure 2 (a–f) in the Appendix shows some screenshots of the developed EHR. The Appendix is accessible at the following link: https://www.um.es/giisw/prenatal/Appendix.pdf.

5 Experiment Design

This section presents the experimental design we will follow to evaluate the quality and
the usefulness of our prenatal solution. Firstly, we present the criteria we will use to
recruit participants (pregnant women and gynecologists). Secondly, we describe the
recruitment process.

5.1 Selection Criteria


The aim of this experiment is to evaluate the quality and potential of both the prenatal
mPHR and EHR. For this purpose, pregnant women and obstetricians/gynecologists
will be recruited in order to carry out this experiment.

A set of inclusion criteria (ICs) and exclusion criteria (ECs) was defined for this selection.
Hence, pregnant women are eligible for the experiment if they:
• are aged between 18 and 45 years.
• are resident in either Rabat or Casablanca.
• are currently pregnant.
• have a moderate level of experience with mobile applications.
• own a smartphone that runs a recent version of Android.
• are willing to comply with all study procedures.
Otherwise, pregnant women are ineligible if they:
• are planning to relocate from the study area within the next two years.
• are currently smoking.
• use alcohol and drugs.
• have a severe debilitating illness preventing their participation.
• are infertile.
• are sterilized.
• are in premature or normal menopause.
• are not planning to deliver at public hospitals in Casablanca or Rabat.
• are planning to terminate their pregnancy.
• have a history of three or more consecutive pregnancy complications.
As for obstetricians/gynecologists, they are eligible for the experiment only if they
are practicing in public hospitals in Rabat or Casablanca.
The experiment will be then carried out in two phases: (1) Before delivery: The
selected pregnant women will be assigned one of the selected obstetricians/
gynecologists and will be asked to use the developed prenatal mPHR to monitor
their pregnancy. Throughout their use of the prenatal mPHR, they will have to access
and record their own health data (weight, blood pressure, baby kicks and contractions)
and connect with their obstetrician/gynecologist in real-time. Moreover, they will have
to set appointments for consultations. The obstetrician/gynecologist will guide the pregnant women throughout their pregnancy until the due date, and will be at their disposal if any complication occurs. (2) After delivery: After giving birth to the
baby and getting enough rest, the new mother will be asked to fill in questionnaires in
order to evaluate the potential and quality of the prenatal mPHR.

5.2 Recruitment Process


In this experiment, the enrollment of pregnant women and obstetricians/gynecologists
will target local public hospitals in Rabat and Casablanca to directly recruit potential
subjects. Hence, five obstetricians/gynecologists are expected to be enrolled in this
experiment to fulfill the reproductive health needs of the recruited pregnant women,
through the EHR and the scheduled appointments.

Obstetricians/gynecologists will be guided by providing them with a detailed overview of the conduct of the experiment. An informed consent form will be given in advance to the obstetricians/gynecologists to guide the recruitment discussion.
advance to the obstetricians/gynecologists to guide the recruitment discussion.
Therefore, after obtaining their consent to be part of the study, the obstetricians/
gynecologists will be selected for the experiment.
As regards pregnant women, the targeted number of participants is approximately
50. For this purpose, two research assistants will be in charge of the recruitment
process. First, flyers and posters will be distributed over public hospitals in both Rabat
and Casablanca to attract interested subjects, who can, thereafter, contact the research
team for further details about the experiment if needed.
Hence, if they seem willing to participate, they will be given a written informed consent document to carefully read, understand and sign.
Moreover, detailed information about the purpose of the study, the procedures to be
followed, the risks and discomforts as well as potential benefits associated with par-
ticipation will be explained to them. Furthermore, they will be invited to an interview
during which the eligibility criteria will be evaluated. Participants who meet these criteria will then be enrolled in the experiment. Afterwards, a meeting will be held to gather the
registered participants and help them download and install the developed prenatal
mPHR on their smartphones. Moreover, the main features and functionalities will be
briefly presented to the participants, in addition to instructions about the conduct of the
experiment.

6 Conclusion and Future Work

To facilitate access to reproductive healthcare services for pregnant women, a prenatal mPHR and an EHR have been developed, according to specifications based on our previous studies on prenatal health services [8, 11–16] and scheduled visits to the maternity “Les Orangers” in Rabat. The prenatal mPHR is intended to be
used by pregnant women, while the EHR will be used by obstetricians/gynecologists or
their assistants. Hence, an experiment is expected to be conducted among pregnant
women and obstetricians/gynecologists, in order to assess the quality of these solutions
and evaluate their potential as regards improving the reproductive healthcare services
for tracking pregnancy. Moreover, an iOS version of the mPHR is intended to be
developed in order to target a larger number of users.

Acknowledgments. This work was conducted within the research project PEER 7-246 sup-
ported by the US Agency for International Development (USAID). The authors would like to
thank the NAS and USAID for their support.

References
1. Gu, B.D., Yang, J.J., Li, J.Q., Wang, Q., Niu, Y.: Using knowledge management and
mhealth in high-risk pregnancy care: a case for the floating population in China. In:
Proceedings of the IEEE 38th Annual International Computer Software and Applications Conference Workshops (COMPSACW 2014), pp. 678–683 (2014)
2. Oh, S., Sheble, L., Choemprayong, S.: Personal pregnancy health records (PregHeR): facets
to interface design. Proc. Am. Soc. Inf. Sci. Technol. 43(1), 1–10 (2007)
3. Hoang, D.B., et al.: Assistive care loop with electronic maternity records. In: 2008 10th
IEEE International Conference on e-Health Networking, Applications and Services,
pp. 118–123 (2008)
4. Homer, C.S.E., Davis, G.K., Everitt, L.S.: The introduction of a woman-held record into a
hospital antenatal clinic: the bring your own records study. Aust. N. Z. J. Obstet. Gynaecol.
39(1), 54–57 (1999)
5. Shaw, E., et al.: Access to web-based personalized antenatal health records for pregnant
women: a randomized controlled trial. J. Obstet. Gynaecol. Can. 30(1), 38–43 (2008)
6. Bachiri, M., Idri, A., Fernández-Alemán, J.L., Toval, A.: Mobile personal health records for
pregnancy monitoring functionalities: analysis and potential. Comput. Methods Programs
Biomed. 134, 121–135 (2016)
7. Idri, A., Bachiri, M., Fernández-Alemán, J.L.: A framework for evaluating the software
product quality of pregnancy monitoring mobile personal health records. J. Med. Syst. 40(3),
50 (2016)
8. Bachiri, M., Idri, A., Redman, L.M., Fernández-Alemán, J.L., Toval, A.: A requirements
catalog of mobile personal health records for prenatal care. In: Lecture Notes in Computer
Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in
Bioinformatics), vol. 11622 LNCS, pp. 483–495 (2019)
9. Application Fundamentals. https://developer.android.com/guide/components/fundamentals.
Accessed 07 Nov 2019
10. Google Firebase. https://firebase.google.com/. Accessed 06 Nov 2019
11. Sardi, L., Idri, A., Redman, L.M., Alami, H., Bezad, R., Fernández-Alemán, J.L.: Mobile
health applications for postnatal care: review and analysis of functionalities and technical
features. Comput. Methods Programs Biomed. 184, 105114 (2020)
12. Bachiri, M., Idri, A., Abran, A., Redman, L.M., Fernández-Alemán, J.L.: Sizing prenatal
mPHRs using COSMIC measurement method. J. Med. Syst. 43(10), 319 (2019)
13. Bachiri, M., Idri, A., Redman, L.M., Abran, A., de Gea, J.M.C., Fernández-Alemán, J.L.:
COSMIC functional size measurement of mobile personal health records for pregnancy
monitoring. Adv. Intell. Syst. Comput. 932, 24–33 (2019)
14. Idri, A., Bachiri, M., Fernández-Alemán, J.L., Toval, A.: Experiment design of free
pregnancy monitoring mobile personal health records quality evaluation. In: 2016 IEEE 18th
International Conference on e-Health Networking, Applications and Services (Healthcom),
pp. 1–6 (2016)
15. Bachiri, M., Idri, A., Fernández-Alemán, J.L., Toval, A.: Evaluating the privacy policies of
mobile personal health records for pregnancy monitoring. J. Med. Syst. 42(8), 144 (2018)
16. Bachiri, M., Idri, A., Fernández-Alemán, J.L., Toval, A.: A preliminary study on the
evaluation of software product quality of pregnancy monitoring mPHRs. In: Proceedings of
2015 IEEE World Conference on Complex Systems WCCS 2015 (2016)
Machine Learning and Image Processing
for Breast Cancer: A Systematic Map

Hasnae Zerouaoui1, Ali Idri1,2(&), and Khalid El Asnaoui1

1 Complex Systems Engineering and Human Systems, Mohammed VI Polytechnic University, Ben Guerir, Morocco
{Hasnae.zerouaoui,Khalid.elasnaoui}@um6p.ma
2 Software Project Management Research Team, ENSIAS, Mohammed V University in Rabat, Rabat, Morocco
[email protected]

Abstract. Machine Learning (ML) combined with Image Processing (IP) provides a powerful tool to help physicians, doctors and radiologists make more accurate decisions. Breast cancer (BC) is a very common disease among women worldwide, and it is one of the medical subfields experiencing an emergence of the use of ML and IP techniques. This paper explores the use of ML and IP techniques for BC in the form of a systematic mapping study. 530 papers published between 2000 and August 2019 were selected and analyzed according to six criteria: year and publication channel, empirical type, research type, medical task, machine learning objective and datasets used. The results show that classification was the most used ML objective. As for the datasets, most of the articles used private datasets belonging to hospitals, while papers using public data mostly chose MIAS (Mammographic Image Analysis Society), making it the most used public dataset.

Keywords: Breast cancer · Machine learning · Image processing · Systematic mapping study

1 Introduction

One of the most common cancers among women in the world is breast cancer. It occurs when breast cell tissue grows abnormally and starts to divide rapidly [1]. The BC disease is distinguished by the overgrowth of a malignant tumor in the breast [2]. The goal of BC screening is to achieve an early diagnosis, which aims to discern malignant from benign tumors, while prognosis helps to establish a treatment plan. The use of medical image processing and machine learning for breast cancer diagnosis, prognosis and/or treatment is promising, since it can help physicians, doctors and experts detect abnormalities efficiently [3].
To the extent of the authors’ knowledge, no Systematic Mapping Study (SMS) has been carried out to summarize the findings of primary studies dealing with the use of machine learning and image processing techniques for breast cancer medical tasks such as diagnosis, prognosis and treatment. However, Idri et al. [4] carried out an SMS on the use of data mining techniques in BC, and Hosni et al. [5] conducted an SMS

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 44–53, 2020.
https://doi.org/10.1007/978-3-030-45697-9_5

on the use of ensemble techniques in breast cancer. The present SMS searches the primary studies dealing with the application of machine learning and image processing for BC published between 2000 and August 2019 in six libraries: ScienceDirect, IEEEXPLORE, PubMed, Springer, ACM and Google Scholar. It provides a synthesis and a summary of 530 selected papers by means of four Research Questions (RQs): (1) determine the year, publication channels and sources of the selected papers, (2) identify the types of contributions and empirical methods, (3) examine the most used machine learning objective, and (4) discover the datasets employed for ML and IP in BC.
The paper is structured as follows: Sect. 2 describes the research methodology followed by this review. Section 3 reports the results of the four RQs. Section 4 discusses the results obtained. Section 5 concludes this SMS.

2 Research Methodology

The purpose of a systematic mapping study is to offer an overview of a research area by identifying the type and quantity of research in the field and to describe broadly the methodologies and results of primary studies [6]. An SMS involves five steps: defining the research questions, searching for relevant papers, screening the retrieved papers, keywording of abstracts, and data extraction and mapping of results.

2.1 Research Questions


The main goal of this paper is to provide an overview of the studies published from
2000 to August 2019 in the field of machine learning and image processing techniques
applied to breast cancer. Therefore, we identify four research questions and their
motivations as shown in Table 1.

Table 1. Research questions

RQ1: In which year, publication channels and sources were the selected papers related to machine learning and image processing in breast cancer published? (Motivation: identify the publication trends and the different publication channels and sources of the selected papers.)
RQ2: What type of contributions and empirical methods is being made to the area of machine learning and image processing in breast cancer? (Motivation: identify the different types of studies performed in ML and IP applied to BC.)
RQ3: Which is the most investigated machine learning objective? (Motivation: discover the most investigated ML objective in the BC literature.)
RQ4: What are the most used datasets for ML and IP in BC? (Motivation: identify the most used datasets.)

2.2 Search Strategy


To formulate the search string, we used the principal keywords and their synonyms extracted from the research questions. The Boolean AND was used to join the main parts, and the Boolean OR was used to join alternative words. The final search string was defined as follows:
(Breast OR “Mammary gland”) AND (cancer* OR tumor OR malignancy OR
masses) AND (Prognosis OR Predict* OR Diagnosis OR Identification OR Analy-
sis OR monitoring OR treatment) AND (“data mining” OR intelligent OR classificat*
OR cluster* OR associat* OR predict* OR “machine learning” OR “deep learning”)
AND (model* OR algorithm* OR technique* OR rule* OR method* OR tool* OR
framework*) AND (mammogr* OR ultrasound OR thermogra* OR “magnetic reso-
nance imaging” OR tomosynthesis OR tomography OR imag* OR “image processing”
OR “medical images” OR “computer vision”).
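The composition rule above (synonyms joined by OR inside parentheses, concept groups joined by AND) can be reproduced mechanically. The helper below is a hedged sketch of that rule, not the tooling the authors actually used; it is demonstrated on the first two concept groups of the search string.

```javascript
// Build a Boolean search string: OR joins alternatives within a concept
// group, AND joins the groups, as described in the text.
function buildSearchString(groups) {
  return groups
    .map((words) => "(" + words.join(" OR ") + ")")
    .join(" AND ");
}

// First two concept groups of the paper's search string:
const query = buildSearchString([
  ["Breast", '"Mammary gland"'],
  ["cancer*", "tumor", "malignancy", "masses"],
]);
// query === '(Breast OR "Mammary gland") AND (cancer* OR tumor OR malignancy OR masses)'
```

The remaining four groups of the actual string would simply be appended as further arrays.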
We searched for relevant papers in six digital libraries: Science Direct, IEEEXPLORE, PubMed, ACM, Springer and Google Scholar. These libraries offer many candidate papers; furthermore, they index several journals, conferences and books addressing the topic of this study.

2.3 Study Selection


In order to select the relevant papers for our SMS, we identified a set of inclusion and
exclusion criteria (ICs/ECs) combined by the OR Boolean operator. The ICs/ECs we
used are tabulated in Table 2. Three authors evaluated the candidate papers using these
ICs/ECs to decide on including or excluding each paper; in case of a disagreement, a
meeting took place between the three authors to reach a final decision.

Table 2. Inclusion and exclusion criteria

Inclusion criteria:
IC1: Papers using existing or proposing new ML and IP techniques for BC
IC2: Papers presenting an overview of the use of ML and IP techniques in BC
IC3: Papers providing empirical/theoretical comparisons of ML and IP techniques in BC
IC4: Papers published between 2000 and August 2019

Exclusion criteria:
EC1: Papers written in languages other than English
EC2: Papers dealing with other cancer types
EC3: Duplicated papers
EC4: Short papers (only 2–3 pages)
EC5: Presentations or posters
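The screening procedure (discard papers matching any exclusion criterion, then keep those satisfying the inclusion criteria) can be sketched as a two-pass filter. The predicates below are simplified, machine-checkable stand-ins for EC1, EC4 and IC4; the real screening was a manual judgment by the three authors.

```javascript
// Two-pass screening sketch: drop papers matching any EC, then keep
// papers satisfying the ICs. Predicates are illustrative stand-ins.
function screen(papers, exclusionRules, inclusionRules) {
  const afterExclusion = papers.filter(
    (p) => !exclusionRules.some((rule) => rule(p))
  );
  return afterExclusion.filter((p) =>
    inclusionRules.every((rule) => rule(p))
  );
}

const ecs = [
  (p) => p.language !== "English", // EC1: non-English papers
  (p) => p.pages <= 3,             // EC4: short papers (2-3 pages)
];
const ics = [
  (p) => p.year >= 2000 && p.year <= 2019, // IC4: publication window
];

const kept = screen(
  [
    { language: "English", pages: 12, year: 2016 },
    { language: "French",  pages: 12, year: 2016 }, // fails EC1
    { language: "English", pages: 2,  year: 2018 }, // fails EC4
    { language: "English", pages: 10, year: 1998 }, // fails IC4
  ],
  ecs,
  ics
);
// kept contains only the first paper
```

In the actual study the disputed cases were resolved by discussion between the three authors rather than by any automatic rule.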

2.4 Data Extraction Strategy and Synthesis


After selecting the relevant papers, we used a form to extract the relevant data from the selected studies in order to answer the four RQs of Table 1. RQ1 involves the identification of the year of publication, the publication channel (journal, conference, book section) and the publication source. RQ2 involves the identification of the research types [7]: Evaluation Research (ER), Solution Proposal (SP), Experience Papers, Review, Case study, Survey, and Historical-based evaluation. RQ3 involves the identification of the machine learning objectives, such as classification, clustering, prediction and others. RQ4 involves the identification of the datasets employed [4].

2.5 Threats to Validity


The main threats to the validity of this study are presented below.
Study Selection Bias: To choose the relevant papers for this study, we established a search string that contains all the important keywords, in order to cover as many primary studies as possible from the digital libraries used (Science Direct, IEEEXPLORE, PubMed, ACM, Springer and Google Scholar). To avoid excluding relevant papers, the selection criteria were defined to rigorously match the RQs.
Data Extraction Bias: Data extraction is a crucial step in the SMS process; any inaccuracy may lead to incorrect results. Therefore, the extracted data were validated by the three authors. In case of a disagreement, a discussion took place between the three authors to decide whether the paper was relevant for the study.

3 Results

This section presents an overview of the selection process results. Thereafter, we present the results of each RQ.

3.1 Studies Selection


As shown in Fig. 1, 5817 candidate papers were retrieved by applying the search string to the six digital libraries. Applying the exclusion criteria to the titles, keywords and, where necessary, the abstracts of the candidate papers discarded 5028 of them. We then applied the inclusion criteria to the 789 remaining studies and obtained 530 selected studies. The list of these 530 papers, including all the information required to answer the RQs of this SMS, is available upon request by email to the authors of this study.

Fig. 1. Selection process



3.2 RQ1: In Which Year, Publication Channels and Sources Were the Selected Papers Related to Machine Learning and Image Processing in Breast Cancer Published?
Figure 2 shows the number of selected papers published from 2000 to August 2019. 71% of the papers were published in journals, 27% were presented at conferences and only 1% appeared as book chapters. The most frequent journals are Expert Systems with Applications, Computers in Biology and Medicine, Computer Methods and Programs in Biomedicine, IEEE Access and Scientific Reports. The most recurrent conferences are the ACM Symposium on Research in Applied Computation (RACS), the International Symposium on Biomedical Imaging (ISBI), the IEEE International Conference on Big Data (Big Data), the IEEE International Conference on Bioinformatics and Biomedicine (BIBM) and Image Processing. In Fig. 2 we observe that the number of papers published before 2015 was very low compared with the number published from 2015 to 2019.

Fig. 2. Number of papers published per year and publication channel

3.3 RQ2: What Type of Contributions and Empirical Methods is Being Made to the Area of Machine Learning and Image Processing in Breast Cancer?
We identified three main research types in our SMS: Evaluation Research (ER), Solution Proposal (SP) and Review. 59% of the selected papers were SPs, proposing new or improving existing machine learning techniques based on image processing for breast cancer; 66% of these SPs were also evaluated and thus classified as ER studies, while 34% proposed new ML techniques without evaluation. 31% of the papers were classified as ER, comparing or evaluating existing ML techniques. 10% of the selected papers were reviews. The evolution of the research types of the selected papers over the years is presented in Fig. 3. We note that SP and ER appear from 2000 and rise over the years. Reviews also started to attract more interest from 2017.

Fig. 3. Evolution of research types identified over the years
The selected papers were empirically evaluated using three types of empirical
evaluation: case study, historical-based evaluation and survey [8]. As shown in Fig. 4,
most of the solution proposal articles used historical-based evaluation relying on
publicly available databases. For the evaluation research, most of the papers used a
case-study-based empirical evaluation, and the reviews used the survey empirical
method. We note that researchers have started to give more importance to reviews
because of the large number of articles published on the subject and the amount of
information that needs to be summarized, hence the importance of this research type.

Fig. 4. Distribution of research types and empirical types



3.4 RQ3: Which is the Most Investigated Machine Learning Objective?


The aim of RQ3 is to discover the most investigated machine learning objective
combined with image processing in breast cancer. Figure 5 shows the distribution of
the ML objectives. We observe that 89% of the selected papers dealt with
classification, which consists of classifying the tumor as malignant or benign, while
6% treated the prediction objective, 4% dealt with clustering and only 1% with
association.

Fig. 5. Distribution of machine learning objectives

3.5 RQ4: What are the Datasets Used for ML and IP in BC?
The aim of RQ4 is to identify the different datasets, the validation methods and the
performance measures used to evaluate the use of machine learning and image
processing in breast cancer. Table 3 shows the most used datasets in the 530 selected
papers. It can be noticed that 47% of the datasets used are private, MIAS is used by
15% of the selected studies, followed by the Digital Database for Screening
Mammography (DDSM) (13%), Breast Cancer Histopathological (BREAKHIS) (5%),
Breast Cancer Digital Repository (BCDR), WISCONSIN and INBREAST (3% each),
MYTOS (2%) and The Cancer Genome Atlas (TCGA) (1%). The remaining articles
used other databases such as IMAGENET, ICIAR, Camelyon challenge, BUS, IRMA,
and AMIDA.

Table 3. Datasets used

Dataset    No. of papers | Dataset    No. of papers
Private    237           | WISCONSIN  13
MIAS       74            | INBREAST   13
DDSM       64            | MYTOS      11
BREAKHIS   23            | TCGA       6
BCDR       16            | OTHERS     42
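A note on how the percentages quoted for RQ4 relate to Table 3: they are not fractions of the 530 papers. One consistent reading, which is our assumption rather than something the paper states, is that each count is divided by the total number of dataset uses (499, since some papers used several datasets and others none). A quick check:

```python
# Table 3 counts; the shares appear to be computed over all dataset uses
# (assumption: 499 uses, not the 530 papers, since some papers used
# several datasets and others none).
counts = {
    "Private": 237, "MIAS": 74, "DDSM": 64, "BREAKHIS": 23, "BCDR": 16,
    "WISCONSIN": 13, "INBREAST": 13, "MYTOS": 11, "TCGA": 6, "OTHERS": 42,
}
total_uses = sum(counts.values())
shares = {name: round(100 * n / total_uses) for name, n in counts.items()}
print(total_uses)                                         # 499
print(shares["Private"], shares["MIAS"], shares["DDSM"])  # 47 15 13
```

Under this reading, the quoted 47%, 15% and 13% for Private, MIAS and DDSM all fall out exactly.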

4 Discussion

This section discusses the results of the 6 research questions of Table 1.

4.1 RQ1: In Which Year, Publication Channels and Sources Were the Selected
Papers Related to Machine Learning and Image Processing in Breast Cancer
Published?
From Fig. 2, it is noticeable that the number of publications increased significantly in
2016, since ML and IP have become an important issue and are increasingly used by
researchers in the medical field, particularly in breast cancer. This is due to the
effectiveness of ML and IP techniques in improving the performance of medical
decisions. The selected papers were published in three types of channels: journals,
conferences and book chapters. Furthermore, we notice that 71% of the papers were
published in journals, which reflects the importance of the research and its good level
of scientific maturity, since it is in general more difficult to publish in journals than in
conferences and symposiums. As for the sources of publication, no specific source
dominates; different ones were targeted, such as medicine, computer science applied
to medicine, computer science and artificial intelligence, owing to the
multidisciplinarity of the field of ML and IP applied to breast cancer.

4.2 RQ2: What Type of Contributions and Empirical Methods is Being Made
to the Area of Machine Learning and Image Processing in Breast Cancer?
The 530 selected papers can be classified into three types: evaluation research, solution
proposal and review. We notice that most solution proposal papers used evaluation
techniques to measure the performance of the proposed methods. The prevalence of SP
among researchers is due to the fact that the domain of ML and IP for BC still needs
new and more effective solutions to offer better results, and the fact that most of the SP
papers evaluated their techniques indicates good scientific maturity. Papers presenting
a review have gained more interest since 2017, since the number of primary studies has
grown and, with it, the need to synthesize and summarize their findings. Regarding the
empirical type, the evaluation of solution proposals (SP) was in general done using
historical data, since researchers choose to test their newly developed techniques on
publicly available databases; this comes down to the privacy of the data and the
difficulty of collecting data from hospitals. The evaluation research (ER) studies, in
contrast, generally used case studies as their empirical type to test existing techniques
on new databases; note that most of the ER studies were conducted in collaboration
with hospitals.

4.3 RQ3: Which is the Most Used Machine Learning Objective?


Figure 5 shows that 434 articles investigated the classification objective, 32 papers
dealt with prediction (regression), 19 with clustering, and the remaining 5 articles
investigated the association objective [7–11]. The prevalence of classification methods
is explained by the fact that the image processing steps include image preprocessing,
segmentation, feature extraction, feature selection and classification [12].
Classification is therefore an important step in IP for properly classifying medical
images to detect the type of tumor.

4.4 RQ4: What are the Datasets Used for ML and IP in BC?
Table 3 shows that 47% of the selected papers used private datasets collected from
hospitals; this is due to the privacy of medical images and the fact that not all patients
want to share theirs. Researchers are therefore encouraged to collaborate with clinics
and medical centers to collect the images required to evaluate their BC solutions.
Moreover, the most used public datasets are MIAS (28%) and DDSM (25%) for
mammographic images, due to the fact that mammographic images are still the most
used for BC diagnosis [13–15]; Breakhis (9%) for histopathological images; and
Wisconsin (5%), Inbreast (5%) and BCDR (6%) for other medical imaging types. We
note that some studies used several datasets to compare their results [10, 16–19].
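The MIAS and DDSM figures quoted here (28% and 25%) differ from those in Sect. 3.5 (15% and 13%). They are approximately consistent with taking only the public dataset uses from Table 3 as the base (499 − 237 = 262) — our reading, not something the paper states explicitly:

```python
# Public dataset counts from Table 3; the assumed base is the 262 public
# dataset uses, i.e. the 499 total uses minus the 237 private ones.
public = {"MIAS": 74, "DDSM": 64, "BREAKHIS": 23, "BCDR": 16,
          "WISCONSIN": 13, "INBREAST": 13, "MYTOS": 11, "TCGA": 6,
          "OTHERS": 42}
base = sum(public.values())
print(base)                                      # 262
print(f"{100 * public['MIAS'] / base:.1f}")      # 28.2 (quoted as 28%)
print(f"{100 * public['DDSM'] / base:.1f}")      # 24.4 (quoted as 25%)
print(f"{100 * public['BREAKHIS'] / base:.1f}")  # 8.8  (quoted as 9%)
```

The remaining quoted shares (Wisconsin 5%, Inbreast 5%, BCDR 6%) also match this base to within rounding.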

5 Conclusion and Future Work

The purpose of this SMS was to present an overview of the use of ML and IP in breast
cancer. 530 papers published from 2000 to August 2019 were selected and classified
according to: year and source of publication, research type and empirical type, BC
discipline, ML methods and techniques, and validation techniques and performance
measures. This paper discussed the results of the six RQs. The findings per RQ are:
(RQ1) The use of ML and IP for BC has gained increasing interest from researchers in
recent years; the number of published articles has increased significantly since 2015,
and the majority of the papers (71%) were published in journals. (RQ2) Most of the
relevant papers were identified as solution proposals and evaluation research, and the
majority of the articles used historical-based evaluation. (RQ3) Classification is the
most investigated objective in ML and IP for BC, which is explained by the fact that
classification is a component of any IP process. (RQ4) Private datasets are the most
frequently used to evaluate ML and IP for BC, followed by the two public datasets
MIAS and DDSM.
As future work we aim to: (1) use the results of this SMS as the basis for a
systematic literature review concerning the use of ML and IP in breast cancer, and
(2) conduct an evaluation research study using case study data collected from a
Moroccan hospital to investigate the performance of different ML and IP techniques.

References
1. Metelko, Z., et al.: The World Health Organization quality of life assessment. 41(10)
(1995)
2. Bish, A., Ramirez, A., Burgess, C., Hunter, M.: Understanding why women delay in seeking
help for breast cancer symptoms. J. Psychosom. Res. 58, 321–326 (2005)

3. Zhang, G., Wang, W., Moon, J., Pack, J.K., Jeon, S.I.: A review of breast tissue
classification in mammograms. In: Proceedings of the 2011 ACM Research in Applied
Computation Symposium, RACS 2011, pp. 232–237 (2011)
4. Idri, A., Chlioui, I., El Ouassif, B.: A systematic map of data analytics in breast cancer. In:
ACM International Conference. Proceeding Series (2018)
5. Hosni, M., Abnane, I., Idri, A., Carrillo de Gea, J.M., Fernández Alemán, J.L.: Reviewing
ensemble classification methods in breast cancer. Comput. Methods Programs Biomed. 177,
89–112 (2019)
6. Kofod-Petersen, A.: How to do a structured literature review in computer science.
ResearchGate, pp. 1–7 (2014)
7. Kitchenham, B., Pearl Brereton, O., Budgen, D., Turner, M., Bailey, J., Linkman, S.:
Systematic literature reviews in software engineering - a systematic literature review. Inf.
Softw. Technol. 51(1), 7–15 (2009)
8. Tonella, P., Torchiano, M., Du Bois, B., Systä, T.: Empirical studies in reverse engineering:
state of the art and future trends. Empir. Softw. Eng. 12(5), 551–571 (2007)
9. Rampun, A., Wang, H., Scotney, B., Morrow, P., Zwiggelaar, R.: School of Computing,
Ulster University, Coleraine, Northern Ireland, UK; Department of Computer Science,
Aberystwyth University, UK. In: 2018 25th IEEE International Conference on Image
Processing (ICIP), pp. 2072–2076 (2018)
10. Agarap, A.F.M.: On breast cancer detection: an application of machine learning algorithms
on the Wisconsin diagnostic dataset. In: ACM International Conference. Proceeding Series,
no. 1, pp. 5–9 (2018)
11. Xiong, X., Kim, Y., Baek, Y., Rhee, D.W., Kim, S.H.: Analysis of breast cancer using data
mining & statistical techniques. In: Proceedings of the Sixth International Conference on
Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Comput-
ing and First ACIS International Workshop on Self-assembling Wireless Network,
SNPD/SAWN 2005, vol. 2005, pp. 82–87 (2005)
12. Sadoughi, F., Kazemy, Z., Hamedan, F., Owji, L., Rahmanikatigari, M., Azadboni, T.T.:
Artificial intelligence methods for the diagnosis of breast cancer by image processing: a
review. Breast Cancer Targets Ther. 10, 219–230 (2018)
13. Wei, X., Ma, Y., Wang, R.: A new mammography lesion classification method based on
convolutional neural network. In: ACM International Conference. Proceeding Series,
pp. 39–43 (2019)
14. Ting, F.F., Tan, Y.J., Sim, K.S.: Convolutional neural network improvement for breast
cancer classification. Expert Syst. Appl. 120, 103–115 (2019)
15. Torrents-Barrena, J., Puig, D., Melendez, J., Valls, A.: Computer-aided diagnosis of breast
cancer via Gabor wavelet bank and binary-class SVM in mammographic images.
J. Exp. Theor. Artif. Intell. 28(1–2), 295–311 (2016)
16. Hu, Z., Tang, J., Wang, Z., Zhang, K., Zhang, L., Sun, Q.: Deep learning for image-based
cancer detection and diagnosis – a survey. Pattern Recogn. 83, 134–149 (2018)
17. Mini, M.G.: Neural network based classification of digitized mammograms. In: Proceedings
of the 2nd Kuwait Conference on e-Services e-Systems, KCESS 2011, pp. 1–5 (2011)
18. Hamidinekoo, A., Dagdia, Z.C., Suhail, Z., Zwiggelaar, R.: Distributed rough set based
feature selection approach to analyse deep and hand-crafted features for mammography mass
classification. In: Proceedings of the 2018 IEEE International Conference on Big Data, Big
Data 2018, pp. 2423–2432 (2019)
19. Mendel, K., Li, H., Sheth, D., Giger, M.: Transfer learning from convolutional neural
networks for computer-aided diagnosis: a comparison of digital breast tomosynthesis and
full-field digital mammography. Acad. Radiol. 26(6), 735–743 (2019)
A Definition of a Coaching Plan to Guide
Patients with Chronic Obstructive Respiratory
Diseases

Diogo Martinho, Ana Vieira, João Carneiro,
Constantino Martins, Ana Almeida, and Goreti Marreiros

Research Group on Intelligent Engineering and Computing for Advanced
Innovation and Development (GECAD), Institute of Engineering,
Polytechnic of Porto, Porto, Portugal
{diepm,aavir,jomrc,acm,amn,mgt}@isep.ipp.pt

Abstract. With the noticeable increase in the number of people with chronic
obstructive respiratory diseases, the effectiveness of traditional healthcare
systems has worsened significantly over the last years. There is an opportunity to
develop low-cost and personalized solutions that can empower patients to self-
manage and self-monitor their health condition. In this context, the PHE project
is presented, whose main goal is to develop coaching solutions for the remote
monitoring of patients, provided through the exclusive use of the smartphone. In
this work we explore how patients with chronic obstructive respiratory diseases
can adopt healthier behaviors by following personalized healthcare coaching
plans throughout their daily lives. We explain how a coaching plan can be
defined to guide the patient and explore the mechanisms necessary for it to
operate automatically and adapt itself according to the interactions between the
patient and the system. As a result, we believe it possible to enhance user
experience and engagement with the developed system and consequently
improve the patient's health condition.

Keywords: CORD · mHealth · Personal healthcare · Self-monitoring

1 Introduction

Chronic obstructive respiratory diseases (CORD) affect a large percentage of the
world's population and are already the third leading cause of death in the world [1, 2].
Furthermore, it is estimated that in Europe alone the cost of respiratory diseases
exceeds €380 billion [3]. Moreover, CORDs are progressive and worsen over time.
This means that there is a high prevalence of CORD throughout a person's life cycle
(asthma starts at an early age, while other chronic obstructive pulmonary diseases are
detected from middle age onwards). The progressive deterioration of CORDs often
leads to frequent exacerbations, which in turn result in frequent hospital admissions.
As such, patients require regular medical consultations and constant monitoring of
their health throughout their daily lives. Health care in the context of CORD
management has traditionally been provided through either face-to-face interventions
between the
has traditionally been provided through either face-to-face interventions between the

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 54–64, 2020.
https://doi.org/10.1007/978-3-030-45697-9_6

patient and the healthcare professional, separated by periods without structured support
or by the use of self-monitoring tools (such as flow meters, handheld spirometers,
oximeters) and self-management tools (such as symptom diaries, manuals, pamphlets
and web resources) between consultations. The reality, however, is that the constant
monitoring of patients' condition has become a burden on healthcare providers [4],
and traditional healthcare delivered through health professionals' face-to-face
interactions has become more difficult to achieve. As such, the necessity to develop
cost-effective
solutions to monitor and treat patients with CORD has increased significantly in recent
years [5]. In this scope, concepts such as mobile health (mHealth) have emerged
towards the self-management of the patient’s disease, by providing mobile systems that
are capable of monitoring patients’ health status and giving customized feedback about
activities and behaviors that can be done to improve health and wellbeing [6, 7].
Furthermore, mobile devices now offer a wide set of features and embedded sensors
and the development of solutions that can exploit these components without or with
minimal access to external devices other than the smartphone itself seem to be adequate
and easy to integrate in the daily lives of patients to measure and monitor patients’
current health condition and support them in the management of their diseases [8].
Therefore, coaching solutions delivered through smartphones (mCoaching) that can
combine data gathering and processing, gamification elements for user engagement and
support for behavior change seem to be an ideal platform to deliver both simple and
effective self-management interventions, while maintaining or improving quality of
care and reducing costs, especially in the context of CORD management [9–12].
The work proposed here is part of the PHE project1, which aims to empower people
to monitor and improve their health using personal data and technology assisted
coaching. To achieve this goal, PHE will apply innovative and intelligent measuring
and monitoring tools for preventive healthcare and allow cost-saving and self and
home-care solutions with increased patient involvement. Furthermore, PHE project will
exclusively use the smartphone and its embedded sensors to acquire all the necessary
data to provide personalized support to the CORD patient. In this work we explore the
personalization given to the CORD patient by providing him/her a coaching plan to
follow and to adopt healthier behaviors throughout his/her daily life. A conceptual
definition of the coaching plan is presented which includes four different phases of
operation (initialization, execution, completion and post completion). We describe each
of these phases and explain how the coaching plan can enhance the personalized
healthcare provided to the patient, and we define a proactive mechanism which is not
completely dependent on user input but is also capable of adapting itself based on the
data collected over time while the patient uses and interacts with the PHE system.

1 https://itea3.org/project/personal-health-empowerment.html.
56 D. Martinho et al.

2 Proposed Model

The work proposed here extends [13], in which an architecture for the coaching
module to support self-monitoring of CORD patients was defined. This coaching
module is responsible for processing patient data and generating recommendations to
improve the patient's health condition accordingly. Furthermore, and as will be
explained, the proposed model can operate independently from the PHE system due to
its generic structure. Three main types of users interact with the PHE system: the
patient, the healthcare professional and the health manager. The patient is the main
user and will interact with the developed system by inserting clinical information and
receiving recommendations to adopt healthier behaviors and improve health condition
and wellbeing. The healthcare professional can access patient clinical information and
provide specific guidelines (through coaching plans). The health manager can access
and update the available domain knowledge (which includes rules and associated
variables, recommendations, user profiles and non-specific coaching plans).
In this section, we first describe the architecture of the coaching module
considered for CORD management in the PHE system and its associated components.
The Coaching Plans component is then discussed in more detail, as it represents the
novel feature proposed in this work.

2.1 PHE Coaching Module


According to Fig. 1, three main layers have been identified for the considered
architecture: Service Layer, Business Layer and Data Access Layer. Within the Service
Layer, a Web API has been developed to provide a set of services that can be accessed
internally within the PHE system, but also externally by other systems. The Business
Layer includes four main components which, combined, allow the definition and
modeling of knowledge regarding a certain domain. The Rules component specifies
the set of conditions associated with the patient's clinical data that are necessary to
identify possible recommendations to send to the patient. These conditions require the
validation of different health variables, which in the case of this work correspond to
both patient demographic data and health state (for example, gender and weight,
smoke exposure, etc.).

Fig. 1. Coaching module architecture for self-monitoring of CORD

Besides that, each health variable has an associated periodicity, and to
measure/collect its current value, a mechanism also has to be defined to promote a
specific interaction between the patient and the system (for example, to know whether
an exacerbation was detected within the last week, the associated health variable has
to be updated weekly using an interaction mechanism such as a visual notification).
The definition of each rule and its corresponding recommendation is structured in a
clinical matrix format and is based on scientific evidence. Figure 2 shows an example
of a rule defined for a recommendation to send to the patient.

Fig. 2. Rule example for CORD
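Figure 2 itself is not reproduced in text form here. Purely as an illustrative sketch — the variable names and recommendation text are hypothetical, and in the actual module such rules are written in the JBoss Drools rule language rather than Python — a rule of this shape pairs conditions on health variables with a recommendation:

```python
from typing import Optional

def smoking_exacerbation_rule(patient: dict) -> Optional[str]:
    """Hypothetical rule: fire a recommendation only when every condition
    on the patient's health variables holds."""
    if patient.get("smoker") and patient.get("exacerbation_last_week"):
        return ("Try to reduce the number of cigarettes smoked per day; "
                "smoking increases the risk of further exacerbations.")
    return None  # some condition failed: no recommendation is generated

print(smoking_exacerbation_rule(
    {"smoker": True, "exacerbation_last_week": True}) is not None)  # True
print(smoking_exacerbation_rule({"smoker": False}))                 # None
```

The clinical matrix mentioned above would then be the collection of such condition sets, each row mapping a combination of validated health variables to one recommendation.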

User Profiles specifies all the characteristics that can identify a certain profile, which
is assigned to the patient. So far, two main groups have been identified (Asthma and
Rhinitis). The remaining groups will be defined using clustering techniques to identify
users sharing characteristics related to demographic data, context, etc. The Coaching
Plans component includes the selected recommendations to be provided to the patient
in a given time frame. Furthermore, each coaching plan is related to a specific health
topic identified according to the literature on clinical evidence and medical guidelines.
This component is discussed in the following subsection. The Recommendations
component verifies and processes the received data according to the defined Rules,
User Profiles and Coaching Plans and selects suitable recommendations. The Data
Access Layer serves as a middle layer between the Business Layer and the different
data sources and controls all read, insert, update and delete operations on the database.
The database contains information regarding the patient's clinical data, the health
variables associated with recommendations, and the history of provided
recommendations and respective feedback. It also contains the knowledge provided by
health professionals, the rules used for the generation of recommendations, the
defined user profiles and their characteristics, and the coaching plans provided to each
patient. The coaching module has been developed using the JBoss Drools framework,
as it provides an intuitive rule language for non-developers, supports flexible and
adaptive processes, enhances intelligent process automation and complex event
processing, and is easy to integrate with web services.

2.2 Coaching Plan


According to the literature [1], and in the context of CORD, coaching plans refer to
different topics related to the management of the disease. In this work, we have
identified 15 minor and 5 major topics through the study of current medical guidelines
and clinical evidence to drive individualized coaching. For that we considered the
results published by the American College of Sports Medicine, the American College
of Rheumatology, Allergic Rhinitis and its Impact on Asthma, the British Thoracic
Society, the Association of Chartered Physiotherapists in Respiratory Care, the
Australian and New Zealand guidelines for the management of chronic obstructive
pulmonary disease, the Global Initiative for Asthma, the Global Initiative for Chronic
Obstructive Lung Disease, the Royal Dutch Society for Physical Therapy, the National
Asthma Education and Prevention Program, the National Institute for Health and Care
Excellence, the Direção-Geral da Saúde Norma sobre Diagnóstico e Tratamento da
Doença Pulmonar Obstrutiva Crónica, the Portuguese Ministry of Health and the U.S.
Department of Health and Human Services. Each considered topic is presented in
Table 1.

Table 1. Topics for CORD management

Major topic                     Minor topic(s)
Chronic Respiratory Diseases    Symptoms
Concomitant Diseases            Respiratory Infections; Sleep Disorders; Rhinitis;
                                Food Allergy
Exposition to External Agents   Smoking Habits; Occupational Hazards; Allergens
Non-pharmacological Therapies   Physical Activity and Exercise; Breathing Exercises
                                and Airway Clearance Techniques
Pharmacological Therapies       Adherence and Inhaler Techniques; Devices and
                                Active Principles; Vaccinations
Other                           Anxiety; Depression; Stress; Nutrition

Four steps have been identified to define a coaching plan in the context of the PHE
system: Plan Initialization, Plan Execution, Plan Completion and Plan Post Completion.

Fig. 3. Manual coaching plan and goal definition

Plan Initialization. The coaching plan initialization is a process that can be configured
manually or automatically. Manual coaching plans are defined either by the healthcare
professional or by the health manager, and differ in that they can target a specific
patient (coaching plans created by the healthcare professional) or not (coaching plans
created by the health manager). Automatic coaching plans are created by the patient
himself/herself and are based on the coaching plans defined for the associated user
profile.
As can be seen in Fig. 3, a coaching plan has an associated periodicity, which can
be weekly, monthly or non-repetitive. The user must then select the topics and the
goals intended to be achieved with the coaching plan. We define a goal as a desired
state of a specific patient-related variable under a certain topic. For example, in the
context of smoking habits, one objective could be to decrease the number of cigarettes
smoked per day. Furthermore, to achieve a certain goal, a list of intermediate goals can
also be defined. Following the given example, intermediate goals that would allow the
patient to decrease the number of cigarettes smoked per day could be to start the
coaching plan by smoking a maximum of 3 cigarettes in the morning, 3 cigarettes in
the afternoon and finally 3 cigarettes in the evening/night. This means that when
defining a goal and its associated intermediate goals, the user should also define a
deadline for achieving each identified goal. The flowchart presented in Fig. 4 shows
the coaching plan initialization process described above.

Fig. 4. Manual coaching plan initialization flowchart
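The paper does not formalize these notions in code; as a minimal sketch, with hypothetical field names and made-up dates, the elements introduced above (periodicity, topic, a goal over a patient-related variable, intermediate goals, deadlines) could be structured as:

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Goal:
    variable: str        # patient-related variable, e.g. "cigarettes_per_day"
    target: float        # desired state for that variable
    deadline: date       # date by which the goal should be achieved
    achieved: bool = False

@dataclass
class CoachingPlan:
    topic: str           # e.g. "Smoking Habits"
    periodicity: str     # "weekly", "monthly" or "non-repetitive"
    goal: Goal           # final goal of the plan
    intermediate_goals: List[Goal] = field(default_factory=list)

    def is_complete(self) -> bool:
        # Per the Plan Completion step, completion depends on the final
        # goal only, not on the intermediate goals.
        return self.goal.achieved

# The smoking example from the text: three intermediate limits of 3
# cigarettes (morning, afternoon, evening); all dates are made up.
plan = CoachingPlan(
    topic="Smoking Habits", periodicity="weekly",
    goal=Goal("cigarettes_per_day", 9, date(2020, 6, 1)),
    intermediate_goals=[Goal("cigarettes_morning", 3, date(2020, 5, 1)),
                        Goal("cigarettes_afternoon", 3, date(2020, 5, 15)),
                        Goal("cigarettes_evening", 3, date(2020, 5, 30))])
print(plan.is_complete())  # False
```

This is only one possible representation; the PHE system may store plans quite differently.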

Plan Execution. In the second step, the coaching plan is put into practice and the
targeted patient is monitored according to the identified goals. All patient-related
variables considered for the coaching plan are collected through the patient's
interaction with the PHE system, by having the patient insert new records and values
for those variables. The coaching framework processes those values and, whenever a
recommendation is verified (i.e., those values trigger all the conditions necessary to
activate it), sends it to the smartphone and presents it to the patient in different formats
(such as an alert or a notification). In parallel, the coaching framework also verifies
whether any goal established for the coaching plan has been achieved and updates the
coaching plan accordingly. The flowchart presented in Fig. 5 shows the coaching plan
execution process described above.

Fig. 5. Coaching plan evaluation flowchart

The ideal time to provide recommendations to the patient will depend on the feedback
provided while using the developed system. Several feedback mechanisms are defined
to identify the best moments during the day to provide recommendations to the patient
and to filter positive recommendations among all those available:
• Recommendation Evaluation – Whenever a detected recommendation is provided
to the patient, he/she can rate whether he/she liked or disliked it. This way,
unwanted recommendations can be filtered out in future similar scenarios.
• Goal Evaluation – Whenever patient data is inserted that can modify the current
state of a defined goal, the goal is evaluated to understand whether the patient was
capable of achieving the desired state configured in the coaching plan, or whether
the state associated with an already achieved goal has deteriorated into a previous
state.
• Patient and System Interaction Evaluation – Different data can be obtained from the
interaction between the patient and the system. In this case, both the system
utilization rate (which corresponds to utilization times and frequency of use of the
system) and the response time (whether the patient answered a provided
recommendation or not, and how quickly) are considered. This information can
then be used to readjust deadlines and understand the most adequate times during
the day to interact with the user.
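As a rough sketch of how the first and third mechanisms could feed back into recommendation delivery (the rating store and the hour-of-day heuristic below are our assumptions, not the paper's design):

```python
from collections import Counter
from typing import Dict, List

def filter_recommendations(candidates: List[str],
                           ratings: Dict[str, str]) -> List[str]:
    """Drop recommendations the patient previously disliked (mechanism 1)."""
    return [r for r in candidates if ratings.get(r) != "disliked"]

def best_hour(interaction_hours: List[int]) -> int:
    """Pick the hour of day with the most past interactions (mechanism 3)."""
    return Counter(interaction_hours).most_common(1)[0][0]

recs = filter_recommendations(["walk 30 minutes", "breathing exercise"],
                              {"breathing exercise": "disliked"})
print(recs)                             # ['walk 30 minutes']
print(best_hour([9, 20, 20, 13, 20]))  # 20
```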

In this way it will be possible to avoid unnecessary and repetitive interactions which
may tire the patient and only increase his/her disinterest in keeping using the
developed system. All the previous feedback mechanisms are considered in the
adaptive goal-setting procedure that is executed automatically every day to evaluate
and readjust goals based on the user's performance for that day. For this, we have
taken into account the model proposed by Akker and colleagues in [14], where they
defined an automated personalized goal-setting feature in the context of physical
activity coaching, determining the goal line for an upcoming day by combining either
stored data from that day of the week or default parameters defined by the healthcare
professional with the newly acquired data. We have considered a similar process,
which updates the coaching plan goals automatically every single day by comparing
the data acquired on that day with the historical data (or with the default parameters in
case no data was provided by the user until then) for the same day. We consider both
the goal completion rate and the average goal difficulty as performance measures to
identify whether the user improved or worsened, and depending on the difference
between both values the goals for the upcoming days are updated accordingly. After
that, we consider the data obtained from patient and system interaction to determine
whether an established deadline for achieving a certain goal should also be adapted,
depending on the average utilization rate and response time obtained.
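The daily adjustment just described — compare the day's goal completion rate and average goal difficulty against the historical baseline, then tighten or relax the goals — might be sketched as follows. The 10% step and the convention that lower targets are harder (true for, e.g., cigarettes per day) are illustrative assumptions, not taken from [14] or from the PHE design:

```python
def adjust_goals(targets, completion_rate, avg_difficulty,
                 hist_completion_rate, hist_avg_difficulty, step=0.10):
    """Daily adaptive goal setting: tighten goals when today's completion
    rate beats the historical baseline at comparable difficulty, relax them
    when it falls short. Targets are assumed to be upper bounds the patient
    should stay below (e.g. cigarettes per day), so lower means harder."""
    improved = (completion_rate > hist_completion_rate
                and avg_difficulty >= hist_avg_difficulty)
    worsened = completion_rate < hist_completion_rate
    if improved:
        return [t * (1 - step) for t in targets]  # 10% more demanding
    if worsened:
        return [t * (1 + step) for t in targets]  # 10% more lenient
    return list(targets)                          # keep the targets as-is

# A "cigarettes per day" target of 10 tightens to 9.0 after a good day.
print(adjust_goals([10], 0.9, 3, 0.7, 3))  # [9.0]
```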
Plan Completion. The third step is the completion of the defined coaching plan. As
explained above, a plan is completed when the defined goals (excluding all the
intermediate goals) have been achieved. The patient is then provided with a report
containing all the information on his/her performance while executing the coaching
plan, which includes the total number of goals achieved (including all the intermediate
goals) and other metrics such as the time needed to achieve those goals, the number of
deteriorations verified, the number of recommendations generated while following the
coaching plan, and the number of approved and disapproved recommendations,
among others.
Plan Post Completion. The last step verifies the achieved results after the plan has been completed. Whenever the patient provides more clinical data after completing a coaching plan, that information is checked again to understand whether the patient's health condition has deteriorated and whether any achieved result has been compromised (for example, if the patient successfully completed a smoking cessation coaching plan and then started smoking again). If so, the healthcare professional is notified so that he/she can set a new coaching plan for that patient.
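A minimal sketch of this post-completion check (the goal-to-metric mapping and the threshold convention are assumptions for illustration, not the PHE data model):

```python
def check_post_completion(plan, new_measurements, notify):
    """Flag achieved goals compromised by newly provided clinical data.

    `plan` maps each achieved goal to the metric it constrains and the
    worst acceptable value, e.g.
    {"smoking_cessation": ("cigarettes_per_day", 0)}.
    `notify` is called with the compromised goals so the healthcare
    professional can set a new coaching plan.
    """
    compromised = []
    for goal, (metric, worst_ok) in plan.items():
        value = new_measurements.get(metric)
        if value is not None and value > worst_ok:
            compromised.append(goal)
    if compromised:
        notify(compromised)
    return compromised
```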

3 Conclusions and Future Work

The increasing number of people suffering from CORD has led to an overload of healthcare resources to monitor and support patients in the management of their disease. Traditional methods of aiding these patients are no longer cost-effective or adequate, especially as new treatments combining technological developments become more relevant and allow patients to better self-monitor and self-manage their health condition. In this context, mobile coaching technologies can exploit the different features and embedded sensors available on the smartphone and are now being considered as an alternative option to directly monitor patients with CORD. The solution proposed by the PHE system brings further advantages by providing a healthcare solution that does not require any external device other than the smartphone itself, and that is therefore friendlier and more cost-appealing to the patient and can be easily integrated into his/her daily life.
In this work we have presented the overall architecture of the coaching module integrated in the PHE system, which comprises, among other components, a coaching plan used to guide patients with CORD towards better and healthier behaviors. We have provided a conceptual definition of the different phases necessary for this component to operate correctly and explained how it can automatically adapt itself to the user's preferences and interactions with the PHE system.
As future work we intend to integrate the defined coaching plan in the developed prototype of the PHE system and study its effectiveness and usability in a real-case scenario. After that, as we collect more data from the interactions between the patient and the PHE system, we will be able to apply more intelligent mechanisms (predictive analytics) to enhance the interactions and recommendations provided to the user and to predict whether a certain interaction or recommendation is adequate at a given moment in time.

Acknowledgments. The work presented in this paper has been developed under the EUREKA-ITEA3 Project PHE (PHE-16040), and by National Funds through FCT (Fundação para a Ciência e a Tecnologia) under the projects UID/EEA/00760/2019 and UID/CEC/00319/2019, and by NORTE-01-0247-FEDER-033275 (AIRDOC - "Aplicação móvel Inteligente para suporte individualizado e monitorização da função e sons Respiratórios de Doentes Obstrutivos Crónicos") by NORTE 2020 (Programa Operacional Regional do Norte).

Reviewing Data Analytics Techniques in Breast
Cancer Treatment

Mahmoud Ezzat¹ and Ali Idri¹,²

¹ Complex Systems Engineering and Human Systems, Mohammed VI Polytechnic University, Benguerir, Morocco
[email protected]
² Software Project Management Research Team, ENSIAS, Mohammed V University in Rabat, Rabat, Morocco
[email protected]

Abstract. Data mining (DM), or data analytics, is the process of extracting new valuable information from large quantities of data; it is reshaping many industries, including medicine. Its contribution to medicine is particularly important in oncology. Breast cancer is the most common type of cancer in the world; it occurs almost entirely in women, but men can develop it too. Researchers around the world are working to improve the prevention, detection and treatment of breast cancer (BC) in order to provide more effective treatments to patients. In this vein, the present paper carried out a systematic map of the use of data mining techniques in breast cancer treatment. The aim was to analyse and synthesize studies on DM applied to breast cancer treatment. In this regard, 44 relevant articles published between 1991 and 2019 were selected and classified according to three criteria: year and channel of publication, research type through DM contribution in BC treatment, and DM techniques. Few articles address treatment, as researchers have mostly focused on diagnosis with the different classification techniques, perhaps because of the importance of early diagnosis in avoiding danger. Results show that papers were published in different channels (especially journals or conferences), that researchers follow the DM pipeline to deal with BC treatment, that the challenge is to reduce the number of non-classified patients and assign them to the most appropriate group to follow the suitable treatment, and that classification was the most used DM task applied to BC treatment.

Keywords: Data mining · Knowledge data discovery · Breast cancer treatment · Medical informatics

1 Introduction

Breast cancer is the most common cancer and cause of death among women every year [1]; it often causes confusion as to the adequate treatment to be adopted in different cases. The field of BC treatment using DM techniques has made important progress, and researchers have become more interested in the topic, since medical decision-makers need to be supported by DM techniques. The occurrence of BC is increasing every day [2], and researchers should be aware of that, so studies in this sense,

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 65–75, 2020.
https://doi.org/10.1007/978-3-030-45697-9_7
especially in the treatment task, should be richer. As the treatments for BC improve [3], patients can live longer even with the most advanced BC. Nowadays, since it is possible to access BC medical data, and given the powerful DM techniques supporting decision-making in treatment, we could establish a strong strategy to deal with BC treatment using precision medicine and/or DM tools to draw the roadmap through a robust decision-making framework [3].
The most common types of BC treatment are surgery, radiotherapy, chemotherapy, hormone therapy and biological therapy [4, 5]. The appeal of DM in the medical field is increasing, in particular in BC [4], because DM provides a variety of techniques and tools for dealing with complex problems [6]. In fact, DM can be defined as the process of browsing data to extract useful knowledge. DM takes two forms: machine learning using artificial intelligence techniques, or statistical-based techniques. BC treatment has benefited from the variety of DM objectives (classification, regression, clustering and association) to provide useful solutions to oncologists [6]. However, to the best of the authors' knowledge, no systematic mapping study (SMS) has been carried out to synthesize and summarize the findings of primary studies dealing with the use of DM techniques for BC treatment, which motivates the present study. Thus, the present study conducts a systematic map of primary studies published in SpringerLink, PubMed, ACM, Google Scholar, Science Direct and IEEExplore between 1991 and 2019. A set of 44 papers was selected, synthesized, and classified according to year and channel of publication, research type, and DM techniques used.
The paper is composed of five sections. Section 2 shows the methodology followed to carry out the present SMS. Section 3 presents the results of the research questions. Section 4 discusses the results. Finally, conclusions and future work are presented in Sect. 5.

2 Research Methodology

The goal of an SMS is to build a classification scheme to structure a field of interest [4]. Whilst an SMS takes a horizontal approach to the published studies, a Systematic Literature Review (SLR) discusses and analyses the processes and outcomes of previous works vertically. The SMS process can be summed up in five steps: defining the research questions, conducting a search, screening the papers, assigning keywords to each paper using its abstract, and extracting data and mapping results.

2.1 Research Questions and Search String


The aim of this SMS is to establish a broad picture of the published studies on the use of DM techniques to deal with BC treatment. For a broad perspective of the topic, we translated the global goal into three research questions, shown in Table 1 with their rationales.
After defining the research questions, a search string is needed to find the most relevant papers for the analysis. Therefore, we targeted candidate papers by searching six digital libraries: IEEExplore, PubMed, SpringerLink, Science Direct, ACM and Google Scholar. We elaborated a search string to refine the search by gathering terms and synonyms that figure in the RQs: alternatives were linked with OR, and the main terms with AND. The resulting search string was: (Breast OR "Mammary gland" OR "Chemotherapy" OR "mammography") AND (cancer* OR tumor OR malignancy OR masses) AND (treatment OR cure OR medication OR Prognosis) OR (Identification OR Analysis OR monitoring) AND (data mining* OR machine learning* OR analytics* OR categorization* OR intelligent OR classificat* OR cluster* OR associat* OR predict*) AND (model* OR algorithm* OR technique* OR rule* OR method* OR tool* OR framework* OR recommend).

Table 1. Research questions

RQ1 — What are the publication sources, and in which years were the selected studies related to data-mining application for breast cancer treatment published?
Rationale: To indicate whether there are specific publication channels and when effort regarding this research area was made.

RQ2 — How far has data mining contributed to making decisions on breast cancer treatment?
Rationale: To discover the type of contribution of DM to the field of BC treatment.

RQ3 — What are the most common DM techniques and methods to deal with BC treatment?
Rationale: To identify the most common DM techniques investigated in BC treatment.
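The construction of such a string from synonym groups can be sketched as follows (the grouping shown is a simplified subset of the full string above, and the quoting convention is illustrative):

```python
def build_search_string(term_groups):
    """Join synonym groups into a boolean query: synonyms within a group
    are ORed, the groups themselves are ANDed, and multi-word terms are
    quoted so the libraries treat them as phrases."""
    quoted = [
        "(" + " OR ".join(f'"{t}"' if " " in t else t for t in group) + ")"
        for group in term_groups
    ]
    return " AND ".join(quoted)

query = build_search_string([
    ["Breast", "Mammary gland", "Chemotherapy", "mammography"],
    ["cancer*", "tumor", "malignancy", "masses"],
    ["treatment", "cure", "medication", "Prognosis"],
    ["data mining*", "machine learning*", "classificat*", "predict*"],
])
```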

2.2 Study Selection


The selection process allows us to filter the most relevant papers for our SMS based on their titles, abstracts and keywords. In this context, we evaluated candidate papers against several inclusion and exclusion criteria (IC/EC) to select the relevant ones to answer the RQs. Note that we applied OR for the ICs and AND for the ECs. The ICs and ECs used in this study are:
• IC1: Papers proposing new or using existing machine learning techniques for BC
treatment.
• IC2: Papers presenting an overview on the use of machine learning techniques for
BC treatment.
• IC3: Papers providing empirical/theoretical comparisons of machine learning
techniques for BC treatment.
• IC4: Papers published between 1991 and February 2019.
• EC1: Papers written in other languages than English.
• EC2: Papers dealing with other types of cancer than BC.
• EC3: Papers dealing with BC diagnosis, screening or prognosis.
• EC4: Duplicated papers.
• EC5: Short papers (with only 2–3 pages).
• EC6: Presentations or posters.
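Under one common reading of this rule (retain a paper when at least one IC holds and no EC does), the filter can be sketched as follows; the record fields and the sample criteria mirroring IC1/IC4 and EC1/EC5 are illustrative:

```python
def select_paper(paper, inclusion_criteria, exclusion_criteria):
    """Retain a paper iff it satisfies at least one inclusion criterion
    (ICs are ORed) and no exclusion criterion (any EC rules it out)."""
    return (any(ic(paper) for ic in inclusion_criteria)
            and not any(ec(paper) for ec in exclusion_criteria))

# Illustrative predicates on a hypothetical paper record.
ics = [
    lambda p: p["uses_ml_for_bc_treatment"],   # IC1: ML for BC treatment
    lambda p: 1991 <= p["year"] <= 2019,       # IC4: publication window
]
ecs = [
    lambda p: p["language"] != "English",      # EC1: non-English paper
    lambda p: p["pages"] <= 3,                 # EC5: short paper
]
```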

2.3 Data Extraction

A data extraction strategy was established by filling in a form through which the most relevant information was selected to answer each RQ:
• RQ1: requires the publication source, channel, date, author and abstract of each of the selected studies.
• RQ2: the selected papers can be classified into the following types according to the topic they introduce [4]:
✓ Evaluation Research (ER): evaluation of a DM technique adopted to deal with BC treatment.
✓ Solution Proposal (SP): proposition of a new DM approach for the treatment of BC.
✓ Experience Papers (EP): the researchers report their results when experiencing a DM tool or technique applied to BC treatment.
✓ Review (R): works mapping the present situation of BC treatment with DM.
• RQ3: requires identification of the DM techniques used in previous BC treatment literature.
Data extraction bias: data extraction is a sensitive task that demands great care. To avoid poor data extraction, we used an Excel file to carefully evaluate the selected papers against our research questions.

3 Results

This section reports the results for the research questions of Table 1. We first present an overview of the selection process, then the outcomes of RQs 1–3.

3.1 Selection Process

Figure 1 shows that 300 candidate papers were found by applying the search string to the six digital libraries. After filtering with the ECs, we retained 139 papers; we then discarded 59 studies using the ICs. After that, we eliminated duplicated studies to finally select 44 papers to answer RQs 1–3.
Fig. 1. Selection process.

3.2 RQ1: What Are the Publication Sources, and in Which Years Were the Selected Studies Related to Data-Mining Application for Breast Cancer Treatment Published?
Table 2 shows that the 44 selected papers were published in different channels (especially journals or conferences): 44.72% of the studies were published in journals, 38.64% were found in conferences, while 16.64% had other sources, such as books or reports.

Table 2. Publication sources

Source | # of papers | (%)
Conferences
ACM-BCB: ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics | 2 | 4.55
Annual International Conference of the IEEE Engineering in Medicine and Biology - Proceedings | 1 | 2.27
International Conference on Information and Knowledge Management, Proceedings | 1 | 2.27
Other conferences | 13 | 29.55
Total conferences | 17 | 38.64
Journals
Breast Cancer Research and Treatment | 3 | 6.38
Journal of Medical Systems | 3 | 6.38
Journal of Biomedical Informatics | 2 | 4.26
Other journals | 13 | 27.68
Total journals | 21 | 44.72
Other sources | 6 | 16.64
We observe from Table 2 that Breast Cancer Research and Treatment (6.38%), the Journal of Medical Systems (6.38%) and the Journal of Biomedical Informatics (4.26%) are the most targeted journals, and that the ACM-BCB conference has published only two papers (4.55%).
We observe from Fig. 2 that researchers have only recently begun to focus on BC treatment, which explains the scarce work on the topic in the past. The number of studies increased from 2016 to 2019: 40% of the studies done in 2016 were published in conferences, in 2017 66.6% were published in conferences, and in 2019 100% of the studies were published in journals.

Fig. 2. Distribution of papers over years and sources.

3.3 RQ2: How Far Has Data Mining Contributed in Making Decisions on Breast Cancer Treatment?
The selected papers can be divided into four research types: Evaluation Research (ER) [8], Solution Proposal (SP) [7], Experience Papers or empirical evaluations (EP) [9], and Reviews (R) [10]. 46.51% of the selected papers were SP, proposing new DM tools or techniques dealing with BC treatment, 14% were ER, 16% were R, and 23.26% were classified as EP [11]. Note that Experience studies are in general difficult to carry out due to the difficulty of obtaining patient data; moreover, they need medical expertise to evaluate and validate the outcome. The nature of the treatment task requires experience: we need solution proposals and experience papers, which are then assessed in ER papers, as well as studies that gather all of this in reviews.
Fig. 3. Types of contributions (SP, ER, EP, R) over the years.

3.4 RQ3: What Are the Most Common Techniques and Methods to Deal with BC Treatment?
Figure 4 shows the distribution of the DM techniques used for each DM objective. The four objectives were investigated as follows: association 21.4%, classification 52.3%, clustering 9.5%, and prediction 16.6%. We observe that DT [12–15] is the most used DM technique for classification with a percentage of 50%, followed by fuzzy logic-based models (18%), SVM [16] (18.1%) and association rules [17] (18.1%), and then neural networks [18], GA [18] and BN [19] with 4.5% each. For the clustering objective, k-means [20] (75%) is the most frequent, followed by association rules (25%). For prediction, neural networks and decision trees are the most used with 43% each, followed by association rules (14.3%). Association rules are the most present for the association objective (55.6%), followed by fuzzy logic-based models and Apriori with 22.2% each.

Fig. 4. Distribution of DM techniques per objective (ARM: Association rules mining. FM:
Fuzzy methods. BN: Bayesian network. DT: Decision tree. GA: Genetic algorithm. NN: Neural
networks. SVM: Support vector machine).
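The per-objective percentages reported in Fig. 4 amount to a cross-tabulation of studies by (objective, technique); the sketch below computes such a distribution on an invented toy sample, not on the actual 44 papers:

```python
from collections import Counter, defaultdict

def technique_distribution(studies):
    """For each DM objective, compute each technique's percentage share
    among the studies pursuing that objective."""
    per_objective = defaultdict(Counter)
    for objective, technique in studies:
        per_objective[objective][technique] += 1
    return {
        obj: {tech: 100 * n / sum(counts.values())
              for tech, n in counts.items()}
        for obj, counts in per_objective.items()
    }

# Toy sample: three classification studies using DT, one using SVM,
# and one clustering study using k-means.
sample = [("classification", "DT"), ("classification", "DT"),
          ("classification", "DT"), ("classification", "SVM"),
          ("clustering", "k-means")]
dist = technique_distribution(sample)
```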
4 Discussion

This section discusses the results of this systematic map of data analytics in BC treatment, analysing the results obtained for each RQ.

4.1 RQ1: What Are the Publication Sources, and in Which Years Were the Selected Studies Related to Data-Mining Application for Breast Cancer Treatment Published?
This study selected 44 relevant articles dealing with DM techniques for BC treatment. The variety of sources can be explained by the variety of DM techniques and objectives used to solve real-world problems; these sources are related to computer science and to data analytics applied to BC treatment. Even though the number of studies on BC treatment is still low, it has increased in recent years. Therefore, based on the result of Fig. 3, where SP is the most frequent type of study, we expect the coming years to bring very interesting outcomes in the treatment of BC using DM techniques.

4.2 RQ2: How Far Has Data Mining Contributed in Making Decisions on Breast Cancer Treatment?
Figure 3 shows that the number of SP is the highest, which can be explained by the fact that DM-based solutions could bring an interesting push to BC treatment studies. SP reached its peak in 2017, whereas experience papers started to appear notably in 2017; evaluation research is scarce due to the lack of empirical evaluations assessing the treatment task. This shows that the use of DM techniques in BC treatment is still not mature. EP studies are present, and this is very important in the medical context, because it gives more credibility to any study guiding towards the suitable treatment.

4.3 RQ3: What Are the Most Common DM Techniques to Deal with BC Treatment?
Several DM techniques were evaluated. The decision tree is the most frequent DM technique for the classification objective, which is the most recurrent task in BC treatment. Moreover, association rules are researchers' preferred choice for the association task; the well-known k-means is still the most popular technique for clustering issues, whereas neural networks remain very accurate for prediction problems.
We can explain this by the fact that decision trees are fast and easy to use for classification, especially given the large choice of libraries offering this technique as reusable functionality across many languages. In addition, decision trees can be easily interpreted by the oncologist while taking the decision, without any background in data mining. As for association, association rules remain the technique most preferred by researchers for the association task; they are easily understood by oncologists, who can therefore trust them
when deciding on the BC treatments. For clustering, k-means is still the most powerful and popular clustering technique, because it is easy to use and can provide accurate clustering [20]. For prediction, neural networks were widely used due to their robustness in modelling complex relationships and their flexibility to adapt to more complex situations [6–8].
Since no DM technique can outperform all the others in all contexts, many selected studies combined more than two DM techniques to deal with a specific DM objective for BC treatment [21]. Combining more than two techniques also avoids the limitations and consolidates the advantages of the individual techniques [22].
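One simple way to combine several techniques is majority voting over the individual models' outputs; the sketch below is a generic illustration with invented stand-in classifiers, not a method taken from the selected studies:

```python
from collections import Counter

def majority_vote(classifiers, patient):
    """Combine several trained classifiers by majority vote; ties are
    broken by the order in which labels first reach the top count."""
    votes = [clf(patient) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Stand-in "classifiers": each maps a hypothetical patient record to a
# treatment class; real studies would use trained DT/SVM/NN models.
dt  = lambda p: "chemotherapy" if p["tumor_size_mm"] > 20 else "surgery"
svm = lambda p: "chemotherapy" if p["node_positive"] else "surgery"
nn  = lambda p: "surgery"

decision = majority_vote([dt, svm, nn],
                         {"tumor_size_mm": 25, "node_positive": True})
```

Here two of the three stand-in models vote for chemotherapy, so the ensemble follows the majority.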

5 Conclusion and Future Work

This study carried out a systematic map of data analytics in breast cancer treatment. It summarized and analyzed 44 selected papers published between 1991 and 2019 according to three RQs. The findings per RQ are: (RQ1) the use of DM among researchers has increased in recent years, and the number of publications has grown remarkably since 2016; most of the papers were published in journals (44.72%) or conferences (38.64%). (RQ2) This SMS found that the contribution of DM to BC treatment is still very low, although it has increased recently; therefore, researchers should devote more effort to the treatment task. Most of the selected papers were captured as SP and EP. (RQ3) Classification is the most frequent DM objective, because the underlying problem is a classification one; for classification, the decision tree gained the most interest over the years, followed by fuzzy methods and SVM. As future work we aim to: (1) take advantage of this SMS outcome to perform a systematic literature review of the ML techniques investigated in BC treatment; and (2) implement a BC treatment solution by evaluating the different ML techniques.

References
1. Soria, D., Garibaldi, J.M., Green, A.R., Powe, D.G., Nolan, C.C., Lemetre, C., Ball, G.R., Ellis, I.O.: A quantifier-based fuzzy classification system for breast cancer patients. Artif. Intell. Med. 58, 175–184 (2013). https://doi.org/10.1016/j.artmed.2013.04.006
2. Umesh, D.R., Ramachandra, B.: Association rule mining-based predicting breast cancer recurrence on SEER breast cancer data. In: 2015 International Conference on Emerging Research in Electronics, Computer Science and Technology, ICERECT 2015, pp. 376–380 (2016). https://doi.org/10.1109/ERECT.2015.7499044
3. Alford, S.H., Michal, O.-F., Ya’ara, G.: Harvesting population data to aid treatment decisions in heavily pre-treated advanced breast cancer. Breast 36, S76 (2017). https://doi.org/10.1016/s0960-9776(17)30764-6
4. Idri, A., Chlioui, I., Ouassif, B.E.: A systematic map of data analytics in breast cancer. In: Proceedings of the Australasian Computer Science Week Multiconference - ACSW 2018, pp. 1–10. ACM Press, Brisbane (2018)
5. Breast Cancer (female) - Treatment - NHS Choices. http://www.nhs.uk/Conditions/Cancer-of-the-breast
6. Khrouch, S., Ezziyyani, M., Ezziyyani, M.: Decision System for the Selection of the Best Therapeutic Protocol for Breast Cancer Based on Advanced Data Mining: A Survey. Springer, Cham (2019)
7. Fan, Q., Zhu, C.J., Xiao, J.Y., Wang, B.H., Yin, L., Xu, X.L., Rong, F.: An application of Apriori Algorithm in SEER breast cancer data. In: Proceedings - International Conference on Artificial Intelligence and Computational Intelligence, AICI 2010, vol. 3, pp. 114–116 (2010). https://doi.org/10.1109/AICI.2010.263
8. Tran, W.T., Jerzak, K., Lu, F.-I., Klein, J., Tabbarah, S., Lagree, A., Wu, T., Rosado-Mendez, I., Law, E., Saednia, K., Sadeghi-Naini, A.: Personalized breast cancer treatments using artificial intelligence in radiomics and pathomics. J. Med. Imaging Radiat. Sci. 50, 1–10 (2019). https://doi.org/10.1016/j.jmir.2019.07.010
9. Shen, S., Wang, Y., Zheng, G., Jia, D., Lu, A., Jiang, M.: Exploring rules of traditional Chinese medicine external therapy and food therapy in treatment of mammary gland hyperplasia with text mining. In: Proceedings - 2014 IEEE International Conference on Bioinformatics and Biomedicine, IEEE BIBM 2014, pp. 158–159 (2014). https://doi.org/10.1109/BIBM.2014.6999347
10. Ondrouskova, E., Sommerova, L., Nenutil, R., Coufal, O., Bouchal, P., Vojtesek, B., Hrstka, R.: AGR2 associates with HER2 expression predicting poor outcome in subset of estrogen receptor negative breast cancer patients. Exp. Mol. Pathol. 102, 280–283 (2017). https://doi.org/10.1016/j.yexmp.2017.02.016
11. Oskouei, R.J., Kor, N.M., Maleki, S.A.: Data mining and medical world: breast cancers’ diagnosis, treatment, prognosis and challenges. Am. J. Cancer Res. 7, 610–627 (2017)
12. Razavi, A.R., Gill, H., Ahlfeldt, H., Shahsavar, N.: Predicting metastasis in breast cancer: comparing a decision tree with domain experts. J. Med. Syst. 31, 263–273 (2007). https://doi.org/10.1007/s10916-007-9064-1
13. Chao, C.M., Yu, Y.W., Cheng, B.W., Kuo, Y.L.: Construction the model on the breast cancer survival analysis use support vector machine, logistic regression and decision tree. J. Med. Syst. 38, 1–7 (2014). https://doi.org/10.1007/s10916-014-0106-1
14. Kuo, W.J., Chang, R.F., Chen, D.R., Lee, C.C.: Data mining with decision trees for diagnosis of breast tumor in medical ultrasonic images. Breast Cancer Res. Treat. 66, 51–57 (2001). https://doi.org/10.1023/A:1010676701382
15. Takada, M., Sugimoto, M., Ohno, S., Kuroi, K., Sato, N., Bando, H., Masuda, N., Iwata, H., Kondo, M., Sasano, H., Chow, L.W.C., Inamoto, T., Naito, Y., Tomita, M., Toi, M.: Predictions of the pathological response to neoadjuvant chemotherapy in patients with primary breast cancer using a data mining technique. Breast Cancer Res. Treat. 134, 661–670 (2012). https://doi.org/10.1007/s10549-012-2109-2
16. Coelho, D., Sael, L.: Breast and prostate cancer expression similarity analysis by iterative SVM based ensemble gene selection. In: Proceedings of International Conference on Information and Knowledge Management, pp. 23–26 (2013). https://doi.org/10.1145/2512089.2512099
17. He, Y., Zheng, X., Sit, C., Loo, W.T.Y., Wang, Z.Y., Xie, T., Jia, B., Ye, Q., Tsui, K., Chow, L.W.C., Chen, J.: Using association rules mining to explore pattern of Chinese medicinal formulae (prescription) in treating and preventing breast cancer recurrence and metastasis. J. Transl. Med. 10(Suppl 1), 1–8 (2012). https://doi.org/10.1186/1479-5876-10-s1-s12
18. Hasan, M., Büyüktahtakın, E., Elamin, E.: A multi-criteria ranking algorithm (MCRA) for determining breast cancer therapy. Omega 82, 83–101 (2019). https://doi.org/10.1016/j.omega.2017.12.005
19. Turki, T., Wei, Z.: Learning approaches to improve prediction of drug sensitivity in breast cancer patients. In: Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, October 2016, pp. 3314–3320 (2016). https://doi.org/10.1109/EMBC.2016.7591437
20. Radha, R., Rajendiran, P.: Using K-means clustering technique to study of breast cancer. In: Proceedings - 2014 World Congress on Computing and Communication Technologies, WCCCT 2014, pp. 211–214 (2014). https://doi.org/10.1109/WCCCT.2014.64
21. Fahrudin, T.M., Syarif, I., Barakbah, A.R.: Feature selection algorithm using information gain based clustering for supporting the treatment process of breast cancer. In: 2016 International Conference on Informatics and Computing, ICIC 2016, pp. 6–11 (2017). https://doi.org/10.1109/IAC.2016.7905680
22. Çakır, A., Demirel, B.: A software tool for determination of breast cancer treatment methods using data mining approach. J. Med. Syst. 35, 1503–1511 (2011). https://doi.org/10.1007/s10916-009-9427-x
Enabling Smart Homes Through Health
Informatics and Internet of Things
for Enhanced Living Environments

Gonçalo Marques¹,² and Rui Pitarma²,³

¹ Instituto de Telecomunicações, Universidade da Beira Interior, 6201-001 Covilhã, Portugal
[email protected]
² Polytechnic Institute of Guarda, 6300-559 Guarda, Portugal
[email protected]
³ CISE - Electromechatronic Systems Research Centre, Universidade da Beira Interior, 6201-001 Covilhã, Portugal

Abstract. As people spend most of their time inside buildings, indoor envi-
ronment quality must be monitored in real-time for enhanced living environ-
ments and occupational health. Indoor environmental quality assessment is
based on the satisfaction of the thermal, sound, light and air quality conditions.
The indoor quality patterns can be directly used to promote health and well-
being. With the proliferation of the Internet of Things related technologies,
smart homes must incorporate monitoring solutions for data acquisition, trans-
mission, and microsensors for several real-time monitoring activities. This paper
presents a low-cost and scalable multi-sensor smart home solution based on
Internet of Things for enhanced indoor quality considering acoustic, thermal and
luminous comfort. The proposed system incorporates three sensor modules for
data collection and uses Wi-Fi communication technology for Internet access.
The system has been developed using open-source and mobile computing
technologies for real-time data visualization and analytics. The acquisition
modules incorporate light intensity and colour temperature, particulate matter,
formaldehyde, relative humidity, ambient temperature and sound sensor capa-
bilities. The results have successfully validated the scalability, reliability and
easy installation of the proposed system.

Keywords: Ambient assisted living · Enhanced living environments · Health
informatics · Indoor environmental quality · Internet of Things · Smart home

1 Introduction

The proliferation of, and continuous technological improvements in, numerous fields of
computer science contribute every day to decreasing the cost of smart home design. The
smart home concept aims to address numerous applications and presents an effective
and efficient method for the digitalization of people’s daily routine activities and the
promotion of health and well-being [1]. Smart homes incorporate an ecosystem of
medical systems, which include medical sensors, microcontrollers, wireless

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 76–85, 2020.
https://doi.org/10.1007/978-3-030-45697-9_8
Enabling Smart Homes Through Health Informatics and IoT for ELE 77

communication technologies, and open-source software platforms for data visualization
and analytics. Therefore, smart homes present relevant potential to address several
healthcare issues through the incorporation of mobile computing technologies and
medical systems.
The Internet of Things (IoT) is the concept that involves the ubiquitous presence of
a diversity of cyber-physical systems that support sensing and communication capa-
bilities [2]. Ambient Assisted Living (AAL) is a multi-disciplinary domain which is
related to new methods for personalized healthcare systems using microcontrollers,
sensors, actuators, computer networks, open-source frameworks and mobile computing
technologies to design enhanced living environments (ELE) [3, 4]. IoT provides several
benefits to smart homes, healthcare and AAL [5]. Smart homes are typically designed
to support older adults, integrating healthcare systems and real-time monitoring
features for enhanced occupational health and well-being. ELE is a concept closely
related to the AAL field. However, ELE are more associated with information and
communications technologies than AAL [6]. The smart home concept can be directly
associated with the ELE research field as the smart homes incorporate algorithms,
platforms, and systems to maintain an independent and autonomous living of older
adults for as long as possible. A smart home incorporates a set of hardware and
software systems that deliver a wide range of services to improve health and well-being
for all the individuals in general and older adults in particular.
People typically spend most of their time inside buildings. Therefore, indoor
environmental quality (IEQ) must be monitored in real-time for enhanced occupational
health and well-being. IEQ assessment is based on the satisfaction of the thermal
comfort, sound, light and air quality conditions [7].
Thermal comfort is a primary concern of occupants and is usually achieved within
temperature ranges of 17–30 °C; it depends on physical factors such as humidity and
air temperature, but also on individual considerations [8]. Therefore, thermal
comfort is not easy to measure and study.
Acoustic comfort also has a direct impact on people’s health and well-being. The
noise effects on health are related to annoyance, sleep and cognitive performance for
both adults and children but can also be associated with raised blood pressure [9].
Noise pollution is a risk factor for people who have pregnancy-related hypertension and
preeclampsia [10]. Moreover, noise exposure is also associated with cardiovascular
disease [11], psychiatric problems and anti-social behavior. People’s health and well-
being directly depend on their sleep quality which is affected by sound levels [12].
Therefore, developed countries have designed policies and laws for noise regulation
[13]. The World Health Organization states that noise exposure is increasing in Europe
[14]. Considering the effects of noise pollution on health, it is particularly relevant to
monitor indoor living environments for ELE and occupational health [15]. It is also
pertinent to develop more aggressive policies for sound level supervision [16]. Furthermore,
most people are concerned about the health problems related to noise pollution and
acknowledge the critical need to design efficient and effective mechanisms for noise
assessment and control [17].

The Environmental Protection Agency ranked indoor air quality (IAQ) among the top
five environmental risks to public health [18]. Therefore, IAQ monitoring must be a
requisite for all buildings. Reduced air quality levels are associated with numerous
health effects such as headaches, dizziness, restlessness, difficulty breathing, increased
heart rate, elevated blood pressure, coma and asphyxia [19–21].
Indoor light levels are also related to people’s health and well-being [22, 23], and
daylight exposure in buildings is also related to energy costs [24]. Luminous comfort
corresponds to the individual’s satisfaction regarding environmental light levels and,
like thermal comfort, depends on physical parameters that can be measured, such as
light intensity and colour, but also on individual conditions. Attention to this
topic has increased, as it is now understood that light levels are directly related to
people’s psychological health, performance and productivity [25].
IEQ assessment can reveal patterns in indoor living quality, which can be
directly used to plan interventions for ELE. Given the proliferation of IoT
technologies, smart homes must incorporate different monitoring solutions that make
use of open-source technologies for data acquisition and transmission, and microsensors
for several monitoring activities such as noise monitoring, activity recognition, and
thermal and light comfort assessment [26–33]. Therefore, this paper presents an
integrated solution for IEQ, which provides thermal, acoustic and luminous comfort
supervision. This solution incorporates open-source and mobile computing technolo-
gies for data consulting and analysis. The rest of the paper is structured as follows:
Sect. 2 presents the materials and methods used in the design of the proposed solution;
Sect. 3 presents the results and discussion, and the conclusion is presented in Sect. 4.

2 Materials and Methods

The system architecture of the proposed multi-sensor smart home solution is presented
in Fig. 1. The proposed method uses a native Wi-Fi compatible microcontroller for data
acquisition, processing and transmission. The data collected is stored in a SQL Server

Fig. 1. System architecture.


Enabling Smart Homes Through Health Informatics and IoT for ELE 79

database using a web application program interface (API) developed in .NET. This API
contains the web services to receive and manage the data collected by the microcon-
troller and also to provide the data output for webpage visualization and analytics
features.
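As a concrete illustration of the data path just described, the sketch below shows how a sensor module might serialize one acquisition cycle into JSON before posting it to the web API. The field names, module identifiers, and endpoint URL are hypothetical: the paper does not publish the API's schema, so this is only a plausible shape for such a payload, not the actual .NET contract.

```python
import json
import time

API_ENDPOINT = "https://example.org/api/samples"  # hypothetical URL

def build_sample(module_id, readings, timestamp=None):
    """Serialize one acquisition cycle as a JSON document for the web API."""
    return json.dumps({
        "module": module_id,   # e.g. "iaq", "acoustic" or "luminous"
        "timestamp": timestamp if timestamp is not None else int(time.time()),
        "readings": readings,  # sensor name -> measured value
    })

payload = build_sample("iaq", {"temperature_c": 21.4, "humidity_pct": 48.0,
                               "pm2_5_ugm3": 12.0, "hcho_mgm3": 0.03})
# An ESP8266 module would POST `payload` to API_ENDPOINT over Wi-Fi; the web
# services would then store it in the database and expose it for visualization.
```

Keeping the payload a flat sensor-name-to-value map lets new sensor modules be added without changing the API, which matches the modular scalability argued for later in the paper.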
The proposed method incorporates several sensing features such as light intensity
and colour temperature, particulate matter (PM), formaldehyde, relative humidity,
ambient temperature and sound level using three sensor modules. Each sensor module
is connected to an ESP8266 microcontroller. The sensor selection was conducted with
the primary goal of creating an ELE to promote occupational health and enhanced IEQ
(Fig. 2).

Fig. 2. The proposed multi-sensor system block diagram representing the sensors’
components.

The PMS5003ST sensor (Beijing Plantower Co., Ltd., Beijing, China) has been
used for air quality and thermal comfort assessment. This sensor supports temperature,
humidity, PM and formaldehyde sensing features. It is a 5 V sensor with an active
current consumption of 100 mA, a standby current of 200 µA, and a response time of
less than 10 s. The particle counting efficiency is 98%, the PM2.5 measurement range
is 0–2000 µg/m³, and the maximum error is ±10% (PM2.5 100–500 µg/m³). The tem-
perature range is from −10 °C to 50 °C, and the maximum error is ±0.5 °C. The
relative humidity range is 0–90%, and the maximum error is ±2%. Regarding the
formaldehyde sensing capabilities, the range is 0–2 mg/m³, and the maximum error is
less than ±5% of the output value. The PMS5003ST is connected using the I2C
interface.
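PMS5003-family sensors report their readings as binary data frames. The sketch below parses such a frame and verifies its checksum, assuming the widely documented base layout (0x42 0x4D start bytes, big-endian 16-bit words, and a trailing 16-bit checksum equal to the byte sum of everything before it); the ST variant appends extra fields (formaldehyde, temperature, humidity) whose offsets are not reproduced here, so treat this as a generic illustration rather than the exact ST frame map.

```python
import struct

def parse_pms_frame(frame: bytes) -> dict:
    """Parse a PMS5003-family binary frame and verify its checksum.

    Assumes the commonly documented base layout; the ST variant's extra
    formaldehyde/temperature/humidity words are omitted for brevity.
    """
    if frame[0:2] != b"\x42\x4d":
        raise ValueError("bad start bytes")
    declared = struct.unpack(">H", frame[-2:])[0]
    if sum(frame[:-2]) & 0xFFFF != declared:
        raise ValueError("checksum mismatch")
    # words[0] is the declared frame length; words[1:] are the data fields
    # (PM1.0, PM2.5, PM10 at CF=1, then atmospheric values, ...).
    words = struct.unpack(">" + "H" * (len(frame) // 2 - 2), frame[2:-2])
    return {"pm1_0": words[1], "pm2_5": words[2], "pm10": words[3]}
```

Checksum verification matters in practice: a corrupted serial byte would otherwise silently produce an out-of-range PM value in the database.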
The acoustic comfort is monitored using the calibrated sound sensor (DFRobot,
Shanghai, China), which is connected using analogue communication. This sensor has
a measurement range of 30 dBA–130 dBA with a measurement error of ±1.5 dBA.
The frequency response is 31.5 Hz–8.5 kHz and the response time is 125 ms.
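Because decibel values are logarithmic, a plain arithmetic mean of dBA samples understates loud events. When readings like these are aggregated over a reporting interval, an energy-based average (the equivalent continuous level, Leq) is the usual choice. The following is a generic sketch of that computation, not the vendor's documented processing.

```python
import math

def leq(dba_samples):
    """Equivalent continuous sound level of equally spaced dBA samples.

    Converts each sample back to relative sound energy, averages the
    energies, and returns the result in dBA.
    """
    if not dba_samples:
        raise ValueError("no samples")
    mean_energy = sum(10 ** (x / 10) for x in dba_samples) / len(dba_samples)
    return 10 * math.log10(mean_energy)
```

For example, an interval split evenly between 50 dBA and 70 dBA averages to about 67 dBA, not 60 dBA, because the louder half dominates the sound energy.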

The TCS3472 sensor (Adafruit Industries, New York, United States) has been
selected to monitor the luminous comfort. This sensor can detect RGB light color
temperature and intensity levels. It supports high sensitivity and a wide dynamic
range, which allows a reliable assessment of lighting conditions. The sensor is connected
using the I2C interface.
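From the TCS3472's raw red, green, and blue counts, illuminance and correlated colour temperature are typically derived with the approximation found in common open-source drivers for this part (a CIE mapping followed by McCamy's formula). The sketch below reproduces that approximation; the coefficients are the generic application-note values, not a per-device calibration, and real readings also depend on integration time, gain, and the optical setup.

```python
def rgb_to_lux_cct(r, g, b):
    """Approximate illuminance (lux) and colour temperature (K) from raw RGB.

    Coefficients follow the approximation used by common TCS3472 drivers;
    they are illustrative, not a calibration of the actual prototype.
    """
    # Map raw RGB counts to CIE XYZ tristimulus values.
    x = -0.14282 * r + 1.54924 * g - 0.95641 * b
    y = -0.32466 * r + 1.57837 * g - 0.73191 * b  # Y doubles as illuminance
    z = -0.68202 * r + 0.77073 * g + 0.56332 * b
    xc, yc = x / (x + y + z), y / (x + y + z)
    # McCamy's cubic approximation of correlated colour temperature.
    n = (xc - 0.3320) / (0.1858 - yc)
    cct = 449.0 * n**3 + 3525.0 * n**2 + 6823.3 * n + 5520.33
    return y, cct
```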
The data collected by the proposed solution can be used not only to provide a
reliable IEQ assessment of the monitored space but also to support the energy man-
agement of the building through the web portal, anywhere and at any time. The cost of
the system is presented in Table 1; the total cost is below 175 USD.

Table 1. System prototype cost.

Component         Units  Price (USD)
ESP8266           3      20.97
PMS5003ST         1      53.65
Sound sensor      1      60.30
TCS3472           1      10.19
Prototyping case  3      27.00

The proposed multi-sensor system has been designed using the ESP8266, a low-
cost Wi-Fi microchip developed by Espressif Systems in Shanghai, China. This
microcontroller incorporates a 32-bit RISC microprocessor core based on the Tensilica
Xtensa Diamond Standard 106Micro with 80 MHz clock speed and supports 32 KiB
instruction RAM (Fig. 2). The modules are powered using a 230 V–5 V AC-DC 2 A
power supply. This smart home system is based on Wi-Fi connectivity for Internet
access to provide real-time IEQ data monitoring. Furthermore, the system supports easy
Wi-Fi configuration using a Wi-Fi compatible device with a web browser. When the
system is connected to a Wi-Fi network, the access credentials are saved on the
hardware memory for future access. If no saved network is available, the system enters
hotspot mode, and the user can access this hotspot to configure the Wi-Fi network to
which the system should connect. After initialization, the system performs data
acquisition, and the data is then processed. When the defined timer overflows, the
system performs the data transmission and sends the collected data to the database for
storage. The sensing activities are performed every 15 s, but this timer can be
updated according to the user’s requirements. Figure 3 represents the flowchart of the
sensor modules used in the proposed multi-sensor smart home system.

Fig. 3. Flow diagram of the acquisition module used in the proposed multi-sensor system.
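The control flow of the acquisition module can be sketched as a small loop: read the sensors continuously and transmit whenever the reporting timer has elapsed. The sketch below models only that timer logic in plain Python, with the sensor read and network send stubbed out as callables; on the real ESP8266 firmware these stand-ins would be the sensor drivers and an HTTP upload over Wi-Fi.

```python
SAMPLE_PERIOD_S = 15  # default sensing interval reported in the paper

def acquisition_step(read_sensors, transmit, now, last_sent):
    """One pass of the module's loop; returns the updated last-transmit time.

    `read_sensors` and `transmit` are stand-ins for the sensor drivers and
    the HTTP upload over Wi-Fi described in the text.
    """
    readings = read_sensors()                 # acquire and process the data
    if now - last_sent >= SAMPLE_PERIOD_S:    # the "timer overflow" branch
        transmit(readings)                    # send to the database via the API
        return now
    return last_sent
```

Separating the timer check from the sensor read keeps acquisition responsive while letting the reporting interval be reconfigured independently, as the paper's user-adjustable timer suggests.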

3 Results and Discussion

The proposed smart home solution supports IAQ, thermal comfort, luminous comfort
and acoustic comfort for enhanced occupational health and well-being. The proposed
system has been tested in a laboratory of a Portuguese university (Fig. 4). The mon-
itored room is typically occupied by 15 persons, 4 h per day, five days per week, and is
used for teaching activities. The laboratory consists of two rooms; the monitored room
has an area of around 64 m² and was monitored in real-time for two months.

Fig. 4. Installation schema of the tests conducted. R – router; 1 – luminous comfort module, 2 –
IAQ and thermal comfort module, 3 – acoustic comfort module.

The tests performed had the primary goal of validating the functional
requirements of the proposed monitoring system. The data collected confirms the
operability and performance of the proposed smart home system for real-time data
collection and visualization.

Table 2. Comparison of the proposed system and smart home monitoring solutions available in
the literature.

Microcontroller            Sensors                                  Connectivity    IAQ  Acoustic  Luminous  Thermal
                                                                                        comfort   comfort   comfort
PIC 24F16KA102 [34]        Temperature, relative humidity, CO2      nRF24L01        √    –         –         √
Arduino UNO [35]           CO2                                      ZigBee          √    –         –         –
Waspmote [36]              CO, CO2, PM, temperature,                ZigBee          √    –         –         √
                           relative humidity
STM32F103RC [37]           PM, temperature, relative humidity       IEEE 802.15.4k  √    –         –         √
ESP8266 [proposed system]  Temperature, relative humidity, noise,   Wi-Fi           √    √         √         √
                           PM, formaldehyde, light

Several cost-effective and open-source monitoring systems have been proposed
[34–38]; a summarised comparison is presented in Table 2. From its analysis, it is
possible to conclude that the referred solutions are developed with different
microcontrollers: Arduino, PIC, Waspmote and STMicroelectronics. The proposed
solution uses an ESP8266 with a CPU clock speed of 80 MHz, which is higher than
those used in [34] (PIC, 32 MHz), [35] (Arduino, 16 MHz), [36] (Waspmote,
14.74 MHz) and [37] (STMicroelectronics, 72 MHz). Regarding connectivity, all the
methods referred to in Table 2 incorporate wireless communication technologies. The
proposed system uses Wi-Fi, a standard communication technology already deployed
in most buildings in developed countries.
All the methods presented in Table 2 support IAQ monitoring features; however,
acoustic comfort and luminous comfort monitoring is not conducted in any of these
studies. The smart home system proposed in this paper provides integrated IEQ
monitoring for enhanced occupational health and includes six types of sensorial
capabilities, and other sensors can be added to monitor specific parameters. Fur-
thermore, this system provides an easy installation process that can be carried out by
the end-user. On the one hand, the easy configuration avoids installation costs, and the
modular design leads to enhanced scalability, as installation can start with one unit and
new modules can be added according to the needs of the case study. On the other
hand, the proposed system is based on open-source technologies, which is particularly
significant because other researchers and manufacturers can develop new compatible
sensors to be integrated into this smart home solution. However, the proposed smart
home solution has some limitations. The prototype’s appearance needs to be improved,
and the proposed method needs additional experimental validation to ensure calibration
and accuracy. In the future, the authors aim to test the sensors’ accuracy by studying the
real average output error and calculating the real response time for each sensor.
Moreover, the primary goal is to make technical improvements, including the

development of critical alerts and notifications that inform the building manager when
the thermal, acoustic and luminous comfort requirements are not met.

4 Conclusion

In this paper, a low-cost, open-source and scalable multi-sensor smart home solution
based on IoT for enhanced IEQ considering acoustic, thermal and luminous comfort, is
presented. The proposed method incorporates three sensor modules for data collection
and uses Wi-Fi communication technology for Internet access. The data collected is
available in real-time for data visualization and analytics through a web portal. This
smart home solution provides easy installation and easy Wi-Fi configuration methods.
Furthermore, the proposed solution was successfully tested and validated to verify its
functional architecture. The tests conducted present positive results towards an
essential contribution to enhanced occupational health and well-being. Moreover,
based on the data collected in the tests performed, we conclude that, under certain
conditions, IEQ levels are significantly below those considered healthy for
people’s health and well-being. Nevertheless, the proposed solution needs further
experimental validation to ensure calibration and accuracy.

References
1. Wilson, C., Hargreaves, T., Hauxwell-Baldwin, R.: Smart homes and their users: a
systematic analysis and key challenges. Pers. Ubiquit. Comput. 19, 463–476 (2015)
2. Marques, G., Pitarma, R., Garcia, N.M., Pombo, N.: Internet of Things architectures,
technologies, applications, challenges, and future directions for enhanced living environ-
ments and healthcare systems: a review. Electronics 8, 1081 (2019). https://doi.org/10.3390/electronics8101081
3. Ganchev, I., Garcia, N.M., Dobre, C., Mavromoustakis, C.X., Goleva, R. (eds.): Enhanced
Living Environments: Algorithms, Architectures, Platforms, and Systems. Springer, Cham
(2019). https://doi.org/10.1007/978-3-030-10752-9
4. Marques, G., Garcia, N., Pombo, N.: A survey on IoT: architectures, elements, applications,
QoS, platforms and security concepts. In: Mavromoustakis, C.X., Mastorakis, G., Dobre, C.
(eds.) Advances in Mobile Cloud Computing and Big Data in the 5G Era, pp. 115–130.
Springer, Cham (2017). https://doi.org/10.1007/978-3-319-45145-9_5
5. Marques, G.: Ambient assisted living and Internet of Things. In: Cardoso, P.J.S., Monteiro,
J., Semião, J., Rodrigues, J.M.F. (eds.) Harnessing the Internet of Everything (IoE) for
Accelerated Innovation Opportunities, pp. 100–115. IGI Global, Hershey (2019). https://doi.org/10.4018/978-1-5225-7332-6.ch005
6. Dobre, C., Mavromoustakis, C.X., Garcia, N.M., Mastorakis, G., Goleva, R.I.: Introduction
to the AAL and ELE systems. In: Ambient Assisted Living and Enhanced Living
Environments, pp. 1–16. Elsevier (2017). https://doi.org/10.1016/B978-0-12-805195-5.00001-6
7. Yang, L., Yan, H., Lam, J.C.: Thermal comfort and building energy consumption
implications – a review. Appl. Energy 115, 164–173 (2014). https://doi.org/10.1016/j.apenergy.2013.10.062

8. Havenith, G., Holmér, I., Parsons, K.: Personal factors in thermal comfort assessment:
clothing properties and metabolic heat production. Energy Build. 34, 581–591 (2002).
https://doi.org/10.1016/S0378-7788(02)00008-7
9. Stansfeld, S.A., Matheson, M.P.: Noise pollution: non-auditory effects on health. Br. Med.
Bull. 68, 243–257 (2003). https://doi.org/10.1093/bmb/ldg033
10. Auger, N., Duplaix, M., Bilodeau-Bertrand, M., Lo, E., Smargiassi, A.: Environmental noise
pollution and risk of preeclampsia. Environ. Pollut. 239, 599–606 (2018). https://doi.org/10.1016/j.envpol.2018.04.060
11. Foraster, M., Eze, I.C., Schaffner, E., Vienneau, D., Héritier, H., Endes, S., Rudzik, F.,
Thiesse, L., Pieren, R., Schindler, C., Schmidt-Trucksäss, A., Brink, M., Cajochen, C., Marc
Wunderli, J., Röösli, M., Probst-Hensch, N.: Exposure to road, railway, and aircraft noise
and arterial stiffness in the SAPALDIA study: annual average noise levels and temporal
noise characteristics. Environ. Health Perspect. 125, 097004 (2017). https://doi.org/10.1289/EHP1136
12. Gupta, A., Gupta, A., Jain, K., Gupta, S.: Noise pollution and impact on children health.
Indian J. Pediatr. 85, 300–306 (2018). https://doi.org/10.1007/s12098-017-2579-7
13. Zanella, A., Bui, N., Castellani, A., Vangelista, L., Zorzi, M.: Internet of Things for smart
cities. IEEE Internet Things J. 1, 22–32 (2014). https://doi.org/10.1109/JIOT.2014.2306328
14. Murphy, E., King, E.A.: An assessment of residential exposure to environmental noise at a
shipping port. Environ. Int. 63, 207–215 (2014). https://doi.org/10.1016/j.envint.2013.11.001
15. Murphy, E., King, E.A.: Environmental noise and health. In: Environmental Noise Pollution,
pp. 51–80. Elsevier (2014). https://doi.org/10.1016/B978-0-12-411595-8.00003-3
16. Stansfeld, S.: Noise effects on health in the context of air pollution exposure. Int. J. Environ.
Res. Public Health 12, 12735–12760 (2015). https://doi.org/10.3390/ijerph121012735
17. Morillas, J.M.B., Gozalo, G.R., González, D.M., Moraga, P.A., Vílchez-Gómez, R.: Noise
pollution and urban planning. Curr. Pollut. Rep. 4, 208–219 (2018). https://doi.org/10.1007/s40726-018-0095-7
18. Seguel, J.M., Merrill, R., Seguel, D., Campagna, A.C.: Indoor air quality. Am. J. Lifestyle
Med. 11(4), 284–295 (2016). https://doi.org/10.1177/1559827616653343
19. Tsai, W.-T.: Overview of green building material (GBM) policies and guidelines with
relevance to indoor air quality management in Taiwan. Environments 5, 4 (2017). https://doi.org/10.3390/environments5010004
20. Singleton, R., Salkoski, A.J., Bulkow, L., Fish, C., Dobson, J., Albertson, L., Skarada, J.,
Ritter, T., Kovesi, T., Hennessy, T.W.: Impact of home remediation and household
education on indoor air quality, respiratory visits and symptoms in Alaska native children.
Int. J. Circumpolar Health 77, 1422669 (2018). https://doi.org/10.1080/22423982.2017.1422669
21. Bruce, N., Pope, D., Rehfuess, E., Balakrishnan, K., Adair-Rohani, H., Dora, C.: WHO
indoor air quality guidelines on household fuel combustion: strategy implications of new
evidence on interventions and exposure–risk functions. Atmos. Environ. 106, 451–457
(2015). https://doi.org/10.1016/j.atmosenv.2014.08.064
22. Azmoon, H., Dehghan, H., Akbari, J., Souri, S.: The relationship between thermal comfort
and light intensity with sleep quality and eye tiredness in shift work nurses. J. Environ.
Public Health 2013, 1–5 (2013). https://doi.org/10.1155/2013/639184
23. Gropper, E.I.: Promoting health by promoting comfort. Nurs. Forum 27, 5–8 (1992).
https://doi.org/10.1111/j.1744-6198.1992.tb00905.x
24. Xue, P., Mak, C.M., Cheung, H.D.: The effects of daylighting and human behavior on
luminous comfort in residential buildings: a questionnaire survey. Build. Environ. 81, 51–59
(2014). https://doi.org/10.1016/j.buildenv.2014.06.011

25. Hwang, T., Kim, J.T.: Effects of indoor lighting on occupants’ visual comfort and eye health
in a green building. Indoor Built Environ. 20, 75–90 (2011). https://doi.org/10.1177/1420326X10392017
26. Marques, G., Roque Ferreira, C., Pitarma, R.: A system based on the Internet of Things for
real-time particle monitoring in buildings. Int. J. Environ. Res. Public Health 15, 821 (2018).
https://doi.org/10.3390/ijerph15040821
27. Feria, F., Salcedo Parra, O.J., Reyes Daza, B.S.: Design of an architecture for medical
applications in IoT. In: Luo, Y. (ed.) Cooperative Design, Visualization, and Engineering,
pp. 263–270. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46771-9_34
28. Marques, G., Pitarma, R.: A cost-effective air quality supervision solution for enhanced
living environments through the Internet of Things. Electronics 8, 170 (2019). https://doi.org/10.3390/electronics8020170
29. Marques, G., Ferreira, C.R., Pitarma, R.: Indoor air quality assessment using a CO2
monitoring system based on Internet of Things. J. Med. Syst. 43, 67 (2019). https://doi.org/10.1007/s10916-019-1184-x
30. Marques, G., Pitarma, R.: mHealth: indoor environmental quality measuring system for
enhanced health and well-being based on Internet of Things. JSAN 8, 43 (2019). https://doi.org/10.3390/jsan8030043
31. Marques, G., Pitarma, R.: Noise monitoring for enhanced living environments based on
Internet of Things. In: Rocha, Á., Adeli, H., Reis, L.P., Costanzo, S. (eds.) New Knowledge
in Information Systems and Technologies, pp. 45–54. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-16187-3_5
32. Marques, G., Pitarma, R.: Noise mapping through mobile crowdsourcing for enhanced living
environments. In: Rodrigues, J.M.F., Cardoso, P.J.S., Monteiro, J., Lam, R., Krzhizha-
novskaya, V.V., Lees, M.H., Dongarra, J.J., Sloot, P.M.A. (eds.) Computational Science –
ICCS 2019, pp. 670–679. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-22744-9_52
33. Marques, G., Pitarma, R.: Air quality through automated mobile sensing and wireless sensor
networks for enhanced living environments. In: 2019 14th Iberian Conference on
Information Systems and Technologies (CISTI), Coimbra, pp. 1–7. IEEE (2019). https://doi.org/10.23919/CISTI.2019.8760641
34. Shah, J., Mishra, B.: IoT enabled environmental monitoring system for smart cities. In: 2016
International Conference on Internet of Things and Applications (IOTA), Pune, pp. 383–
388. IEEE (2016). https://doi.org/10.1109/IOTA.2016.7562757
35. Salamone, F., Belussi, L., Danza, L., Galanos, T., Ghellere, M., Meroni, I.: Design and
development of a nearable wireless system to control indoor air quality and indoor lighting
quality. Sensors 17, 1021 (2017). https://doi.org/10.3390/s17051021
36. Bhattacharya, S., Sridevi, S., Pitchiah, R.: Indoor air quality monitoring using wireless
sensor network (2012). https://doi.org/10.1109/ICSensT.2012.6461713
37. Zheng, K., Zhao, S., Yang, Z., Xiong, X., Xiang, W.: Design and implementation of LPWA-
based air quality monitoring system. IEEE Access 4, 3238–3245 (2016). https://doi.org/10.1109/ACCESS.2016.2582153
38. Gao, Y., Dong, W., Guo, K., Liu, X., Chen, Y., Liu, X., Bu, J., Chen, C.: Mosaic: a low-cost
mobile sensing system for urban air quality monitoring. In: IEEE INFOCOM 2016 - The
35th Annual IEEE International Conference on Computer Communications, San Francisco,
pp. 1–9. IEEE (2016). https://doi.org/10.1109/INFOCOM.2016.7524478
MyContraception: An Evidence-Based Contraception mPHR
for Better Contraceptive Fit

Manal Kharbouch1, Ali Idri1,2(&), Taoufiq Rachad1, Hassan Alami3,
Leanne Redman4, and Youssef Stelate1

1 Software Project Management Research Team, Department of Web and Mobile
Engineering, ENSIAS, Mohamed V University in Rabat, Rabat, Morocco
[email protected]
2 CSEHS, University Mohammed VI Polytechnic, Ben Guerir, Morocco
3 Faculty of Medicine, University Mohammed V, Rabat, Morocco
4 Pennington Biomedical Research Center, Baton Rouge, LA 70808, USA

Abstract. The fulfillment of unmet needs for contraception can help women
reach their reproductive goals. It has been proven to have a significant impact on
reducing rates of unintended pregnancies, thereby cutting the morbidity and
mortality resulting from these pregnancies and improving the lives of women
and children in general. Therefore, there is a growing concern worldwide about
contraception and women’s knowledge of making an informed choice about it.
In this respect, a growing number of apps are now available providing clinical
resources, digital guides, or educational information concerning contraception,
whether natural or modern. However, many of these apps contain inaccurate
sexual health facts and non-evidence-based information concerning contraception. On
these bases, and in respect to the needs of women to effectively prevent unin-
tended pregnancies while conducting a stress-free healthy lifestyle, the World
Health Organization (WHO) Medical Eligibility Criteria (MEC) for contracep-
tion’s recommendations, and the results and recommendations of a field study
conducted in the reproductive health center Les Oranges in Rabat to collect the
app’s requirements, we developed an Android app named ‘MyContraception’.
Our solution is an evidence-based patient-centered contraceptive app that has
been developed in an attempt to facilitate: (1) Seeking evidence-based infor-
mation along with recommendations concerning the best contraceptive fit (ac-
cording to one’s medical characteristics, preferences and priorities) helping
users make informed decisions about their contraceptive choices. (2) Monitoring
one’s own menstrual cycle, fertility window, contraceptive methods usage, and
the correlation between these different elements and everyday symptoms in one
app. (3) Keeping record of one’s family medical history, medical appointments,
analyses, diagnoses, procedures and notes within the same app. In future work,
conducting an empirical evaluation of MyContraception solution is intended, to
exhaustively examine the effects of this solution in improving the quality of
patient-centered contraception care.

Keywords: Contraception · mPHR · MEC · WHO · Android

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 86–94, 2020.
https://doi.org/10.1007/978-3-030-45697-9_9
MyContraception: An Evidence-Based Contraception mPHR 87

1 Introduction

Although the majority of women seeking contraceptive measures are likely to be
young and healthy, with fewer medical challenges than women over 35 years old,
teenagers, or those with intercurrent diseases [1], health care providers often
prescribe contraceptives to women of reproductive age with core medical conditions as
well [2]. While contraceptive counseling can be a challenge overall, it can
get more complicated in the presence of concomitant diseases or risk factors [3]. In this
vein, women with comorbidities may not receive adequate counseling on contraceptive
methods [2].
The first contraception consultation is of such crucial importance that it requires a
minimum recommended time of 30 min [1]. Beyond sufficient time, offering a wide
range of contraceptive methods, evidence-based knowledge of the efficacy, risks, and
benefits of the different methods, as well as building a respectful and confidential
relationship between the doctor and the women, are key quality features of good
contraceptive counseling, allowing women to make informed decisions [3]. However,
many health care providers may find this protocol quite intimidating in practice.
Consequently, iatrogenic unintended pregnancies are a reality, since they result from
errors or omissions that can be avoided during the consultation, especially the omission
of sufficient time [1]. Moreover, obsolete clinical guidelines and lack of knowledge of
new evidence can limit both the quality of contraceptive counseling and the user’s
access to safe and effective contraception [3].
According to the World Health Organization (WHO), an estimate of 33 million
unintended pregnancies over the world are a result of contraceptive failure or incorrect
use [4]. At a worldwide level, unintended pregnancy was and still is one of the major
public health issues; it is considered the main sexual and reproductive health issue
associated with the highest risk of morbidity and mortality for women [5]. Women with
chronic conditions can have serious health consequences in the event of an unwanted
pregnancy. Since pregnancy can aggravate certain diseases or associate them with
harmful consequences endangering the life of the woman. In addition, drugs used to
treat many chronic diseases are potentially teratogenic [2], affect the development of
the embryo and fetus and when exposed to a pregnant woman may cause birth defects,
fetal loss or abnormal growth and development [6].
With the fast pace of medical advancement in the reproductive health sector,
especially contraception, quick, reliable, and accurate access to evidence-based infor-
mation is mandatory for health care providers to provide quality care to women based
on the most current available evidence [7]. In compliance with the expansion of
technology, the number of web and mobile applications (apps) available now to assist
clinicians in providing care for women is increasing. It is also becoming increasingly
common for women to use technology in the form of websites and apps to monitor and
track their cycles for fertility purposes and to inquire about contraception [8]. However,
only a few are reliable and exhaustive sources of information [9]. In this light, and taking
advantage of new technologies, we have developed an evidence-based Mobile Personal
Health Record (mPHR) to provide interactive, individually tailored information and
decision support for contraceptive use. The app is meant to prepare women for their
contraception consultations with health care providers and to perform as a clinician extender supporting the delivery of evidence-based contraception awareness and enhancing the overall quality of patient-centered contraception care.
The rest of the paper is organized as follows: Sect. 2 explains Fertility Awareness-Based (FAB) contraception and the WHO's Medical Eligibility Criteria (MEC) for contraceptive use. The MyContraception solution, its purpose, and its specifications are detailed in Sect. 3. Sect. 4 presents the development tools and the implementation of the solution. Finally, Sect. 5 highlights this work's conclusions and future perspectives.

2 Theoretical Contraception Aspects

2.1 Fertility Awareness Method


The Fertility Awareness Method (FAM) is a form of natural birth control used to prevent unwanted pregnancy. FAM-based apps are designed around a statistical algorithm that provides a 'safe' or 'unsafe' result to the user regarding the risk of pregnancy [10]. The algorithm takes into account the day of ovulation, the luteal phase, the follicular phase, the duration of the cycle, and the average temperature across the different phases, and sets the safe/unsafe periods accordingly. Some apps also support adding luteinizing hormone (LH) test results for better accuracy [11]. However, fertility awareness methods are commonly misperceived as traditional methods and thus are often left out of family planning programming [12].
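The safe/unsafe classification described above can be sketched as follows. This is an illustrative simplification, not the algorithm of any particular app: the `classify_day` helper is hypothetical, ovulation is estimated from the average cycle length, and the fertile window uses commonly cited approximate values for sperm and egg survival.

```python
from datetime import date

def classify_day(day: date, last_period_start: date, avg_cycle_len: int) -> str:
    """Classify a day as 'safe' or 'unsafe' regarding the risk of pregnancy.

    Simplified FAM-style heuristic (assumption, not WHO guidance):
    ovulation is assumed to occur ~14 days before the next period,
    i.e. on cycle day (avg_cycle_len - 14); the fertile window spans
    the 5 days before ovulation through 1 day after it.
    """
    cycle_day = (day - last_period_start).days % avg_cycle_len
    ovulation_day = avg_cycle_len - 14  # luteal phase assumed to last 14 days
    fertile_start, fertile_end = ovulation_day - 5, ovulation_day + 1
    return "unsafe" if fertile_start <= cycle_day <= fertile_end else "safe"

# Example: a 28-day cycle starting on 1 March -> ovulation around day 14.
start = date(2024, 3, 1)
print(classify_day(date(2024, 3, 15), start, 28))  # cycle day 14: unsafe
print(classify_day(date(2024, 3, 3), start, 28))   # cycle day 2: safe
```

Real FAM apps refine this window with basal body temperature shifts and LH test results rather than relying on calendar arithmetic alone.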

2.2 Modern Contraceptive Methods


Modern contraceptive methods are techniques and technologies designed to overcome biology and allow complete sexual freedom while reducing the risk of pregnancy [13]. According to this definition, various products and medical approaches qualify as modern contraceptives: short-acting contraceptives such as pills, injectables and condoms; Long-Acting Reversible Contraceptives (LARC) such as implants and Intrauterine Devices and systems (IUDs); and permanent contraceptive methods, so-called sterilization, to name a few. However, certain beliefs and the fear of side effects of modern contraceptive methods push women to resort to less effective traditional methods. On the one hand, women with chronic conditions may not be able to safely use traditional contraceptive methods, as the risks associated with pregnancy may be too high. On the other hand, health care providers are often less comfortable prescribing contraception to patients with concomitant conditions, even though contraception is often safer than pregnancy for these women [14]. In this regard, the Centers for Disease Control and Prevention (CDC) has developed a MEC for contraception based on the WHO guidelines, providing evidence-based recommendations for safe and effective contraceptive methods for women with different medical characteristics and conditions [15].
2.3 Medical Eligibility Criteria for Contraception


Since 1996, in collaboration with the US Centers for Disease Control and Prevention (CDC), the WHO has been publishing an evidence-based manual referred to as the "Medical Eligibility Criteria for contraceptive use (MEC)" [16]. This manual comprises a set of medical criteria for the selection of effective contraceptive methods. It provides a four-level risk classification of various contraceptive methods, not only in certain physiological situations such as postpartum and breastfeeding but also in the presence of concomitant diseases and risk factors. Apps that implement the WHO's MEC for contraception are decision aids that add up the scores for each contraceptive method to suggest best-fit choices. As there are no perfect choices when it comes to contraceptive use, these decision aids weigh up different factors concerning the user's profile and medical history in order to propose evidence-based suggestions concerning modern contraceptives.
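A minimal sketch of such a decision aid follows, under loudly stated assumptions: the table below is a hypothetical placeholder, not the actual WHO MEC data; each (condition, method) pair maps to a category from 1 (no restriction) to 4 (unacceptable risk); a method's overall category is its worst across the user's conditions; and only methods ending at category 1 or 2 are suggested.

```python
# Hypothetical excerpt of a MEC-style table: {condition: {method: category}}.
# Category values here are illustrative placeholders, NOT WHO recommendations.
MEC = {
    "smoker_over_35":     {"combined_pill": 4, "implant": 2, "copper_iud": 1},
    "migraine_with_aura": {"combined_pill": 4, "implant": 2, "copper_iud": 1},
    "none":               {"combined_pill": 1, "implant": 1, "copper_iud": 1},
}

def eligible_methods(conditions: list[str]) -> list[str]:
    """Return the methods whose worst MEC category across all of the
    user's conditions is 1 or 2 (i.e. considered usable)."""
    methods = set().union(*(MEC[c] for c in conditions))
    worst = {m: max(MEC[c][m] for c in conditions) for m in methods}
    return sorted(m for m, cat in worst.items() if cat <= 2)

print(eligible_methods(["smoker_over_35"]))  # ['copper_iud', 'implant']
print(eligible_methods(["none"]))            # all three methods are category 1
```

Taking the worst category per method, rather than summing scores, mirrors the conservative spirit of the MEC classification: one category-4 condition is enough to rule a method out.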

3 MyContraception Solution

3.1 Purpose
The use of contraception has become so commonplace in modern society that nearly all women use contraception at some point in their lifetime [17]. Thus, when seeking contraception, women need justified, individualized contraceptive counseling in which the advantages and drawbacks of every contraceptive method are weighed and discussed individually [3]. Moreover, in order to achieve an optimal contraceptive effect and a better adherence rate, women should be involved in a shared decision-making process [3].
In this respect, the main purpose of the MyContraception solution is to give women the control and ability to make an informed choice about contraception and to organize and track many other aspects of their contraceptive use, all in a convenient, easy and discreet way. These characteristics were recognized as valued by women when it comes to decisions about their bodies, according to previous research in the field of health apps [18].

3.2 Requirements Specification


During the app development process, developers focused on creating a patient-centered contraceptive application in which the entire content of the application is adapted to the chosen contraceptive method following the Selected Practice Recommendations (SPR), in compliance with the following functional requirements:
• Download mobile application: A user should be able to download the mobile
application through an application repository. The app should be free to download.
• Update mobile application: The user should be able to download a new/updated
version or release of the app.
• User registration: Given that the user has downloaded the app, then she should be
able to register by providing login credentials (email, password).
• Login: Given that a user has registered, the user should be able to log in to the mobile application. The login information will be stored on the phone and, in the future, the user should be logged in automatically.
• Retrieve password: A user should be able to retrieve her password by email.
• Consult ‘About Contraception’: The user should be able to consult the ‘About
Contraception’ section to learn more about contraceptive methods, eligibility cri-
teria, efficiency, risks and more.
• Enter Menstrual Cycle Information: The user should be able to enter the date of
her last menstrual cycle, its length, and duration of period among other information.
• Monitor Menstrual Period: The user should be able to track and predict her period and ovulation, and to know her chances of falling pregnant on a specific day.
• Take ‘Eligibility Test’: The user should take an Eligibility Test based on WHO’s
MEC for contraception to obtain a list of her best-suited contraceptive methods.
• ‘Eligibility Test’ Result: Once the eligibility status is identified, the user should
obtain information about her recommended contraceptive methods.
• Choose a Contraceptive Method: The user should be able to choose one of her
recommended contraceptive methods upon which the app will be adapted.
• View contraception history: The user should be able to visualize the dated list of
her past contraceptive methods.
• Receive reminders: The user should be reminded of her ovulation period, to take
her pill, schedule a medical checkup… based on her current contraceptive method.
• Receive notifications: The user should be notified when it is her predicted first/last
day of the period, when her menstrual cycle is abnormal and when she needs to
enter some information (symptoms, mood, weight, temperature…).
• Change reminders settings: The user should be able to choose how and when she
would like to receive reminders based on her current contraceptive method.
• Change notification settings: The user should be able to choose how and when she
would like to receive notifications.
• Archive Medical Notice: The user should be able to scan or upload pictures of her
medical notice from her gallery to her medical notice archive on the app and add
notes on them.
• Archive Medical analysis: The user should be able to scan or upload pictures of
her medical analysis from her gallery to her medical analysis archive on the app and
add notes on them.
• Consult Medical Notice Archive: The user should be able to consult her medical
notice archive on the app.
• Consult Medical analysis Archive: The user should be able to consult her medical
analysis archive on the app.
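The period-tracking requirements above ("Enter Menstrual Cycle Information", "Monitor Menstrual Period", "Receive notifications") rest on a simple prediction step, which can be sketched as follows. This is an illustrative sketch rather than the app's actual code: it assumes the app keeps a list of past cycle start dates and predicts the next period from the average observed cycle length, with a 28-day fallback when history is too short.

```python
from datetime import date, timedelta

def predict_next_period(cycle_starts: list[date]) -> date:
    """Predict the next period start date from logged cycle start dates,
    using the average length of the observed cycles."""
    if len(cycle_starts) < 2:
        # Not enough history: fall back to a commonly assumed 28-day cycle.
        return cycle_starts[-1] + timedelta(days=28)
    gaps = [(b - a).days for a, b in zip(cycle_starts, cycle_starts[1:])]
    avg_len = round(sum(gaps) / len(gaps))
    return cycle_starts[-1] + timedelta(days=avg_len)

history = [date(2024, 1, 1), date(2024, 1, 30), date(2024, 2, 28)]
print(predict_next_period(history))  # two 29-day gaps -> 2024-03-28
```

A reminder or notification (first/last day of period, abnormal cycle) would then simply be scheduled relative to the predicted date.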
Previous studies have applied the ISO/IEC 25010 standard [19] to health-related software products. Ouhbi et al. applied this standard to Mobile Personal Health Records (mPHRs) [20], while Idri et al. conducted an evaluation of free mobile personal health records for pregnancy monitoring based on the aforementioned standard [21], as well as a quality evaluation of gamified blood donation apps using the same standard [22]. Likewise, a set of non-functional requirements was deemed to improve the software
quality of the MyContraception solution. These requirements are quality characteristics of the ISO/IEC 25010 quality model and are described as follows:
• Functional suitability: The MyContraception solution should meet users’ needs (stated and/or implied) through well-integrated functions and suitable content with the needed degree of precision.
• Performance efficiency: MyContraception solution should have a short response
time to enhance User Experience (UX).
• Usability: MyContraception solution should help users to achieve specified goals
with effectiveness, efficiency, and satisfaction.
• Reliability: The MyContraception solution should remain operational and accessible under all expected circumstances (with/without an internet connection).
• Security: The MyContraception solution should ensure encrypted communication and the protection and security of users’ accounts and sensitive information.
• Maintainability: The MyContraception solution should have readable code composed of discrete components, so that new functions can be implemented easily without introducing defects or degrading existing product quality.

3.3 Integration of Theoretical Contraception Aspects


Although a recent development, the app is based on medical protocols and WHO guidelines. In this regard, the first step lay in collecting reference-based information about fertility and contraception. The WHO Medical Eligibility Criteria for Contraceptive Use [16] and the WHO Selected Practice Recommendations for Contraceptive Use [23] were the main sources for the scientific basis of this app. Second, the results and recommendations of a field study conducted in the reproductive health center Les Oranges in Rabat to collect the app's requirements, along with the results and recommendations of an ongoing study reviewing the features and functionalities of contraception mPHRs, were elaborated into a Software Requirements Specification (SRS). The following step was the development of the interfaces, through which users can log their menstrual information and visualize predictions of both their coming periods and ovulation windows, take an eligibility test and visualize the computed results of the medical criteria, and organize and track many other aspects of their contraceptive use, all in a convenient, easy and discreet way. Subsequently, the algorithms that track the menstrual period and fertility windows and automatically compute the WHO Selected Practice Recommendations for each contraceptive option under all selected medical conditions were implemented. Finally, the app was debugged and tested by the authors.

4 Implementation

Our contraception software solution is developed using native Android, while data is stored in the Firebase cloud service to enable data backup and log sharing and to secure access to the application for privacy reasons. In the current phase of development, the application is dedicated exclusively to patients and is not linked to any kind of clinician-centered application. A few user interfaces are shown in the Appendix, accessible at the following link: https://www.um.es/giisw/manal/Appendix.pdf.
Once successfully authenticated with her email or an existing login system, as depicted in Fig. 1 and Fig. 2 of the Appendix, the user logs her menstrual cycle information as shown in Fig. 3 and Fig. 4 of the Appendix.
Then the user is redirected to the home page illustrated in Fig. 5 of the Appendix. From
there, the user can: (1) Monitor her period and fertility windows as in Fig. 6 of the
Appendix. (2) Log her specific symptoms, mood, measurements, analysis/notices
records, journaling and questions for her next obstetric appointment. See Fig. 7 and
Fig. 8 of the Appendix. (3) Take an eligibility quiz to obtain her best-fitted contra-
ceptive method based on her age, health condition, and medical history to cite few as
referred to in Fig. 11, Fig. 12 and Fig. 13 of the Appendix. Once the user picks her
current contraceptive method or chooses one from suggested methods according to her
eligibility test results as in Fig. 14 of the Appendix, the whole application is person-
alized to meet her selected contraceptive method. Moreover, the user can consult her
contraceptive history, medical archive, menstruation history, and past obstetric
appointments as can be seen in Fig. 9 of the Appendix, consult awareness section about
her contraceptive method as described in Fig. 10 of the Appendix, set a reminder for
future obstetric appointments as detailed in Fig. 15 of the Appendix, and change the
settings of the app. In the settings, as Fig. 16 of the Appendix shows, the user is
allowed to customize the content of the app to her liking, choose the language of the
app to have a fair understanding of its content, and manage how and when she would
like to receive reminders/notifications. The user can log out from the app at any time
and navigate smoothly between the different activities thanks to a material design-based
menu.

5 Conclusion and Future Perspectives

Eager to offer women patient-centered, comprehensive contraceptive counseling and to help them make informed decisions concerning contraception, we designed an Android solution called 'MyContraception' on the basis of the WHO's contraception guidelines and on the needs raised in the results of a field study we conducted in the reproductive health center Les Oranges in Rabat to collect the app's requirements. The app serves as a clinician extender that compensates for the lack of time, comfort and skills that health care providers may face, as well as for the overwhelming pace of change in the field of contraception [24], to help overcome barriers to strengthening sexual and reproductive health services [25–34].
In future work, we intend to conduct an empirical evaluation with real participants to assess the effectiveness of the app in matching women with the contraceptive method that best fits them, the active engagement of women in monitoring their contraceptive use, and the adoption and integration of this mPHR technology into clinical practice.
Acknowledgments. This work was conducted within the research project PEER, 7-246 sup-
ported by the US Agency for International Development. The authors would like to thank the
National Academy of Science, Engineering, and Medicine, and USAID for their support.

References
1. Guillebaud, J.: Contraception Today, 9th edn. CRC Press, Boca Raton (2019)
2. Bonnema, R.A., McNamara, M.C., Spencer, A.L.: Contraception choices in women with
underlying medical conditions. Am. Fam. Phys. 82, 621–628 (2010)
3. Moffat, R., Sartorius, G., Raggi, A., et al.: Consultation de contraception basée sur
l’évidence. Forum Médical Suisse – Swiss Med. Forum (2019). https://doi.org/10.4414/fms.
2019.08065
4. World Health Organization: Unsafe abortion: global and regional estimates of the incidence of unsafe abortion and associated mortality in 2008. WHO (2014). https://doi.org/10.1017/CBO9781107415324.004
5. Kassahun, E.A., Zeleke, L.B., Dessie, A.A., et al.: Factors associated with unintended
pregnancy among women attending antenatal care in Maichew Town, Northern Ethiopia,
2017. BMC Res. Notes 12, 1–6 (2019). https://doi.org/10.1186/s13104-019-4419-5
6. Gweneth, L.: Pharmacovigilance in pregnancy. In: Doan, T., Renz, C., Bhattacharya, M.,
Lievano, F., Scarazzini, L. (eds.) Pharmacovigilance: A Practical Approach, 1st edn, p. 228.
Elsevier, Amsterdam (2019)
7. Arbour, M.W., Stec, M.A.: Mobile applications for women’s health and midwifery care: a
pocket reference for the 21st century. J. Midwifery Women’s Health (2018). https://doi.org/
10.1111/jmwh.12755
8. Mendes, A.: What’s new in the world of prescribing contraception? Nurse Prescr. 16, 410–
411 (2018). https://doi.org/10.12968/npre.2018.16.9.410
9. Rousseau, F., Da Silva Godineau, S.M., De Casabianca, C., et al.: State of knowledge on
smartphone applications concerning contraception: a systematic review. J. Gynecol. Obstet.
Hum. Reprod. 48, 83–89 (2019). https://doi.org/10.1016/j.jogoh.2018.11.001
10. Berglund Scherwitzl, E., Lundberg, O., Kopp Kallner, H., et al.: Perfect-use and typical-use
pearl index of a contraceptive mobile app. Contraception 96, 420–425 (2017). https://doi.
org/10.1016/j.contraception.2017.08.014
11. Berglund Scherwitzl, E., Gemzell Danielsson, K., Sellberg, J.A., Scherwitzl, R.: Fertility
awareness-based mobile application for contraception. Eur. J. Contracept. Reprod. Health
Care 21, 234–241 (2016). https://doi.org/10.3109/13625187.2016.1154143
12. Malarcher, S., Spieler, J., Fabic, M.S., et al.: Fertility awareness methods: distinctive modern
contraceptives. Glob. Health Sci. Pract. 4, 13–15 (2016)
13. Hubacher, D., Trussell, J.: A definition of modern contraceptive methods. Contraception 92,
420–421 (2015)
14. Chor, J., Rankin, K., Harwood, B., Handler, A.: Unintended pregnancy and postpartum
contraceptive use in women with and without chronic medical disease who experienced a live
birth. Contraception 84, 57–63 (2011). https://doi.org/10.1016/j.contraception.2010.11.018
15. Curtis, K.M., Tepper, N.K., Jatlaoui, T.C., et al.: U.S. medical eligibility criteria for
contraceptive use, 2016. MMWR Recomm. Rep. 65, 1–104 (2016). https://doi.org/10.1089/
jwh.2011.2851
16. WHO: Medical Eligibility Criteria for Contraceptive Use, 5th edn. WHO, Geneva (2015)
17. Daniels, K., Daugherty, J., Jones, J.: Current contraceptive status among women aged 15–
44: United States, 2011–2013. NCHS Data Brief 173, 1–8 (2014)
18. Newman, L.: Apps for health: what does the future hold? Br. J. Midwifery 26, 561 (2018).
https://doi.org/10.12968/bjom.2018.26.9.561
19. International Organization for Standardization: ISO/IEC 25010:2011 - Systems and software engineering - Systems and software Quality Requirements and Evaluation (SQuaRE) - System and software quality models (2011)
20. Ouhbi, S., Idri, A., Fern, L.: Applying ISO/IEC 25010 on mobile personal health records. In:
8th International Conference on Health Informatics, pp. 405–412 (2015)
21. Idri, A., Bachiri, M., Fernández-alemán, J.L., Toval, A.: ISO/IEC 25010 based evaluation of
free mobile personal health records for pregnancy monitoring. In: IEEE 41st Annual
Computing Software Application Conference, pp. 262–267 (2017)
22. Idri, A., Sardi, L., Fernández-alemán, J.: Quality evaluation of gamified blood donation apps
using ISO/IEC 25010 standard. In: 12th International Conference Health Informatics,
pp. 607–614 (2018)
23. WHO: Selected Practice Recommendations for Contraceptive Use, 3rd edn. WHO, Geneva
(2016)
24. Arbour, M.W., Stec, M.A.: Mobile applications for women’s health and midwifery care: a
pocket reference for the 21st century. J. Midwifery Women’s Health 63, 330–334 (2018).
https://doi.org/10.1111/jmwh.12755
25. Tebb, K.P., Trieu, S.L., Rico, R., et al.: A mobile health contraception decision support
intervention for Latina adolescents: Implementation evaluation for use in school-based
health centers. J. Med. Internet Res. 21 (2019). https://doi.org/10.2196/11163
26. Sardi, L., Idri, A., Readman, L.M., et al.: Mobile health applications for postnatal care:
review and analysis of functionalities and technical features. Comput. Methods Programs
Biomed. 184, 1–26 (2020). https://doi.org/10.1016/j.cmpb.2019.105114
27. Bachiri, M., Idri, A., Redman, L.M., et al.: A requirements catalog of mobile personal health
records for prenatal care. In: Lecture Notes in Computer Science (Including Subseries
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp. 483–495.
Springer (2019)
28. Bachiri, M., Idri, A., Abran, A., et al.: Sizing prenatal mPHRs using COSMIC measurement
method. J. Med. Syst. 43, 1–11 (2019). https://doi.org/10.1007/s10916-019-1446-7
29. Bachiri, M., Idri, A., Redman, L., et al.: COSMIC functional size measurement of mobile
personal health records for pregnancy monitoring. In: Advances in Intelligent Systems and
Computing, pp. 24–33. Springer (2019)
30. Bachiri, M., Idri, A., Fernández-Alemán, J.L., Toval, A.: Evaluating the privacy policies of
mobile personal health records for pregnancy monitoring. J. Med. Syst. 42, 1–14 (2018).
https://doi.org/10.1007/s10916-018-1002-x
31. Idri, A., Bachiri, M., Fernández-Alemán, J.L., Toval, A.: Experiment design of free
pregnancy monitoring mobile personal health records quality evaluation. In: 2016 IEEE 18th
International Conference on e-Health Networking, Applications and Services, Healthcom
2016. Institute of Electrical and Electronics Engineers Inc., pp. 1–6 (2016)
32. Bachiri, M., Idri, A., Fernández-Alemán, J.L., Toval, A.: Mobile personal health records for
pregnancy monitoring functionalities: analysis and potential. Comput. Methods Programs
Biomed. 134, 121–135 (2016)
33. Bachiri, M., Idri, A., Fernandez-Aleman, J.L., Toval, A.: A preliminary study on the
evaluation of software product quality of pregnancy monitoring mPHRs. In: Proceedings of
2015 IEEE World Conference on Complex Systems, WCCS 2015. Institute of Electrical and
Electronics Engineers Inc., pp. 1–6 (2016)
34. Idri, A., Bachiri, M., Fernández-Alemán, J.L.: A framework for evaluating the software
product quality of pregnancy monitoring mobile personal health records. J. Med. Syst. 40, 1–
17 (2016). https://doi.org/10.1007/s10916-015-0415-z
Predictors of Acceptance and Rejection
of Online Peer Support Groups as a Digital
Wellbeing Tool

John McAlaney1(&), Manal Aldhayan1, Mohamed Basel Almourad2,
Sainabou Cham1, and Raian Ali3
1 Faculty of Science and Technology, Bournemouth University, Bournemouth, UK
{jmcalaney,maldhayan,scham}@bournemouth.ac.uk
2 College of Technological Innovation, Zayed University, Dubai, UAE
[email protected]
3 College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
[email protected]

Abstract. Digital media usage can be problematic, exhibiting symptoms of behavioural addiction such as mood modification, tolerance, conflict, salience, withdrawal symptoms and relapse. Google Digital Wellbeing and Apple Screen Time are examples of an emerging family of tools to help people have a healthier and more conscious relationship with technology. Peer support groups are a known technique for behaviour change and relapse prevention; they can be facilitated online, especially with advanced social networking techniques. Elements of peer support groups are already embedded in digital wellbeing tools, e.g. peer comparisons, peer commitments, collective usage limit-setting and family time. However, there is a lack of research on the factors influencing people's acceptance and rejection of online peer support groups as a means to enhance digital wellbeing. Previous work has qualitatively explored the factors behind accepting and rejecting joining and participating in such groups. In this paper, we quantitatively study the effect of culture, personality, self-control, gender, willingness to join the groups and perception of their usefulness on such acceptance and rejection factors. The qualitative phase included two focus groups and 16 interviews, while the quantitative phase consisted of a survey (215 participants). We found a greater number of significant models predicting rejection factors than acceptance factors, although in all cases the amount of variance explained by the models was relatively small. This demonstrates the need to design, and also introduce, such techniques in a contextualised and personalised style to avoid rejection and reactance.

Keywords: Online peer groups · Digital addiction · Digital wellbeing · Behavioural change

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 95–107, 2020.
https://doi.org/10.1007/978-3-030-45697-9_10
1 Introduction

Digital media, including social networks, gaming and online shopping, have various benefits and represent an integral part of modern society. Such media empower social connectedness and the free exchange of information, introducing a new lifestyle and concepts such as digital humanity and digital citizenship. However, some compulsive and obsessive usage styles and over-reliance on digital media can lead to negative consequences such as reduced involvement in real-life communities and a lack of sleep [1]. Some usage styles can be seen as addictive, meeting common criteria of behavioural addiction such as salience, conflict, mood modification, and relapse [2, 3].
There is a limited number of preventative, control and recovery mechanisms available for Digital Addiction (DA). Although the problematic relationship with technology has been recognised in a wide range of literature, DA is still not classified as a mental disorder in the latest 5th edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5). Recently, in 2018, the World Health Organization recognised Gaming Disorder, which represents a significant step in the search for preventative and recovery mechanisms. Most of the existing research on DA focuses on the reasons for people becoming overly reliant on social media and on the relationship between DA and factors such as personality traits [4]. Few works have placed software design at the centre of the DA problem, both in facilitating and in combatting DA, e.g. digital addiction labels and requirements engineering for digital wellbeing requirements [6, 7].
With the advances in sensing and communication technology and internet connectivity, there has been a proliferation of software and smartphone applications to assist with behavioural change. It is still questionable whether these solutions are effective and whether we understand the acceptance and rejection factors from the users' perspective. The perception of the role and trustworthiness of such proposed solutions has changed following some failures and the recognition of associated risks [8].
Linking the intention to change behaviour with the act of doing so is the main purpose of behaviour change theories [5]. Peer support groups are one approach to behaviour change, and can be utilised to combat addictive behaviours by providing support and helping in relapse prevention [9, 11]. Peer support groups consist of people who share similar interests and support and influence each other's behaviour towards achieving common goals [10]. Alrobai et al. [13] focused on the processes involved in running such a group, e.g. the roles involved and the steps to be taken to prevent relapse. Aldhayan et al. [18] explored the factors behind the acceptance and rejection of online peer support groups by people with DA. This exploration was meant to inform the strategies used to introduce such online peer group software, as well as the configuration and governance processes of its online platform.
Hsiao, Shu and Huang [17] explored the relationships between personality traits and
compulsive usage of social media apps, and showed that extraversion, agreeableness,
and neuroticism have significant effects on such compulsive usage. Being an online
social technique for behaviour change itself, acceptance and rejection of peer support
groups could in turn be subject to such personal and environmental factors. In this
paper, we study the effect of personality traits, self-control, gender, and perception of
usefulness, willingness to join and culture (comparing UK to Middle Eastern users) on
the acceptance and rejection factors of online peer support groups. To achieve this
target, we designed a survey around the acceptance and rejection factors reported in
[18] and derived from two focus groups and 16 interviews. The survey also consisted
of various demographics questions and measures for personality [20] and self-control
[19]. We collected 215 completed responses. We report on the statistical analysis
results and discuss their implications on the design of future online peer support groups
to combat DA.
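The kind of analysis reported later, testing whether a trait such as self-control predicts agreement with an acceptance or rejection factor, can be illustrated with a minimal single-predictor regression. The data below are synthetic and the variable names hypothetical; the paper's actual models involve multiple predictors and report the (relatively small) variance explained.

```python
# Synthetic illustration: does a trait score predict agreement (1-5 Likert)
# with a rejection factor? Data and variable names are made up.
self_control = [10, 12, 15, 18, 20, 22, 25, 28, 30, 33]
rejection_agreement = [4.5, 4.2, 4.0, 3.8, 3.5, 3.4, 3.0, 2.8, 2.6, 2.2]

def ols(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

def r_squared(x, y):
    """Proportion of variance in y explained by the fitted linear model."""
    slope, intercept = ols(x, y)
    my = sum(y) / len(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

slope, _ = ols(self_control, rejection_agreement)
print(f"slope={slope:.3f}, R^2={r_squared(self_control, rejection_agreement):.3f}")
```

A negative slope here would suggest that higher self-control goes with weaker agreement with the rejection factor; in the study itself, R-squared values were small, hence the call for contextualised, personalised designs.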

2 Research Method

We adopted a mixed-methods approach which consisted of an initial qualitative phase


followed by a quantitative one. The participants in both phases self-declared as
experiencing problematic digital behaviour and wellbeing issues.

2.1 Qualitative Phase: Exploring Acceptance, Rejection and Governance


We conducted a focus group study of two sessions. The first session aimed at getting
insights around how online peer groups are perceived by people self-declaring to have
DA and what they wished to see in it. The second focus group served the purpose of
identifying the design features of an online peer group platform. For this reason, mock
interfaces were made available to the second session participants based on the results of
the first focus group. The participants were asked for their opinions on the mock
designs and to amend them if needed. The two focus group sessions were conducted
with the same six university students; three males and three females, aged between 20
and 26. The participants were a social group in real life, and this was beneficial as it
removed concerns regarding trust and privacy during the discussion process. We
performed a thematic analysis [12] on the data collected through the sessions. This
analysis revealed main factors concerning the acceptance and rejection of this approach
as well as its governance styles and process.
The objective of the interview stage was to explore in-depth the acceptance and
rejection factors and the variability space of designing online peer groups platforms so
that we can accommodate different users’ preferences and governance styles. The
interview questions were based on the acceptance and rejection factors explored in the
focus groups as well as five themes related to governance, including group moderation,
feedback and monitoring, membership and exit protocol. We conducted 16 interviews
with students who self-declared to have a wellbeing issue around their digital
98 J. McAlaney et al.

behaviour, e.g., obsessive or compulsive use. The sample consisted of 8 males and 8
females, aged between 18 and 35. Each interview lasted between 30 and 40 min. The
interviews were transcribed and analysed via thematic analysis [12].

2.2 Quantitative Phase: Confirmation, Personal and Environmental Influences

This phase was based on a survey that reflected the interview themes, i.e. the accep-
tance and rejection themes as well as governance themes such as moderator role,
feedback, membership and exit procedure. The survey was disseminated both online
and in person. A £5 incentive was offered to respondents given the lengthy nature of
the survey. We collected 215 completed responses; 105 participants (49%) identified as
male and 109 participants (50%) identified as female, with the remaining 1% preferring
not to answer the gender question. The participants were 17 to 55 years old. The
survey started with a validation question of whether a participant has wellbeing issues
as a precondition to take part.
To study the effect of personal and environmental factors on the acceptance and
rejection factors, the survey included questions around six factors which were gender
(male/female); country/culture (UK/Middle East); perceived usefulness of peer support
groups; willingness to join a peer support group; five personality traits [20] (ex-
traversion, agreeableness, conscientiousness, neuroticism and openness); and self-
control [19]. We disseminated the survey mainly in the UK, the Kingdom of Saudi
Arabia and Syria. We collected 104 completed surveys from KSA and Syria, and 85
from the UK. This allowed us to study statistically whether there was a difference
between Middle Eastern culture (KSA and Syria) and Western Culture (UK).

3 Acceptance and Rejection Factors

The factors which affect users’ acceptance and rejection of online peer support groups
to combat DA are presented in Tables 1 and 2, respectively. The elaborated
descriptions of themes A1 to A4 and R1 to R4 can be found in [18]. Further analysis
of the data revealed an additional theme, A5.

Table 1. Online peer support groups to combat digital addiction: acceptance factors
Acceptance theme Sub-themes
[A1] Accepting online peer groups as [A1.1] Provide awards: gamification of performance
an entertainment auxiliary [A1.2] Peer comparison: to see how I and others do
[A1.3] Goal achievement: rewards, information and
graphs of my progress towards the goal
[A2] Accepting online peer groups as [A2.1] Self-Monitoring: show actual usage and
a DA awareness tool performance
[A2.2] Peer comparison: benchmarking through others
[A2.3] Goal achievement: awareness of how I am
achieving goals
[A3] Accepting online peer support [A3.1] Peer learning: learning from others how to
groups as an educational tool improve
[A3.2] Moderator role: learning from moderator,
learning from acting as moderator
[A3.3] Set up goals: learning how to set up SMART
goals
[A4] Accepting online peer support [A4.1] Peer feedback: alert/feedback through peer
groups as a prevention tool feedback
[A4.2] Moderator feedback: alert/feedback by a
moderator
[A4.3] Authority: steps and restrictions set by a
moderator
[A5] Accepting online peer support [A5.1] Provide advice: by experienced moderator;
groups as a support tool alternative lifestyles
[A5.2] Emotional support: when struggling to avoid
relapse
[A5.3] Feedback: when performing well and under-
performing, sending warnings

Table 2. Online peer support groups to combat digital addiction: rejection factors
Rejection theme Sub-themes
[R1] Rejecting online peer support groups [R1.1] Negative feedback: dismissive feedback
when seen as intimidation tool when failing
[R1.2] Harsh penalty, e.g. banning and locking
out
[R2] Rejecting online peer support groups [R2.1] Being overly judged by a moderator
when seen as overly judgmental [R2.2] Being judged by peers, known and
unknown in person
[R3] Rejecting online peer support groups [R3.1] Weak management
when hosting unmanaged interactions [R3.2] Large group size
[R4] Rejecting online peer groups due to [R4.1] Relatedness: group including relatives
unclear membership protocol and friends
[R4.2] Exit control: free and uncontrolled exit as
well as conditions on exiting the group without
considering others

4 Personal and Cultural Effects on Acceptance and Rejection

The survey questions around acceptance and rejection can be found in Appendix A.
A Likert scale indicating level of agreement was used for each of the statements under
each theme. A series of linear multiple regressions using the enter method were
conducted. In each model the predictors were gender (male/female); region
(UK/Middle East); perceived usefulness of peer support groups; willingness to join a
peer support group; the five personality trait scores of extraversion, agreeableness,
conscientiousness, neuroticism and openness; and self-control score. For each model,
the outcome measure was the individual questions used to measure attitudes relating to
the acceptance and rejection factors of online peer groups, as identified within the
description of each model result in the section below. Multicollinearity diagnostics
were conducted prior to the analysis to determine the suitability of conducting multiple
regressions.

4.1 Effects on Acceptance Factors


[A1] Accepting online peer groups as an entertainment auxiliary. Three models under
this category were non-significant, which were [A1.1a] Awards when achieving
behavioural targets, e.g. points, badges, etc.; [A1.1b] Awards when making progress
towards the behavioural target; [A1.3] Information and graphs how I am progressing
to keep me engaged. The model for [A1.2] Peer comparisons, i.e. to see how I and
others are performing was significant, predicting 12% of the variance (R2 = .12,
F(10,159) = 2.16, p <.05). Of these, extraversion was the only significant predictor
(b = .12), with an increase in extraversion being associated with an increase in
agreement towards this statement.
[A2] Accepting online peer support groups as an awareness tool. The first model under
this category was significant, which was [A2.1] Self-Monitoring, e.g. showing your
hourly, daily and weekly performance and progress indicator (R2 = .11, F(10,159) =
1.90, p <.05), accounting for 11% of the variance. Within this model the two significant
predictors were extraversion (b = 0.09) and neuroticism (b = .07). In each as the level of
the personality trait increased there was an increase in agreement towards this statement.
The other two models under this category were not significant. These were [A2.2] Peer
comparisons, e.g. comparing you to other members in the group who have similar profile
and level of problem; [A2.3] Awareness on goal setting, e.g. how to set and achieve goals,
and how to avoid deviation from the plan you set to achieve them.
[A3] Accepting online peer support groups as an educational platform. None of the
models under this category was significant. These were [A3.1] Environment to learn
from peers, e.g., by sharing real-life stories and successful strategies around the
wellbeing issue; [A3.2a] Environment to learn from experienced moderators, e.g. best
practice around the wellbeing issue; [A3.2b] Environment where I can learn through
acting as a mentor, i.e. when advising other members and when having to moderate the
group; [A3.3] Environment to learn how to set up achievable and effective goals and
their plans. This suggests that, as an education tool, acceptance of peer support groups
is not affected by differences in personal and environmental factors.
[A4] Accepting online peer support groups as a digital addiction prevention tool. None
of the models under this category was significant. These were [A4.1] Feedback
messages sent by peers about performance and wellbeing goals; [A4.2] Guidance,
feedback and information sent by moderators based on performance and achieving
wellbeing goals; [A4.3] Steps, restrictions and plans set by an authorised moderator,
e.g. game usage limit for compulsive gamers.
[A5] Accepting online peer support groups as a support tool. The first model for [A5.1a]
Environment to provide experienced moderators who are able to provide advice and
guide members to manage the wellbeing issue was significant (R2 = .12, F(10,159) =
2.01, p <.05), accounting for 12% of the variance. The only significant predictor was
neuroticism (b = .07), with an increase in this personality trait being associated with an
increase in acceptance towards this statement. The rest of regression models under this
category were not significant. These were [A5.1b] Environment to suggest alternative
activities to replace and distance myself from the negative behaviours and enhance
wellbeing; [A5.2] Environment to provide emotional support, e.g. when struggling to
follow the healthy behaviour; [A5.3a] Environment to get positive and motivational
feedback when performing well; [A5.3b] Environment to get positive and motivational
feedback even when failing to achieve targets; [A5.3c] Environment to issue warning
feedback when members' performance and interaction are not right. This again suggests
that influences are limited when peer groups are seen as a knowledge and advice source.

4.2 Effects on Rejection Factors


[R1] Rejecting online peer support groups when seen as an intimidation tool. The
model for [R1.1a] I reject a group with negative feedback, e.g. you have repetitively
failed in achieving your target, this is the 5th time this month was significant (R2 = .11,
F(10,159) = 2, p <.05), accounting for 11% of the variance. Within the model, the only
significant predictor was openness (b = −.16); as openness increased, acceptance
of this statement decreased. The model for
[R1.1b] I reject a group with harsh feedback, e.g. Your interaction with peers shows
anti-social and disruptive patterns. You have been reported for annoying others was
significant (R2 = .12, F(10,159) = 2.3, p <.05), accounting for 12% of the variance.
Within the model the only significant predictor was gender (b = .4). This meant that
female participants were more likely to accept this statement. The model for [R1.2] I
reject a group with harsh penalties e.g. banning from the group for a period of time if I
repetitively forget my target was significant (R2 = .16, F(10,159) = 3.1, p <.05),
accounting for 16% of the variance. Within the model the significant predictors were
agreeableness (b = .13), neuroticism (b = .15) and self-control (b = −.04). As
agreeableness and neuroticism increased, acceptance of this statement increased; as
self-control increased, acceptance decreased.
[R2] Rejecting online peer support groups when seen as overly judgmental. Three of
the regression models under this category were not significant. These were [R2.1] I
reject a group if the group moderator judges my performance and interaction
frequently, even if this is for my benefit; [R2.2a] I reject a group if I am judged by peers
who are only online contacts, e.g. not real-life contacts; and [R2.2c] I reject a group if
the judgement online expands to other life aspects by peers who are real-world contacts.
The model for [R2.2b] I reject a group if I am judged by online peers who are also
real-world contacts was significant (R2 = .14, F(10,159) = 2.6, p <.01), accounting for
14% of the variance. Within the model, the only significant predictor was gender
(b = .56). This meant that female participants were more likely to accept this statement.
[R3] Rejecting online peer support groups when hosting unmanaged interactions. The
model for [R3.1a] I reject a group with a weak moderator, e.g. unable to stop or ban
members who are not adhering to the group norms was significant (R2 = .12,
F(10,159) = 2.1, p <.05), accounting for 12% of the variance. Within the model,
conscientiousness was the only significant predictor (b = 0.14), with an increase in this
trait being associated with an increase in agreement with this statement. The model for
[R3.1b] I reject a group which allows loose and relaxed rules, e.g. accepting
conversations and interactions that are not related to the wellbeing issue, was significant
(R2 = .13, F(10,159) = 2.4, p <.05), accounting for 13% of the variance. Within the
model the predictors of conscientiousness (b = 0.14) and openness (b = −.2) were
both significant, with an increase in conscientiousness being associated with an
increase in acceptance of this statement. In contrast, an increase in openness was
associated with a decrease in acceptance of this statement. The remaining model,
[R3.2] I reject a group with a large size as it may not feel as a coherent group, was not
significant.
[R4] Rejecting online peer groups due to unclear membership protocol. None of the
models under this category was significant, which were [R4.1a] I reject a group which
allows friends in real-life to join; [R4.1b] I reject a group which allows family
members to join; [R4.2a] I reject a group when members can leave the group anytime
without giving notice and explanation; [R4.2b] I reject a group when there are con-
ditions to exit the group, e.g. to tell the moderator in advance.

4.3 Discussion
In terms of acceptance factors, the majority of regression models were not significant,
and those that were explained only a relatively small amount of the variance. The
significant predictors within such models were primarily personality traits, such as
extraversion and neuroticism. These occurred in the expected direction; for example,
an increase in extraversion was associated with acceptance of a peer group to increase
engagement in managing a wellbeing issue.
There were a greater number of significant regression models under the rejection
factors, although again when these were significant, they only accounted for a relatively
low amount of the variance. The greater number of significant models and predictors in
relation to rejection factors compared to acceptance may be a reflection of the reactance
effect [15], in which individuals respond negatively to being told that they are not
permitted to do something. As in the significant acceptance models, personality
traits tended to be amongst the significant predictors. Gender was a significant predictor
in several models relating to group judgement, with female participants being found to
be more likely to reject statements that involved the possibility of social judgement.
Research into gender and the use of peer groups has found that the relationship
between the two can be complex [16]; however, this result could be argued to be
consistent with the broad finding that females make greater use of social support structures.
This is because a peer group situation that includes explicit and trackable judgement of
others may be perceived to be a threat to group harmony, and therefore something
which may be likely to undermine or damage that social support network.
Culture was not a significant predictor in any of the significant regression models.
This was unexpected, as there are several cultural dimensions that could be relevant to
peer group structure and function. This includes dimensions such as power distance,
individualism and uncertainty avoidance [21]. This may suggest that online peer
support environments are not subjected to cultural influences in the same way as offline
groups, although research in both of these domains is limited. If culture is not an
influential factor on acceptance and rejection factors of online peer support groups then
this is an important consideration, as it suggests that strategies based around online peer
support may be transferable between cultures.

5 Conclusion and Future Work

There are increasing societal concerns about the compulsive and excessive use of digital
technologies. These same technologies allow for prevention and intervention strategies
to be delivered in a way that is faster and substantially less costly than traditional
strategies, but in order to make this meaningful we must better understand what factors
determine the acceptance and rejection of such approaches. In this paper, we studied
the effect of several personal and contextual factors on the acceptance and rejection of
the online peer support groups as a mechanism to enhance wellbeing. We took digital
wellbeing as a case study where both the behaviour and the behaviour change share the
same medium and where the behaviour and performance towards behavioural goals
and limits, i.e. the digital usage, can be tracked and monitored in part. There were
fewer differences than we expected, considering similar research in the context of
social media. This suggests that online peer support groups, as a special kind of
social network, need to be thought of as a purpose-driven gathering. For
example, accepting such a technique as an awareness tool and as an education tool was
little affected by personal and cultural differences. We did, however, note that the
groups were rejected for various reasons, including being a medium of unmanaged and
loose interaction, with additional risks such as peer interaction tools being used for
intimidation. This would again mean that people expect such groups to be
purpose-driven and reject their instantiation as ordinary social networks. As our
findings indicated, peer support groups are seen as both a motivational and an educational tool
and hence game-based learning [14] can be a way to increase their acceptance. It is
important that further research is conducted within this emergent area, to ensure that
prevention and intervention strategies are informed by an evidence base.

Appendix A: Survey Questions Relevant to This Paper

Demographics, Perception of Peer Groups, Personality, and Self-control Questions


– What is the gender you identify yourself with? Male; Female; Prefer not to say.
– What is your main country?
– How do you see the usefulness of online peer support group as a method to help
members in managing their wellbeing issues? Very useful; Useful; Moderately
useful; Slightly useful; Not at all useful.
– Would you like to join an online peer support group to help you manage a well-
being issue? Very likely; Likely; Unlikely; Extremely unlikely.
– 10 Personality questions [20]: How well do the following statements describe your
personality? I see myself as someone who: is reserved; is generally trusting; tends
to be lazy; is relaxed, handles stress well; has few artistic interests; is outgoing,
sociable; tends to find fault with others; does a thorough job; gets nervous easily; has
an active imagination.
– 13 Self-control questions [19]: Using the 1 to 5 scale below, please indicate how
much each of the following statements reflects how you typically are: I am good at
resisting temptation; I have a hard time breaking bad habits; I am lazy; I say
inappropriate things; I do certain things that are bad for me, if they are fun; I refuse
things that are bad for me; I wish I had more self-discipline; People would say that I
have iron self-discipline; Pleasure and fun sometimes keep me from getting work
done; I have trouble concentrating; I am able to work effectively toward long-term
goals; Sometimes I can’t stop myself from doing something, even if I know it is
wrong; I often act without thinking through all the alternatives.
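As an illustration of how the ten personality items above map onto the five traits, the sketch below scores them following the published BFI-10 key (one positively keyed and one reverse-keyed item per trait). The item-to-trait mapping is our reading of the instrument, not something stated in this paper, and should be verified against [20] before reuse.

```python
# Illustrative scoring sketch for the 10 personality items [20].  The
# trait / reverse-key mapping below is our reading of the published BFI-10
# key, not taken from this paper -- verify against [20] before reuse.
# Each trait is the mean of one positively keyed and one reverse-keyed
# item; responses are on a 1-5 scale and reverse-keyed items are recoded
# as 6 - x.

BFI10_KEY = {
    "extraversion":      [("is outgoing, sociable", False),
                          ("is reserved", True)],
    "agreeableness":     [("is generally trusting", False),
                          ("tends to find fault with others", True)],
    "conscientiousness": [("does a thorough job", False),
                          ("tends to be lazy", True)],
    "neuroticism":       [("gets nervous easily", False),
                          ("is relaxed, handles stress well", True)],
    "openness":          [("has an active imagination", False),
                          ("has few artistic interests", True)],
}

def score_bfi10(responses):
    """Map {item text: 1-5 rating} to the five trait scores."""
    scores = {}
    for trait, items in BFI10_KEY.items():
        values = [(6 - responses[item]) if reverse else responses[item]
                  for item, reverse in items]
        scores[trait] = sum(values) / len(values)
    return scores

# A respondent who answers 5 to every item lands at the midpoint (3.0) on
# every trait, because each pair mixes a positive and a reverse-keyed item.
print(score_bfi10({item: 5 for pair in BFI10_KEY.values() for item, _ in pair}))
```

The 13-item self-control scale [19] can be scored analogously, with its own reverse-keyed items recoded before averaging.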
Questions About Acceptance Factors (5 Points Likert Scale Reflecting Agreement
Degree)
[A1] Online peer support groups method is seen by some as an auxiliary mecha-
nism to ease and add more engagement to the management of the wellbeing issue.
Accordingly, the following features will increase my acceptance of them: [A1.1a]
Awards when achieving behavioural targets, e.g. points, badges, etc. [A1.1b] Awards
when making progress towards the behavioural target. [A1.2] Peer comparisons, i.e.
see how I and others are performing. [A1.3] Information and graphs on how I am
progressing to keep me engaged.
[A2] Online peer groups method is seen by some as an awareness tool to help raise
awareness and knowledge about the wellbeing issue and level of the problem.
Accordingly, the following features will increase my acceptance of them: [A2.1] Self-
Monitoring, e.g. showing your hourly, daily and weekly performance and progress
indicator. [A2.2] Peer comparisons, e.g. comparing you to other members in the group
who have similar profile and level of problem. [A2.3] Awareness on goal setting, e.g.
how to set and achieve goals, and how to avoid deviation from the plan you set to
achieve them.
[A3] Online peer support group method is seen by some as an educational platform to
learn how to regulate the wellbeing issue and change behavior. Accordingly, the fol-
lowing features will increase my acceptance of them: [A3.1] Environment to learn from
peers, e.g., by sharing real-life stories and successful strategies around the wellbeing
issue. [A3.2a] Environment to learn from experienced moderators, e.g. best practice
around the wellbeing issue. [A3.2b] Environment where I can learn through acting as a
mentor, i.e. when advising other members and when having to moderate the group. [A3.3]
Environment to learn how to set up achievable and effective goals and their plans.
[A4] Online peer support groups method is seen by some as a prevention and pre-
cautionary mechanism when the wellbeing issue starts to emerge. Accordingly, the
following features will increase my acceptance of them: [A4.1] Feedback messages
sent by peers about performance and wellbeing goals. [A4.2] Guidance, feedback and
information sent by moderators based on performance and achieving wellbeing goals.
[A4.3] Steps, restrictions and plans set by an authorised moderator, e.g. game usage
limit for compulsive gamers.
[A5] Online peer support groups method is seen by some as a support tool to guide,
motivate and encourage the recovery processes of the wellbeing issue. Accordingly, I
accept online peer groups as an: [A5.1a] Environment to provide experienced mod-
erators who are able to provide advice and guide members to manage the wellbeing
issue. [A5.1b] Environment to suggest alternative activities to replace and distance
myself from the negative behaviours and enhance wellbeing. [A5.2] Environment to
provide emotional support, e.g. when struggling to follow the healthy behaviour.
[A5.3a] Environment to get positive and motivational feedback when performing well.
[A5.3b] Environment to get positive and motivational feedback even when failing to
achieve targets. [A5.3c] Environment to issue warning feedback when members'
performance and interaction are not right.
Questions About Rejection Factors (5 Points Likert Scale Reflecting Agreement)
[R1] Online peer groups method is rejected by some as it can be intimidating if
used in certain modalities. [R1.1a] I reject a group with negative feedback, e.g. you
have repetitively failed in achieving your target, this is the 5th time this month. [R1.1b]
I reject a group with harsh feedback, e.g. Your interaction with peers shows anti-social
and disruptive patterns. You have been reported for annoying others. [R1.2] I reject a
group with harsh penalties e.g. banning from the group for a period of time if I
repetitively forget my target.
[R2] Online peer group method is rejected by some when seen as overly judgmental.
[R2.1] I reject a group if the group moderator judges my performance and interaction
frequently, even if this is for my benefit. [R2.2a] I reject a group if I am judged by
peers who are only online contact, e.g. not real-life contacts. [R2.2b] I reject a group if
I am judged by online peers who are also real-world contacts. [R2.2c] I reject a group if
the judgment online expands to other life aspects by peers who are real-world contacts.
[R3] Peer group is rejected when seen as a medium for a loose and unmanaged
interaction. [R3.1a] I reject a group with a weak moderator, e.g. unable to stop or ban
members who are not adhering to the group norms. [R3.1b] I reject a group which
allows loose and relaxed rules, e.g. accepting conversations and interactions that are
not related to the wellbeing issue. [R3.2] I reject a group with a large size as it may not
feel as a coherent group.
[R4] Online peer support group method is rejected when the membership protocol is
unclear. Please indicate your opinion of the following: [R4.1a] I reject a group which
allows friends in real-life to join. [R4.1b] I reject a group which allows family
members to join. [R4.2a] I reject a group when members can leave the group anytime
without giving notice and explanation. [R4.2b] I reject a group when there are
conditions to exit the group, e.g. to tell the moderator in advance.

References
1. Hampton, K., Goulet, L.S., Rainie, L., Purcell, K.: Social networking sites and our lives.
Pew Internet Am. Life Proj. 16, 1–85 (2011)
2. Griffiths, M.: A ‘components’ model of addiction within a biopsychosocial framework.
J. Subst. Use 10(4), 191–197 (2005)
3. Widyanto, L., Griffiths, M.: Internet addiction: a critical review. Int. J. Mental Health Addict.
4(1), 31–51 (2006)
4. Winkler, A., Dörsing, B., Rief, W., Shen, Y., Glombiewski, J.A.: Treatment of internet
addiction: a meta-analysis. Clin. Psychol. Rev. 33(2), 317–329 (2013)
5. Webb, T.L., Sniehotta, F.F., Michie, S.: Using theories of behaviour change to inform
interventions for addictive behaviours. Addiction 105, 1879–1892 (2010)
6. Ali, R., Jiang, N., Phalp, K., Muir, S., McAlaney, J.: The emerging requirement for digital
addiction labels. REFSQ 9013, 198–213 (2015)
7. Alrobai, A., Phalp, K., Ali, R.: Digital addiction: a requirements engineering perspective.
Requir. Eng.: Found. Softw. Qual. 8396, 112–118 (2014)
8. Alrobai, A., McAlaney, J., Phalp, K., Ali, R.: Exploring the risk factors of interactive e-
health interventions for digital addiction. Int. J. Sociotechnol. Knowl. Dev. 8(2), 1–15 (2016)
9. Davidson, L., Chinman, M., Kloos, B., Weingarten, R., Stayner, D., Tebes, J.K.: Peer
support among individuals with severe mental illness: a review of the evidence. Clin.
Psychol. Sci. Pract. 6, 165–187 (2006)
10. Alrobai, A., McAlaney, J., Phalp, K., Ali, R.: Online peer groups as a persuasive tool to
combat digital addiction. In: International Conference on Persuasive Technology, pp. 288–
300 (2016)
11. Alrobai, A., Dogan, H., Phalp, K., Ali, R.: Building online platforms for peer support groups
as a persuasive behavior change technique. In: International Conference on Persuasive
Technology, pp. 70–83 (2018)
12. Braun, V., Clarke, V., Terry, G.: Thematic analysis. Qual. Res. Clin. Health Psychol. 24, 95–
114 (2014)
13. Alrobai, A.: Engineering social networking to combat digital addiction: the case of online
peer groups. Doctoral dissertation, Bournemouth University (2018)
14. Sousa, M.J., Rocha, Á.: Game based learning contexts for soft skills development. In: World
Conference on Information Systems and Technologies, pp. 931–940 (2017)
15. Brehm, S., Brehm, J.: Psychological Reactance: A Theory of Freedom and Control.
Academic Press, New York (1981)
16. Matud, M.P.: Structural gender differences in perceived social support. Pers. Individ. Differ.
35, 1919–1929 (2003)
17. Hsiao, K.L., Shu, Y., Huang, T.C.: Exploring the effect of compulsive social app usage on
technostress and academic performance: perspectives from personality traits. Telemat. Inf.
34, 679–690 (2017)
18. Aldhayan, M., Cham, S., Kostoulas, T., Almourad, M.B., Ali, R.: Online peer support
groups to combat digital addiction: user acceptance and rejection factors. In: World
Conference on Information Systems and Technologies, pp. 139–150 (2019)
19. Tangney, J.P., Baumeister, R.F., Boone, A.L.: High self-control predicts good adjustment,
less pathology, better grades, and interpersonal success. J. Pers. 72(2), 271–324 (2004)
20. Rammstedt, B., John, O.P.: Measuring personality in one minute or less: a 10-item short
version of the Big Five Inventory in English and German. J. Res. Pers. 41(1), 203–212
(2007)
21. Hofstede, G.H., Hofstede, G.J., Minkov, M.: Cultures and Organizations. Software of the
Mind, vol. xiv, 3rd edn, p. 561. McGraw-Hill, New York (2010)
Assessing Daily Activities Using a PPG Sensor
Embedded in a Wristband-Type Activity
Tracker

Alexandra Oliveira1,2(&), Joyce Aguiar1,3, Eliana Silva1,
Brígida Mónica Faria1,2, Helena Gonçalves4, Luís Teófilo1,5,
Joaquim Gonçalves6,7, Victor Carvalho6,
Henrique Lopes Cardoso1,5, and Luís Paulo Reis1,5

1 LIACC - Artificial Intelligence and Computer Science Laboratory, Porto, Portugal
{aao,elianasilva,lteofilo,hlc,lpreis}@fe.up.pt
2 ESS-IPP - School of the Polytechnic Institute of Porto, Porto, Portugal
[email protected]
3 CPUP - Center for Psychology at University of Porto, Porto, Portugal
[email protected]
4 UM - University of Minho, Guimarães, Portugal
[email protected]
5 DEI - FEUP - Department of Informatics Engineering,
Faculty of Engineering of the University of Porto, Porto, Portugal
6 LITEC - Technological Innovation and Knowledge Engineering Laboratory,
Optimizer, Porto, Portugal
[email protected]
7 2Ai - Polytechnic Institute of Cávado and Ave, Barcelos, Portugal
[email protected]

Abstract. Due to the technological evolution of wearable devices, biosignals,
such as inter-cardiac beat interval (RR) time series, are being captured in
non-controlled environments. These RR signals, derived from photoplethysmography
(PPG), enable health status assessment in a more continuous, non-invasive,
non-obstructive way, fully integrated into the individual's daily activity.
However, PPG is vulnerable to motion artefacts, which can affect the accuracy
of the estimated neurophysiological markers. This paper introduces a method for
motion artefact characterization in terms of location and relative variation
parameters obtained in different common daily activities. The approach takes
interindividual variability into consideration. Data was analysed using a
related-samples Friedman test, followed by pairwise comparisons with Wilcoxon
signed-rank tests with a Bonferroni correction. Results showed that movements
involving only the arms present more variability in terms of the two analysed
parameters.

Keywords: PPG signals · Daily life · Human activity detection · Sensory instrumentation · Photoplethysmography · Motion artifacts · Heart rate · Sensor-based applications

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 108–119, 2020.
https://doi.org/10.1007/978-3-030-45697-9_11

1 Introduction

When the heart beats, capillaries expand and then contract based on blood volume
changes. These changes can be inferred from a photoplethysmogram (PPG), which is a
register over time of the amount of light absorbed and reflected by the tissues when
illuminated by a pulse oximeter [1]. Photoplethysmography is an optical method used
to determine a wide range of physiological processes and vital biosignals such as blood
glucose levels, blood pressure and heart rate (HR) [2–4].
Due to the technological evolution of mobile wearable devices, PPG-derived biosignals can be captured in a more continuous, non-invasive and non-obstructive way, fully integrated into individuals' daily lives. However, this new paradigm creates scenarios where biosignals are collected without the scope and control imposed by clinical rules and environments [5]. Thus, new information can be introduced, but also noise and artifacts, in particular motion artifacts. Motion artifacts usually
affect the normal PPG signal by deviating it from the baseline or promoting large
fluctuations [6]. The presence of artifacts can highly distort PPG-derived biosignals
and, in particular, HRV indices that are afterward used for assessing the mechanism of
the rhythm of the heart and general health condition of the individuals.
When monitoring the heart, many wearable devices do not provide the original PPG signal, but rather a transformation of it, resulting in a signal composed of the time differences between consecutive heartbeats - the RR signal. Detecting the presence of motion artifacts, as well as characterizing them, is not only vital for real-time use of the data, but also non-trivial [5]. However, to the best of our knowledge, the presence
of motion artifacts in the RR signals derived from a processed PPG is not usually
explored. Addressing this gap in the literature, this study proposes to simulate a natural
environment where individuals are invited to perform low impact activities for a short
period of time while wearing a PPG wristband sensor and then compare the resulting
biosignals of each activity based on two quantitative information metrics: location and
relative variation, taking interindividual variability into consideration. The importance of interindividual variability lies in the fact that individuals have different physiological baselines, as well as different reactions to the transillumination of the skin, resulting in different morphologies of the PPG signals and derived RR signals. Moreover, we propose a practical protocol of daily human activities for defining ground-truth data in a natural environment.
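The two per-activity metrics used throughout the paper can be computed directly from an RR series. A minimal sketch, assuming the relative variation coefficient is the usual coefficient of variation (standard deviation divided by the mean), which the text does not spell out:

```python
# Sketch of the two per-activity metrics: location (average) and relative
# variation. Assumes the relative variation coefficient is the usual
# coefficient of variation (std/mean); the paper does not define it explicitly.

def rr_metrics(rr):
    """Return (mean, relative variation coefficient) of an RR series (seconds)."""
    n = len(rr)
    mean = sum(rr) / n
    var = sum((x - mean) ** 2 for x in rr) / n  # population variance
    return mean, (var ** 0.5) / mean

# Example: a short, illustrative RR segment (time between consecutive beats)
m, cv = rr_metrics([0.80, 0.78, 0.82, 0.79, 0.81])
print(round(m, 3), round(cv, 3))  # -> 0.8 0.018
```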
This paper first reviews, in Sect. 2, the state of the art in artifact detection on
PPG signals. Section 3 presents the proposed protocol to assess RR sensorial data for
meaningful daily activities, as well as the description of the statistical analysis. In
Sect. 4 we present the main findings. Finally, we conclude and point to future work in
Sect. 5.

2 Related Work

In the literature, some studies have proposed different techniques to identify and reduce the presence of artifacts in PPG signals, such as time- and frequency-domain filtering, power spectrum analysis, blind source separation techniques, multipath methods, wavelets, support vector machines (SVMs) and multichannel analysis [7–13].
In a recent study [8], Ban and Kwon proposed an algorithm to identify and mitigate the impact of mobility on PPG measurements using multipath signals and wavelets. They measured PPG signals at different locations on the body, but the type of movement and the definition of the noise are not clear. Dao et al. [9] studied an approach
based on the time-frequency spectrum of the PPG to detect and determine motion
artifacts and establish the usability of the PPG segment. They compared several
datasets, collected with different strict and limited protocols, with devices mounted on the forehead and on one finger. Also, they used human visual inspection to identify
motion and noise artifacts. Zhang et al. [10] developed a modular algorithm framework
for motion artifact removal from signals captured with wrist-worn PPG sensors. They used multichannel PPG data together with accelerometer and gyroscope data in order to account for the variability in the signal. A small dataset was used, related to performing several macro and micro motions such as fist opening and closing. However,
accelerometers consume too much energy to be useful in long recordings, and typical daily activities were not considered in the study. Vandecasteele et al. [11] used a multichannel approach and visual inspection of the records to determine the presence of motion artifacts in PPGs, and then used SVMs to classify PPG segments into normal or noisy. Cherif et al. [7] used a method based on waveform morphology
for detecting artifacts in PPG signals. They took interindividual and measurement-condition variability into consideration, but the induced motion artifact protocol used is not clear. Zhao et al. [12] collected data while individuals were
running on a treadmill at a speed of 12 km/h and used a multichannel approach for
detecting motion artifacts. Since individuals were subjected to intensive exercise, the adaptive response of the heart to the external stimulus was not controlled, and so it is
difficult to analyze the corresponding RR signal. Tabei et al. [13] analyzed PPG signals derived from smartphones, accounting for interindividual variability using probabilistic neural networks with several extracted parameters, and compared their performance with other detection algorithms. Hand movement, fingertip misplacement, and lens-pressing
motion artifacts were considered.

3 Methodology

The activities encompassed by the protocol used in this study were chosen in two steps. First, a literature review was conducted to identify an easy, reliable and practicable measure of daily activities. Following these criteria, the literature suggested the use of a generic 36-item health survey focused on daily activities - the Short-Form Health Survey (SF-36) [14]. The SF-36 is a coherent and easily administered quality-of-life survey, widely used by health professionals and researchers around the world. It is composed of 36 items, each measured on a Likert
scale ranging from 3 to 6 points. Except for the single item of self-evaluated health transition (HT), the scores of the other 35 items are grouped into 8 multi-item
scales, including physical functioning (PF), limitations due to physical health problems
– role-physical (RP), bodily pain (BP), general health (GH), vitality (VT), social
functioning (SF), limitations due to emotional health problems (RE) and mental health
(MH) [14]. In our study, the default 4-week recall form was used. This survey was also
used for assessing the general health condition of the participants and for eliminating
some possible confounding effects. The second step consisted of some informal
interviews with a subsample of our target group, with the main goal of understanding
which activities were most common and crosscutting. In addition, interviews focused
on daily, often low-impact activities to minimize the effects on heart rate during the
performance.
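For reference, the SF-36 scale scores reported later (Fig. 2) are obtained by summing the (recoded) item scores of each scale and linearly transforming the sum to a 0–100 metric, higher meaning better health, following the SF-36 scoring convention [14]. A sketch of that transformation step (item recoding omitted):

```python
# Standard SF-36 scale transformation: raw scale sum -> 0-100 metric.
# Item recoding rules are omitted; bounds depend on the scale's items.

def sf36_scale_score(raw_sum, lowest_possible_sum, raw_range):
    """Linearly transform a raw SF-36 scale sum to the 0-100 metric."""
    return (raw_sum - lowest_possible_sum) / raw_range * 100

# Example: Physical Functioning has 10 items scored 1-3, so raw sums
# range from 10 to 30 (range = 20); a raw sum of 28 maps to 90.
print(sf36_scale_score(28, 10, 20))  # -> 90.0
```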
Biometric data, namely RR signals, were measured using a commercial device - the Microsoft Band®. The interest in this device, besides allowing the measurement of biometric data, is associated with its portability and low weight, being non-invasive, discreet and comfortable for the user, and having a good quality/price ratio [15, 16]. Prior to the data collection process, all participants were provided with an informed consent form and the data protection norms. The smart band was securely mounted on the dominant
hand. The protocol consisted of a sequence of activities, which can be grouped into activities involving body movement with or without a change of position in the room (Fig. 1).

Fig. 1. Activities to perform: (a) walk 3 m at a comfortable speed, in a straight line from point A to point B; (b) rise from a chair (point A) after sitting for 30 s and walk 3 m at a comfortable speed straight to point B; (c) from point A, walk 3 m; (d) tilting; (e) carrying weights.

The first group of activities involved body movement with position change, such as: walking 3 m at a comfortable speed in a straight line; walking 3 m at a comfortable speed in a straight line with the task of lifting, carrying and setting down a weight (a bag); rising from a chair after sitting for 30 s and walking 3 m at a comfortable speed in a straight line; and walking 3 m at a comfortable speed in a straight line and sitting for 30 s. The second group of activities
involved body movement but not position change such as: tilting, lowering/crouching,
kicking, dressing/undressing a coat, eating, drinking, reading a magazine/newspaper/book, simulated coughing, simulated sneezing, and talking on the
the phone. After completing all sequences, participants raised the two hands in the air
above the head. The tests had an average duration of 5 min. All activities were grouped
according to the major body parts involved in the movement, that is: without move-
ment (standing and talking on the phone); with only arm movement (drinking, eating,
reading, lift the two hands in the air, dressing and undressing); moving in a straight line
(walking and walking with weights); with only leg movement (crouching, kicking and
sitting); and with chest, arms and legs movement (tilting).
Sociodemographic and general health data were evaluated through standard descriptive analyses. For each participant and each activity, the average and the relative variation coefficient of the RR signal given by the Microsoft Band were considered, without any pre-processing. The distributions of these two metrics were described using boxplot charts, and the variability across the different activities was compared using a related-samples Friedman's two-way analysis of variance by ranks (after verifying the non-normality of the data). Further pairwise comparisons to detect significantly different activities were performed with Wilcoxon signed-rank tests with a Bonferroni correction. All analyses were performed with R version 3.4.4.
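The analysis pipeline (omnibus related-samples Friedman test, then pairwise Wilcoxon signed-rank tests with a Bonferroni correction) was run in R; the sketch below outlines the same pipeline in Python with SciPy. The group names follow the paper's activity grouping, but the numbers are illustrative, not the study's data.

```python
from itertools import combinations

from scipy import stats

# Per-participant values of one metric (e.g. average RR) for each activity
# group; illustrative data, not the study's measurements.
groups = {
    "without movement": [0.72, 0.70, 0.75, 0.68, 0.73, 0.71, 0.74, 0.69],
    "arms only":        [0.67, 0.66, 0.70, 0.64, 0.68, 0.65, 0.69, 0.66],
    "straight line":    [0.75, 0.74, 0.78, 0.72, 0.76, 0.73, 0.77, 0.74],
    "legs only":        [0.78, 0.76, 0.80, 0.74, 0.78, 0.75, 0.79, 0.77],
}

# Omnibus related-samples Friedman test across the four activity groups
chi2, p = stats.friedmanchisquare(*groups.values())
print(f"Friedman chi2(3) = {chi2:.2f}, p = {p:.2g}")

# Post hoc: pairwise Wilcoxon signed-rank tests, Bonferroni-corrected
pairs = list(combinations(groups, 2))
for a, b in pairs:
    _, p_raw = stats.wilcoxon(groups[a], groups[b])
    p_adj = min(1.0, p_raw * len(pairs))  # Bonferroni: multiply by 6 pairs
    print(f"{a} vs {b}: raw p = {p_raw:.4f}, adjusted p = {p_adj:.4f}")
```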

4 Results and Discussion

The demographic characteristics of the participants are listed in Table 1. Participants were mostly male (66%), with ages ranging from 20 to 44 years old (M = 30.19, SD = 7.56). Body mass composition was on average normal, with a BMI of 23.57 kg/m² (SD = 3.31). The majority of participants were single (76%), with a college degree (45%), a full-time occupation (85%), and employed (66%). A plurality of participants (32%) were unsatisfied with the amount of sleep they were getting, but it is worth mentioning the considerable number of participants who were very satisfied with their amount of sleep (16%) or neither satisfied nor unsatisfied (18%). More than half of the participants had a physical activity routine (53%), and the large majority were non-smokers (77%). A minority of the participants had at least one chronic disease, but none was heart-related or compromised the realization of the proposed activities (Table 1). Thus, all participants were young healthy adults in good physical condition and with no history of cardiovascular disease.
Considering the eight numeric subscales of the SF-36 questionnaire (see Fig. 2), it can be seen that, in general, participants enjoy good health conditions. The highest mean was obtained on Physical Functioning (PF) (96.71 ± 5.61), and the lowest score was registered on the Vitality (VT) subscale (60.53 ± 21.21). It is important to notice the high standard deviation of the latter subscale, indicating high heterogeneity of the participants in this domain. General Health (GH) and Mental Health (MH) also have relatively lower mean scores (73.53 ± 17.32 and 75.26 ± 15.20, respectively).
Considering the distribution of the average values of RR per activity (Fig. 3), the most homogeneous distribution is observed in the standing position. The activity of

Table 1. Demographic characteristics of the participants.


Characteristics n (%)
Sex Male 25 (66)
Female 12 (32)
Other/Unknown 1 (2)
Marital status Married 5 (13)
Single 28 (74)
Other/Unknown 4 (11)
Education University/College 17 (45)
Masters 12 (32)
PhD 9 (24)
Sleep satisfaction Very Unsatisfied 1 (2.6)
Unsatisfied 12 (31.6)
Neither unsatisfied nor satisfied 7 (18)
Satisfied 12 (31.6)
Very Satisfied 6 (16)
Physical activity Yes 20 (53)
No 18 (47)
Smoking habits Smoker 5 (13)
Non-Smoker 29 (76)
Ex-Smoker 4 (11)
Chronic diseases Yes 6 (16)
No 32 (84)

Fig. 2. Mean ± Std. Dev for each of SF36 domains.

coughing is fairly instantaneous and so the distribution of values is similar to the one
observed in the standing position. Sneezing is also instantaneous but, usually, is a
forceful expulsion of air resulting in more movement, in turn producing a more heterogeneous distribution of average values. Talking on the phone, drinking, sneezing, reading, and raising both hands in the air are the activities with the most dispersion of average values and, therefore, the most variability. It is also important to notice that
the narrow boxes of walking, walking with weights and crouching indicate more homogeneity in the distribution of average RR values. Also, from Fig. 3 it can be seen that, when compared to the standing position, the activities of drinking, sneezing and reading have lower medians. On the contrary, eating, coughing, reading, and sitting have higher medians.

Fig. 3. Distribution of the RR average signal per activity type

Taking into consideration the distribution of the relative variation coefficient across all the activities (Fig. 4), the largest medians are observed in the drinking, sneezing and reading activities. It is also important to notice the low medians observed in the dressing/undressing, crouching, standing, and coughing movements. Moreover, the distribution of the relative variation coefficient is more heterogeneous in talking, drinking, eating, sneezing, reading and raising both hands above the head.

Fig. 4. Distribution of the RR variation coefficient signal per activity type.



Clustering the activities according to the type of movement and considering the distribution of the average values of RR (see Fig. 5), it can be seen that movements involving the legs have a higher median and that the tilting activity has more heterogeneous observations. Compared with the average RR values obtained in activities without movement, the activities involving only the arms have a lower median but higher heterogeneity, similar to the tilting activity. The average RR values had a higher median and higher variability in the walking-in-a-straight-line activity.

Fig. 5. Distribution of the average of the RR signal per activity type

When the movement involves the use of the arms, the median of the relative variation coefficient increases. Compared with the activities without movement, the heterogeneity of the distribution of the relative variation coefficient is higher in the walking-in-a-straight-line activity, the activities involving only leg movements, and tilting (Fig. 6).

Fig. 6. Distribution of variation coefficient of the RR signal per activity type



Considering only groups composed of more than one activity, the median (interquartile range, IQR) of the averaged RR had a minimum of 0.676 (0.099) in the activities with only arm movements, and maxima of 0.752 (0.116) and 0.776 (0.097) in walking in a straight line and in the activities involving only leg movements, respectively. Considering the relative variation coefficient of the RR signal, the minimum median (IQR) was observed in the activities with leg movements, 0.170 (0.100), and the maximum median in the activities with only arm movements, 0.224 (0.067). From the Friedman analysis, it was found that at least one group of activities was statistically significantly different from the others in terms of average and in terms of relative variation coefficient (χ²(3) = 23.554, p < 0.001; χ²(3) = 16.157, p = 0.001) (Table 2).

Table 2. Related-samples Friedman's analysis by ranks of average and variation coefficient

Criteria Activity group n Median IQR χ²(3) Sig.
Average Without movement 29 0.720 0.079 23.55 <0.001
With only arm movements 29 0.676 0.099
Walking in straight line 29 0.752 0.116
With only leg movements 29 0.776 0.097
Variation coefficient Without movement 28 0.208 0.034 16.16 0.001
With only arm movements 28 0.224 0.067
Walking in straight line 28 0.190 0.110
With only leg movements 28 0.170 0.100

A post hoc analysis with Wilcoxon signed-rank tests with a Bonferroni correction was conducted to detect which groups differed on the two parameters. For the average, significant differences were found between the activities involving the arms and those involving the legs (Z = −1.586, p < 0.001), and between the activities involving the arms and moving in a straight line (Z = −1.121, p = 0.006). Despite an overall increase in the median of the average RR in walking in a straight line and with only leg movements versus the standing median, no significant differences were detected (see Tables 2 and 3). For the variation coefficient, significant differences were likewise found between the activities involving the arms and those involving the legs (Z = 1.25, p = 0.002) and between the activities involving the arms and moving in a straight line (Z = 1.143, p = 0.006). Despite an overall decrease in the median of the variation coefficient of the RR in walking in a straight line and with only leg movements versus the standing median, no significant differences were detected.

Table 3. Pairwise comparisons of RR average and variation coefficient

Criteria Activity group pair Test statistic Sig. Adj. Sig.
Average Without vs With only arm movements 0.741 0.029 0.173
Without vs Moving in a straight line −0.379 0.263 1
Without vs With only leg movements −0.845 0.013 0.076
With only arm vs Moving in a straight line −1.121 0.001 0.006
With only arm vs With only leg movements −1.586 <0.001 <0.001
Moving in a straight line vs With only leg movements −0.466 0.17 1
Variation coefficient Without vs With only arm movements −0.75 0.03 0.178
Without vs Moving in a straight line 0.393 0.255 1
Without vs With only leg movements 0.5 0.147 0.884
With only arm vs Moving in a straight line 1.143 0.001 0.006
With only arm vs With only leg movements 1.25 <0.001 0.002
Moving in a straight line vs With only leg movements 0.107 0.756 1
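The adjusted significances in Table 3 are consistent with a Bonferroni factor of 6 (the number of pairwise group comparisons), i.e. adjusted p = min(1, 6 × raw p); small deviations from the printed values stem from rounding of the reported raw p-values. A quick check of the adjustment rule on some of the raw values:

```python
# Bonferroni adjustment over 6 pairwise comparisons: adjusted p = min(1, 6p).
# Raw p-values taken from Table 3; deviations from the printed adjusted
# values are due to rounding of the reported raw p-values.
for p_raw in [0.029, 0.263, 0.001, 0.17]:
    print(p_raw, "->", round(min(1.0, 6 * p_raw), 3))
```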

5 Conclusions and Future Work

As a non-invasive and low-cost technique, PPG has been widely implemented in a number of wearable devices, such as smart bands, smartwatches, and smartphones. With the increasing demand for healthcare solutions, in particular heart monitors, several companies have been working on improving these types of equipment in order to provide more reliable health-related metrics. Following this technological evolution, bio-data are being collected in ways more integrated into daily life, but also with a greater probability of corruption by motion artifacts. In this work, we used a commercial device to analyze the impact of daily activities on an RR signal derived from a PPG signal, in a controlled environment, but mimicking some of the most common daily activities. The impact was characterized in terms of average values but also in terms of the relative variation of the signals. The approach also took interindividual variability into consideration when comparing across activities. The general health status of the participants was assessed (through the evaluation of quality of life) in order to control some confounding effects over the RR signals.
The results showed that, when compared to the standing position, the activities of drinking, sneezing, and reading have a lower median of the average RR parameter, but in terms of the relative variation coefficient they had the largest medians. Considering the activities grouped according to the major muscles involved in the movement, the results showed that, in terms of the location parameter, the activities with only arm movements had a lower median. In terms of the relative variation parameter, the lowest median was observed in the group of activities with leg movements. The results also showed that at least one group of activities was statistically significantly different from the others in terms of both analyzed criteria. The differences are more relevant when the motion involves only the arms, such as drinking and lifting the hands in the air, a kind of movement that increased the variability of the observed values. Describing the different kinds of motion artifacts present in biosignals can be beneficial for their detection in real time. Therefore, collecting ground-truth daily human activity data is essential for retrieving useful information. This is not trivial, since the rhythm of the heart is not a stationary signal, being instead highly influenced by external factors. Also, PPG morphology varies among individuals.
This paper may contribute to detecting unreliable data that should be discarded to prevent inaccurate decision-making and false alarms. As future work, we propose to evaluate the impact of micromotions such as typing on a computer and to analyze the presence of artifacts in a non-controlled environment, since RR readings can be influenced by other factors.

Acknowledgements. This work was supported by the European Regional Development Fund
through the programme COMPETE by FCT (Portugal) in the scope of the project PEst-
UID/CEC/00027/2015 and QVida+: Estimação Contínua de Qualidade de Vida para Auxílio
Eficaz à Decisão Clínica, NORTE010247FEDER003446, supported by Norte Portugal Regional
Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement. It
was partially supported by LIACC (FCT/UID/CEC/0027/2020).

References
1. Shelley, K., Shelley, S., Lake, C.: Pulse oximeter waveform: photoelectric plethysmography.
In: Lake, C., Hines, R., Blitt, C. (eds.) Clinical Monitoring, pp. 420–428. WB Saunders
Company (2001)
2. Rapalis, A., Janušauskas, A., Marozas, V., Lukoševičius, A.: Estimation of blood pressure
variability during orthostatic test using instantaneous photoplethysmogram frequency and
pulse arrival time. Biomed. Sig. Process. Control 32, 82–89 (2017)
3. Gil, E., Orini, M., Bailon, R., Vergara, J.M., Mainardi, L., Laguna, P.: Photoplethysmog-
raphy pulse rate variability as a surrogate measurement of heart rate variability during non-
stationary conditions. Physiol. Measur. 31(9), 1271 (2010)
4. Höcht, C.: Blood pressure variability: prognostic value and therapeutic implications. ISRN
Hypertension (2013)
5. Rodrigues, J., Belo, D., Gamboa, H.: Noise detection on ECG based on agglomerative
clustering of morphological features. Comput. Biol. Med. 87, 322–334 (2017)
6. Sun, B., Wang, C., Chen, X., Zhang, Y., Shao, H.: PPG signal motion artifacts correction
algorithm based on feature estimation. Optik 176, 337–349 (2019)
7. Cherif, S., Pastor, D., Nguyen, Q.-T., L’Her, E.: Detection of artifacts on photoplethys-
mography signals using random distortion testing. In: 2016 38th Annual International
Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 6214–
6217. IEEE (2016)
8. Ban, D., Kwon, S.: Movement noise cancellation in PPG signals. In: 2016 IEEE
International Conference on Consumer Electronics (ICCE), pp. 47–48. IEEE (2016)

9. Dao, D., Salehizadeh, S.M., Noh, Y., Chong, J.W., Cho, C.H., McManus, D., Darling, C.E.,
Mendelson, Y., Chon, K.H.: A robust motion artifact detection algorithm for accurate
detection of heart rates from photoplethysmographic signals using time–frequency spectral
features. IEEE J. Biomed. Health Inf. 21, 1242–1253 (2017)
10. Zhang, Y., Song, S., Vullings, R., Biswas, D., et al.: Motion artifact reduction for wrist-worn photoplethysmograph sensors based on different wavelengths. Sensors 19, 673 (2019)
11. Vandecasteele, K., Lázaro, J., Cleeren, E., Claes, K., Van Paesschen, W., Van Huffel, S.,
Hunyadi, B.: Artifact detection of wrist photoplethysmograph signals. In: BIOSIGNALS,
pp. 182–189 (2018)
12. Zhao, J., Wang, G., Shi, C.: Adaptive motion artifact reducing algorithm for wrist photoplethysmography application. In: Biophotonics: Photonic Solutions for Better Health Care V, p. 98873H. International Society for Optics and Photonics (2016)
13. Tabei, F., Kumar, R., Phan, T.N., McManus, D.D., Chong, J.W.: A novel personalized
motion and noise artifact (MNA) detection method for smartphone photoplethysmograph
(PPG) signals. IEEE Access 6, 60498–60512 (2018)
14. Ware, J.E., Kosinski, M., Bjorner, J.B., Turner-Bowker, D.M., Gandek, B., Maruish, M.E., et al.: User's Manual for the SF-36v2 Health Survey. QualityMetric, Lincoln (2008)
15. Fan, S., Zhang, W., Hu, L., Chen, S., Xiong, J.: Research on the openness of microsoft band
and its application to human factors engineering. Proc. Eng. 174, 425–432 (2017)
16. Nogueira, P., Urbano, J., Reis, L.P., Cardoso, H.L., Silva, D.C., Rocha, A.P., Gonçalves, J.,
Faria, B.M.: A review of commercial and medical-grade physiological monitoring devices
for biofeedback-assisted quality of life improvement studies. J. Med. Syst. 42(6), 101 (2018)
Simulation of a Robotic Arm Controlled
by an LCD Touch Screen to Improve
the Movements of Physically Disabled People

Yadira Quiñonez1, Oscar Zatarain1, Carmen Lizarraga1, Juan Peraza1, Rogelio Estrada1, and Jezreel Mejía2(&)

1 Universidad Autónoma de Sinaloa, Mazatlán, Mexico
  {yadiraqui,carmen.lizarraga,jfperaza,restrada}@uas.edu.mx, [email protected]
2 Centro de Investigación en Matemáticas, Zacatecas, Mexico
  [email protected]

Abstract. This research focuses on helping people who have problems moving their bodies or do not have enough strength to do so, supporting them in their everyday lives so that they have an easier way to reach objects through a simpler control that moves a robotic arm faster. To this end, this article presents a proposed algorithm that enables the design of a new mechanism for a robotic arm with three rotational joints, making it faster and adjustable. The proposed algorithm includes a new way to obtain the kinematics of an anthropomorphic robot using an LCD touch screen and a new way to control a robotic arm with only one finger, with less effort, by touching the screen. The algorithm was tested in Matlab to find the fastest way to reach a point and was tested on Arduino with the pressure sensor of the LCD touch screen.

Keywords: Robotic arm · Assistive robotic manipulators · Assistive technology · Physically disabled people · LCD touch screen

1 Introduction

In the last two decades, applications of robotic arms have supported humans in many areas, especially in automation and therefore in industry, as well as in everyday life, mainly for physically disabled people. Automation and control have become topics of interest for researchers from different areas, mainly in industrial robotics [1, 2] and in robotic systems applied to the medical field, such as tele-operated surgery [3] and surgical pattern cutting [4]. According to Shademan et al., the precision of an industrial robot offers great advantages in this area [5]. Currently, there are countless technological developments and various applications, such as prostheses [6], orthoses [7], exoskeletons [8–10], and devices for teleoperation intended to improve human capabilities [11–14]. Chung et al. conducted a review of the literature on the different robotic assistance manipulators from 1970 to 2012, mentioning that
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 120–133, 2020.
https://doi.org/10.1007/978-3-030-45697-9_12

reliability, profitability, appearance, functionality, and ease of use are determining factors in achieving successful commercialization [15].
Some related works on the rehabilitation of patients with reduced movement capacity have used virtual reality to improve the manipulation of movements and positively influence a virtual environment [16–19]. Recently, Atre et al. used computer vision techniques to create a gesture-controlled robotic arm, noting the importance of a robotic arm being more efficient in terms of accuracy and speed [20]. In another work, Badrinath et al. propose controlling a robotic arm using a Kinect 3-D camera to track people's movements through computer vision [21]. In this context, the aim of this paper is to propose an algorithm that simulates the kinematics of an anthropomorphic robotic arm controlled by an LCD touch screen, providing intuitive control for the patient and faster kinematics through a peculiar mechanical system for the base of the robotic arm. This solution was chosen because it is very easy for anyone to use: a single finger touching a coordinate on the LCD touch screen triggers the kinematics of the anthropomorphic arm in an easy, intuitive, effortless and comfortable way.
The remainder of this paper is organized as follows: Sect. 2 provides background information on the algorithm and the conditions related to it. Section 3 describes how the algorithm is implemented, with examples and preliminary results. Section 4 shows the algorithm simulation in Matlab with all of its characteristics and functions. Section 5 presents the results and the designs implemented in code on the Arduino Mega microcontroller. Finally, Sect. 6 summarizes the conclusions of the paper and indicates further work.

2 Algorithm Description

It is common to use a joystick to control the movements of a robotic arm; in fact, joysticks are implemented in many kinds of electronic applications, one of which is controlling a robotic arm that assists physically disabled people. Nevertheless, a joystick can require more effort and waiting time from the patient, who may be a diabetic person or have another disease that can cause sudden weakening or poor coordination. As is known, using a joystick can sometimes require more than two long movements, which makes the joystick an inefficient control for specific problems. Chung et al. mention that the operation of the robotic arm is controlled by holding and pulling the joystick to reach the desired position, whereas with an LCD screen it is only necessary to touch the screen [15]. Therefore, a new way to control a robotic arm has been conceived, inspired by cellphones, which have intuitive touch controls for manipulating their functions. The new control scheme takes this idea, using an LCD touch screen with an easy and intuitive interface to control and manipulate the robotic arm.
With this control, physically disabled people can select a point in space with a
slight movement of one finger, covering their needs in the best way. Moreover,
comfortable manipulation is very important for the patient, and the algorithm is
programmed for the LCD touch screen and an Arduino. It also
122 Y. Quiñonez et al.

thought about the extreme cases, for example, a physically disabled people who are
diseases such as diabetes that suffer by hypoglycemia or lupus which are diseases that
make the patient lose the awareness, the force or bad coordination according to
background experience based on surveys for physically disabled people. Moreover, it is
important to mention that people with diabetes or lupus can have confusion during a
shock, therefore, making one movement with the finger touching a point in the LCD
touch screen can save lives because this move will allow the robotic arm moves quickly
in an emergency case.
The algorithm presented in the next section finds the best way to reach the point in
space touched on the LCD screen by the user. It reaches the position through PWM
pulses, moving a specific number of degrees that depends on the pulse widths and on
the coordinate touched on the screen. This yields a natural form of control, because the
arm movement is based on the coordinate the finger touched and does not demand
much concentration, unlike methods controlled through thought sensors. The algorithm
has been tested on an Arduino and simulated in Matlab; in addition, a mechanism was
designed so that the algorithm, working with the two-dimensional system of the LCD
touch screen, can move the anthropomorphic robotic arm in three-dimensional space.
The LCD touch screen has two dimensions, referred to as x and y. Obtaining the
coordinate (x, y) is not a problem, since it is read directly from the touch screen, but a
third quantity is needed: the base rotation required to reach the coordinate (x, y). It is
calculated by dividing the two numbers x and y and using the ratio as the argument of a
trigonometric function. The first condition for using this algorithm is to take x and y as
the two legs of a right triangle, so that the tangent function applies. Therefore, α is
obtained as follows:

α = tan⁻¹(y/x)    (1)

where x is a number different from zero. This gives rise to the first real condition of the
algorithm.

2.1 Description of the Conditions

First Condition. A coordinate touched on the LCD screen corresponds to a point on
the screen, and to reach that point the first joint of the anthropomorphic robotic arm
must rotate by some angle. Then, when x ≥ y, the coordinate obtained lies at 45° or
less, as shown below:

α ≤ 45° ⟺ tan⁻¹(y/x) ≤ 45° ⟺ y/x ≤ tan(45°) ⟺ y/x ≤ 1    (2)
Simulation of a Robotic Arm Controlled by an LCD Touch Screen 123

therefore y ≤ x, which means that n_y ≤ n_x, where n_y and n_x represent the pulses to
move on the y axis and the x axis respectively. For programming, the pulse conditions
on the x and y axes when α is at most 45° are: if α = 45° then n_x = n_y, and if α < 45°
then n_x > n_y.

Second Condition. As is well known, when α equals 45°, y/x = 1; but what happens
when y/x approaches 1? The following holds:

lim_{(y/x)→1} α = lim_{(y/x)→1} tan⁻¹(y/x) = tan⁻¹(1) = 45°    (3)

This is well known; nevertheless, something very important happens that helps to
locate α when the LCD touch screen is touched. When y/x approaches 1 from the right,
the first condition is not fulfilled, because y is greater than x; therefore 90° ≥ α ≥ 45°.
Third Condition. In this condition x is always negative on the touch screen, so the
rule to be built is the following:

90° ≤ α + 180° ≤ 135°    (4)

Fourth Condition. This condition is somewhat similar to the third, except that it has
the feature |x| ≥ |y|, and therefore:

135° ≤ α + 180° ≤ 180°    (5)

Figure 1 shows all the conditions, each drawn as a right-triangle area where x ≥ y
holds in the first case, y ≥ x in the second, y ≥ −x in the third and |x| ≥ |y| in the
fourth.

Fig. 1. Representation of the conditions: condition 1, right triangle where y ≤ x, therefore
α ≤ 45°. Condition 2, right triangle where y ≥ x, therefore 90° ≥ α ≥ 45°. Condition 3,
right-triangle area where 90° ≤ α + 180° ≤ 135°. Condition 4, right-triangle area where
135° ≤ α + 180° ≤ 180°.
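The four conditions can be sketched in code (a Python illustration; the paper implements the control in Matlab and on an Arduino). The mapping below uses the standard arctangent and the paper's rule of adding 180° when x is negative; the function name is an assumption:

```python
import math

def base_angle(x, y):
    """Base rotation (degrees) for a touch coordinate (x, y), following the
    four conditions: atan(y/x) for x > 0, atan(y/x) + 180 for x < 0."""
    if x == 0:
        return 90.0                      # eq. (1) needs x != 0; 90 deg is the boundary case
    alpha = math.degrees(math.atan(y / x))
    if x > 0:
        return alpha                     # conditions 1 and 2: angle in [0, 90]
    return alpha + 180.0                 # conditions 3 and 4: angle in [90, 180]
```

For example, a touch with y/x = −6.313 falls in the third condition and yields approximately 99°, matching Table 4.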

3 Algorithm Implementation

This algorithm is implemented on the LCD touch screen to move three different
stepper motors. Each has a resolution of 1.8°, reached with one 10 ms PWM pulse, so
the α motor covers 180° in one second, plus the 5 ms wait after each pulse. The
function describing the trajectory of the α motor over the pulse domain is:

f(n) = 1.8n    (6)

To reach 180°, 100 pulses must be sent through function (6); hence 180° can be
reached in one second, which is the maximum theoretical working time. The discrete
function (6) works well because it is very precise. However, programming this function
on the microcontroller together with the LCD touch screen can cause trouble: the α
motor is programmed differently from the motors that produce the coordinate (x, y),
and driving the α motor with this function alone can saturate the pulses, producing a
"traffic of pulses". Controlling the three motors this way could make the robotic arm
slow, and with pulse traffic the motors that reach the point (x, y) might not arrive at the
requested coordinate. Therefore another function is proposed, which covers large
angles in less time with few pulses:

f(s) = Σ_{n=0}^{s} 1.8n    (7)

Although function (7) can be imprecise for some angles, this discrete function is very
easy to program, and combining it with function (6) is very useful for driving the α
motor. Function (7) needs an upper limit s ∈ {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13} so
that it can be joined with function (6).
Each pulse has a width of 10 ms; as n increases up to the value s, the pulse width
grows in steps of 10 ms, as shown in Table 1.

Table 1. Width of the n-th pulse in the summation function f(s) as n reaches each value of s.

f(s)   | s  | Pulse width
1.8    | 1  | 10 ms
5.4    | 2  | 20 ms
10.8   | 3  | 30 ms
18     | 4  | 40 ms
27     | 5  | 50 ms
37.8   | 6  | 60 ms
50.4   | 7  | 70 ms
64.8   | 8  | 80 ms
81     | 9  | 90 ms
99     | 10 | 100 ms
118.8  | 11 | 110 ms
140.4  | 12 | 120 ms
163.8  | 13 | 130 ms
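The two discrete functions can be reproduced numerically (a Python sketch for illustration; the paper programs them in Matlab and on the Arduino):

```python
def f_n(n):
    """Function (6): n fixed pulses of 1.8 degrees each."""
    return 1.8 * n

def f_s(s):
    """Function (7): a ramp of pulses, the n-th contributing 1.8*n degrees."""
    return sum(1.8 * n for n in range(s + 1))

# Reproduce the f(s) column of Table 1 for s = 1..13.
table = [round(f_s(s), 1) for s in range(1, 14)]   # 1.8, 5.4, 10.8, ..., 163.8
```

In closed form f(s) = 0.9·s·(s+1), so the last entry is 0.9·13·14 = 163.8.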

3.1 Algorithm Rendition


In this case, the algorithm works with the two discrete functions and tries to find the
best path to the desired angle. The number of combinations is 2^n; each path can take a
different time, and sometimes several combinations reach the desired angle in the same
time. Some paths are more exact than others but slower; nevertheless, the algorithm
finds the best path and can anticipate the time at which the target is reached.

a ≥ lim_{t_s→b} (2^n/t_s) ≥ a − b    (8)

The number a represents the desired angle; the number b gives an interval for finding
the limit times (the superior and inferior limit times) and helps the inequality to be
fulfilled, while 2^n is the number of combinations to search. To develop the algorithm it
is important to know that a < 2^n, and that b can take values less than or equal to
(2^n/a), or be rounded up as long as the inequality is fulfilled. Once the value of b is
known, it is set equal to the ratio of the superior limit time to the inferior limit time, as
shown next:

b = t_p/t_d    (9)

where t_p is the superior limit time and t_d is the inferior limit time. Knowing the limit
times, the time of the chosen path can be bounded:

t_p ≥ t ≥ t_d    (10)

Finding the number b can be difficult when the desired angle is very large;
nevertheless, attention must be paid to the limit times and to the behavior of the
inequality when n repeats for the desired angle, which means the superior limit time is
near the middle or maximum time of the process. A clear and easy example is
a = 16.2; then:

16.2 ≥ lim_{t_s→b} (2^n/t_s) ≥ 16.2 − b    (11)

Therefore n has to be 5, since 16.2 < 2^5, and b ≤ (2^5/16.2) = 1.9753…, but b can take
the value 2, because the inequality is fulfilled with b = 2; then:

16.2 ≥ lim_{t_s→2} (2^5/t_s) ≥ 14.2    (12)

 
Since b = (t_p/t_d) and n = 5, it can be deduced that the inferior limit time is 50 ms;
then:

2 = t_p/(50 ms) ⟺ t_p = 100 ms    (13)

Therefore the path can be found for n = 5, with 32 possible ways and a time between
50 ms and 100 ms; Fig. 2 shows how the path is built.
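The bounds in Eqs. (8)–(13) can be computed mechanically (a Python sketch; the rule t_d = n × 10 ms and rounding b upward are assumptions inferred from the worked example):

```python
import math

def limit_times(a, pulse_ms=10):
    """Search bounds for a desired angle a: smallest n with a < 2**n,
    b = ceil(2**n / a) so that a >= 2**n / b >= a - b, and the limit times."""
    n = 1
    while 2 ** n <= a:
        n += 1
    b = math.ceil(2 ** n / a)
    t_d = n * pulse_ms        # inferior limit time (ms)
    t_p = b * t_d             # superior limit time, from b = t_p / t_d
    return n, b, t_d, t_p
```

For a = 16.2 this gives n = 5, b = 2, t_d = 50 ms and t_p = 100 ms, as in Eqs. (12)–(13).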

Fig. 2. Algorithm development for a = 16.2; the red paths indicate ways that are not
selected and are discarded.

The result is 1.8(5) + Σ_{n=1}^{2} 1.8n + 1.8(1) = 16.2, in a time of 90 ms.
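The worked example can be checked numerically (a Python sketch; the segment encoding is an assumption for illustration, with the 10 ms pulse width from Sect. 3):

```python
# Path found for a = 16.2: five fixed 1.8-degree pulses, a two-step ramp, one fixed pulse.
def fixed(k):
    """k fixed pulses: (degrees, time in ms)."""
    return 1.8 * k, 10 * k

def ramp(s):
    """s ramp pulses with widths 10, 20, ..., 10*s ms."""
    return (sum(1.8 * n for n in range(1, s + 1)),
            sum(10 * n for n in range(1, s + 1)))

segments = [fixed(5), ramp(2), fixed(1)]
degrees = round(sum(d for d, _ in segments), 1)   # 16.2 degrees
time_ms = sum(t for _, t in segments)             # 90 ms, inside the 50-100 ms bounds
```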

4 Matlab Simulation

Matlab [22] was selected to control the anthropomorphic robotic arm using the
algorithm and the combination of the two discrete functions presented above. A
graphical user interface (GUI) was programmed (see Fig. 3), in which the user enters
the degrees in an edit text box or with a slider and then presses the push button named
"Get Position". Once the positions are calculated by the program, it shows a
representation of a robotic arm with three rotational joints performing the kinematics.
It also shows the number of ways, the superior limit time, the superior limit time
without the waiting time of each pulse, the inferior limit time, and the degrees obtained
with the superior and inferior limit times.

The kinematics is inspired by cubic polynomials: the degrees at the inferior limit time
are calculated with cubic polynomials, taking the inferior limit time to approximate the
degrees that would be obtained if it were chosen (in this case, the desired angle is 45°).
In summary, the program calculates the degrees corresponding to the superior and
inferior limit times with the cubic-polynomial method and compares them (the degrees
of the alpha joint) with the kinematics obtained. In Fig. 3, the cubic-polynomial method
yields 54° with the superior limit time and 41.9126° with the inferior limit time, while
the kinematics using the superior limit time and the algorithm yields 0.7854 radians
(equal to 45°). The ideas and behavior of the algorithm are clearer in simulation, which
gives better context for designing the mechanical structures and how they must work;
the velocity of the movements can also be appreciated and, as a result, the good
functioning of the algorithm is confirmed.
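The cubic-polynomial trajectory mentioned above can be sketched as a standard rest-to-rest cubic (an illustration assuming zero boundary velocities; the exact polynomial used in the paper's GUI is not given):

```python
def cubic_angle(q0, qf, T, t):
    """Rest-to-rest cubic trajectory from q0 to qf over duration T:
    q(t) = q0 + (qf - q0) * (3*(t/T)**2 - 2*(t/T)**3), zero start/end velocity."""
    tau = t / T
    return q0 + (qf - q0) * (3 * tau**2 - 2 * tau**3)

# Reaching 45 degrees in the superior limit time of 100 ms:
mid = cubic_angle(0.0, 45.0, 100.0, 50.0)    # 22.5 degrees halfway
end = cubic_angle(0.0, 45.0, 100.0, 100.0)   # 45.0 degrees at the end
```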

Fig. 3. Matlab simulation of the algorithm for an example trajectory of 45°, with the different
characteristics the interface shows. In the lower-left corner (command window), the final
position q10 = [0.7854, 1.9333, −0.2657] is shown in radians.

4.1 Mechanism Design


This section presents the design of the base, i.e., the α joint. The α motor works every
time the LCD touch screen is touched, and it always returns to the 0° position: it starts
working at 0° and changes direction depending on where the motor has been. The first
part of the mechanism is assembled on the axis of the stepper motor, to move two
vertical axes and generate the trajectory of the first joint. This piece (see Fig. 4(a))
works in two senses, negative and positive, generating a trajectory toward both sides:
the positive sense is a counterclockwise movement and the negative sense is a
clockwise movement. This is how the α motor reaches every position and returns to the
0° position. Figure 4(b) shows the piece that drives another piece holding the two
vertical axes and generating the torque; this piece is moved and kept in position while
the α motor and the first piece move back to the 0° position. Finally, the complete
assembly that makes it possible for the algorithm to perform and reach the position in
the given time is shown in Fig. 4(c).

Fig. 4. Representation of the pieces assembled on the stepper motor: the structure that gives the
first joint its trajectory and movements.

5 Results

The tests and results focus on the Matlab simulation and on the Arduino code using the
LCD touch screen and the stepper motors; they are presented as the desired value of α
together with the results of the algorithm in the Arduino code and in Matlab. The
following results were obtained by touching the LCD screen when the first condition
holds; Table 2 shows the results.

Table 2. The number of pulses that f(s) and f(n) need to approach α in the first condition.

α     | y/x    | n  | f(n) = 1.8n | s | f(s) = Σ_{n=0}^{s} 1.8n
5.71  | 0.1    | 3  | 5.4  | 2 | 5.4
11.3  | 0.2    | 6  | 10.8 | 3 | 10.8
16.69 | 0.3    | 9  | 16.2 | 3 | 10.8
18    | 0.35   | 12 | 21.6 | 4 | 18
21.8  | 0.4    | 12 | 21.6 | 4 | 18
26.56 | 0.5    | 14 | 25.2 | 4 | 18
27    | 0.5095 | 14 | 25.2 | 5 | 27
30.96 | 0.6    | 14 | 25.2 | 5 | 27
34.99 | 0.7    | 17 | 30.6 | 5 | 27
37.8  | 0.7756 | 19 | 34.2 | 6 | 37.8
38.65 | 0.8    | 21 | 37.8 | 6 | 37.8
41.98 | 0.9    | 23 | 41.4 | 6 | 37.8
45    | 1      | 25 | 45   | 6 | 37.8
Simulation of a Robotic Arm Controlled by an LCD Touch Screen 129

The functions that were the best options to program for this condition were combined,
and the new function is:

f(s, n) := Σ_{n=0}^{s} 1.8n  if 0 ≤ y/x ≤ 0.35 with s ∈ {1, 2, 3, 4}, or if 0.51 ≤ y/x ≤ 0.79 with s ∈ {5, 6}
           1.8n              if 0.4 ≤ y/x ≤ 0.5 or if 0.6 ≤ y/x ≤ 0.7
           1.8n              if 0.8 ≤ y/x ≤ 1
    (14)
Table 3 shows the results of the functions in the second condition.

Table 3. The number of pulses that f(s) and f(n) need to approach α in the second condition.

α      | y/x   | n  | f(n) = 1.8n | s | f(s) = Σ_{n=0}^{s} 1.8n
45     | 1     | 25 | 45   | 8 | 64.8
63.434 | 2     | 35 | 63   | 8 | 64.8
75.963 | 4     | 42 | 75.6 | 8 | 64.8
80.53  | 6     | 44 | 79.2 | 8 | 64.8
81     | 6.313 | 45 | 81   | 9 | 81
82.87  | 8     | 46 | 82.8 | 9 | 81
84.298 | 10    | 47 | 84.6 | 9 | 81
87.137 | 20    | 48 | 86.4 | 9 | 81
88.56  | 40    | 49 | 88.2 | 9 | 81
88.85  | 50    | 49 | 88.2 | 9 | 81
89.427 | 100   | 49 | 88.2 | 9 | 81
89.93  | 900   | 49 | 88.2 | 9 | 81

Then the combination of the functions is represented as follows:

f(n, s) := 1.8n              if 1 ≤ y/x ≤ 1.99
           Σ_{n=0}^{s} 1.8n  if 2 ≤ y/x ≤ 2.2 or if 6 ≤ y/x ≤ 8
           1.8n              if 2.3 ≤ y/x ≤ 5.9 or if 8.1 ≤ y/x ≤ 1000
    (15)

The results of the third condition are shown in Table 4:



Table 4. The number of pulses that f(s), f(n) and f(s, n) need to approach α in the third
condition.

α       | y/x     | n  | f(n) = 1.8n | (s, n)   | f(s, n) = f(s) + f(n)
91.145  | −50     | 50 | 90    | (9, 0)   | 81
91.6    | −30     | 51 | 91.8  | (9, 0)   | 81
92.29   | −25     | 51 | 91.8  | (9, 0)   | 81
92.86   | −20     | 52 | 93.6  | (9, 0)   | 81
93.814  | −15     | 52 | 93.6  | (9, 0)   | 81
95.71   | −10     | 53 | 95.4  | (9, 0)   | 81
96.34   | −9      | 54 | 97.2  | (10, 0)  | 99
97.125  | −8      | 54 | 97.2  | (10, 0)  | 99
98.13   | −7      | 55 | 99    | (10, 0)  | 99
99      | −6.313  | 55 | 99    | (10, 0)  | 99
99.462  | −6      | 56 | 100.8 | (10, 1)  | 100.8
100.8   | −5.242  | 56 | 100.8 | (10, 1)  | 100.8
101.309 | −5      | 56 | 100.8 | (10, 1)  | 100.8
102.6   | −4.0473 | 57 | 102.6 | (10, 2)  | 102.6
104.036 | −4      | 58 | 104.4 | (10, 3)  | 104.4
108     | −3.0077 | 60 | 108   | (10, 5)  | 108
108.434 | −3      | 60 | 108   | (10, 5)  | 108
115.2   | −2.125  | 64 | 115.2 | (10, 9)  | 115.2
116.565 | −2      | 65 | 117   | (10, 10) | 117
133.2   | −1.064  | 74 | 133.2 | (10, 19) | 133.2
135     | −1      | 75 | 135   | (10, 20) | 135

Finally, the results of the last condition are shown in Table 5.

Table 5. The number of pulses that f(s), f(n) and f(s, n) need to approach α in the fourth
condition.

y/x     | α = tan⁻¹(y/x) + 180° | n   | f(n) = 1.8n | (s, n)   | f(s, n) = f(s) + f(n)
−0.9999 | 135.01  | 75  | 135   | (10, 20) | 135
−0.99   | 135.28  | 75  | 135   | (10, 20) | 135
−0.9    | 138.012 | 77  | 138.6 | (10, 20) | 135
−0.85   | 139.012 | 78  | 140.4 | (10, 20) | 135
−0.8    | 141.34  | 79  | 142.2 | (10, 20) | 135
−0.75   | 143.036 | 80  | 144   | (10, 20) | 135
−0.7    | 145.007 | 81  | 145.8 | (10, 20) | 135
−0.6    | 149.036 | 83  | 149.4 | (10, 20) | 135
−0.5    | 153.43  | 85  | 153   | (10, 20) | 135
−0.4    | 158.198 | 87  | 156.6 | (10, 20) | 135
−0.3    | 163.3   | 90  | 162   | (10, 20) | 135
−0.2905 | 163.8   | 91  | 163.8 | (13, 0)  | 163.8
−0.2    | 168.69  | 94  | 169.2 | (13, 3)  | 169.2
−0.158  | 171     | 95  | 171   | (13, 4)  | 171
−0.1    | 174.28  | 97  | 174.6 | (13, 6)  | 174.6
−0.031  | 178.2   | 99  | 178.2 | (13, 8)  | 178.2
−1      | 180     | 100 | 180   | (13, 9)  | 180

The function that mixes f(s) and f(n) is very precise and also faster than f(s). In this
case, f(s, n) stays at 135 while y/x is less than −0.2905. Figure 5 shows the results of
each function and the approximation of f(n) and f(s); it can be observed how the
combinations are built and which ones can be taken. The resulting function for the
fourth condition is:

f(s, n) := Σ_{n=0}^{s} 1.8n          if −0.999 ≤ y/x < −0.3
           1.8n                      if −0.29 ≤ y/x ≤ −0.25
           Σ_{n=0}^{s} 1.8n + 1.8n′  if −0.25 < y/x, with s = 13 and 0 < n′ ≤ 9
    (16)

Fig. 5. Representation of the kinematics results for each condition in the Matlab and Arduino
simulations.

6 Conclusions

This paper has presented and simulated an algorithm, introduced throughout the text,
to control a robotic arm with an LCD touch screen. As shown, the desired touch point
located on the two-dimensional plane of the LCD screen can be represented as a robot
kinematic in three-dimensional space and used by physically disabled people. The
functioning of the algorithm was tested, along with the combinations needed to
program a faster and more precise touch control, building all the possible
combinations.
The algorithm works and can be programmed as presented in this paper and, more
importantly, it can help physically disabled people in an intuitive, easy and useful way
with very little effort, since it was proved that the arm can be controlled with a single
finger, which is what we wanted to achieve. In essence, this paper describes the
beginning of a new way to create robot-arm kinematics from a two-dimensional
control and to help physically disabled people. There is much future work to improve
this project: once the algorithm can be expressed as one general equation, it will be
easier to use in different programming languages (Matlab and Arduino were used
here), helping people to have better and faster tools for everyday life.

References
1. Grau, A., Indri, M., Bello, L.L., Sauter, T.: Industrial robotics in factory automation: from
the early stage to the Internet of Things. In: 43rd Annual Conference of the IEEE Industrial
Electronics Society, pp. 6159–6164. IEEE Press, Beijing (2017)
2. Yenorkar, R., Chaskar, U.M.: GUI based pick and place robotic arm for multipurpose
industrial applications. In: Second International Conference on Intelligent Computing and
Control Systems, Madurai, India, pp. 200–203 (2018)
3. Burgner-Kahrs, J., Rucker, D.C., Choset, H.: Continuum robots for medical applications: a
survey. IEEE Trans. Rob. 31(6), 1261–1280 (2015)
4. Murali, A., Sen, S., Kehoe, B., Garg, A., Mcfarland, S., Patil, S., Boyd, W.D., Lim, S.,
Abbeel, P., Goldberg, K.: Learning by observation for surgical subtasks: multilateral cutting
of 3D viscoelastic and 2D Orthotropic Tissue Phantoms. In: IEEE International Conference
on Robotics and Automation, pp. 1202–1209. IEEE Press, Seattle (2015)
5. Shademan, A., Decker, R.S., Opfermann, J.D., Leonard, S., Krieger, A., Kim, P.C.:
Supervised autonomous robotic soft tissue surgery. Sci. Trans. Med. 8(337), 337ra64 (2016)
6. Allen, S.: New prostheses and orthoses step up their game: motorized knees, robotic hands,
and exosuits mark advances in rehabilitation technology. IEEE Pulse 7(3), 6–11 (2016)
7. Niyetkaliyev, A.S., Hussain, S., Ghayesh, M.H., Alici, G.: Review on design and control
aspects of robotic shoulder rehabilitation orthoses. IEEE Trans. Hum. Mach. Sys. 47(6),
1134–1145 (2017)
8. Proietti, T., Crocher, V., Roby-Brami, A., Jarrassé, N.: Upper-limb robotic exoskeletons for
neurorehabilitation: a review on control strategies. IEEE Rev. Biom. Eng. 9, 4–14 (2016)
9. Rehmat, N., Zuo, J., Meng, W., Liu, Q., Xie, S.Q., Liang, H.: Upper limb rehabilitation using
robotic exoskeleton systems: a systematic review. Int. J. Int. Rob. App. 2(3), 283–295 (2018)
10. Young, A.J., Ferris, D.P.: State of the art and future directions for lower limb robotic
exoskeletons. IEEE Trans. Neu. Syst. Reh. Eng. 25(2), 171–182 (2017)
11. Makin, T., de Vignemont, F., Faisal, A.: Neurocognitive barriers to the embodiment of
technology. Nat. Biom. Eng. 1(0014), 1–3 (2017)
12. Beckerle, P., Kõiva, R., Kirchner, E.A., Bekrater-Bodmann, R., Dosen, S., Christ, O.,
Abbink, D.A., Castellini, C., Lenggenhager, B.: Feel-good robotics: requirements on touch
for embodiment in assistive robotics. Front. Neurorobot. 12, 1–84 (2018)
13. Jiang, H., Wachs, J.P., Duerstock, B.S.: Facilitated gesture recognition based interfaces for
people with upper extremity physical impairments. In: Alvarez, L., Mejail, M., Gomez, L.,
Jacobo, J. (eds.) CIARP 2012. LNCS, vol. 7441, pp. 228–235. Springer, Heidelberg (2012)
14. Kruthika, K., Kumar, B.M.K., Lakshminarayanan, S.: Design and development of a robotic
arm. In: IEEE International Conference on Circuits, Controls, Communications and
Computing, pp. 1–4. IEEE Press, Bangalore (2016)
15. Chung, C.S., Wang, H., Cooper, R.A.: Functional assessment and performance evaluation for
assistive robotic manipulators: Literature review. J. Spinal Cord Med. 36(4), 273–289 (2013)
16. Perez-Marcos, D., Chevalley, O., Schmidlin, T., Garipelli, G., Serino, A., Vuadens, P., Tadi,
T., Blanke, O., Millán, J.D.: Increasing upper limb training intensity in chronic stroke using
embodied virtual reality: a pilot study. J. Neu. Eng. Reh. 14(1), 119 (2017)

17. Levin, M.F., Weiss, P.L., Keshner, E.A.: Emergence of virtual reality as a tool for upper
limb rehabilitation: incorporation of motor control and motor learning principles. Phy. Ther.
95(3), 415–425 (2015)
18. Kokkinara, E., Slater, M., López-Moliner, J.: The effects of visuomotor calibration to the
perceived space and body through embodiment in immersive virtual reality. ACM Trans.
Appl. Percept. 13(1), 1–22 (2015)
19. Bovet, S., Debarba, H.G., Herbelin, B., Molla, E., Boulic, R.: The critical role of self-contact
for embodiment in virtual reality. IEEE Trans. Vis. Comput. Graph. 24(4), 1428–1436 (2018)
20. Atre, P., Bhagat, S., Pooniwala, N., Shah, P.: Efficient and feasible gesture controlled robotic
arm. In: IEEE Second International Conference on Intelligent Computing and Control
Systems, pp. 1–6. IEEE Press, Madurai (2018)
21. Badrinath, A.S., Vinay, P.B., Hegde, P.: Computer vision based semi-intuitive Robotic arm.
In: IEEE 2nd International Conference on Advances in Electrical, Electronics, Information,
Communication and Bio-Informatics, pp. 563–567. IEEE Press, Chennai (2016)
22. Matlab & Simulink. https://www.mathworks.com. Accessed 15 Oct 2019
Information Technologies in Education
Performance Indicator Based on Learning
Routes: Second Round

Franklin Chamba1, Susana Arias2, Gustavo Alvarez3,


and Héctor Gómez4(&)
1
Facultad de Ciencias Sociales, Universidad Técnica de Machala,
Machala, Ecuador
[email protected]
2
Facultad de Ciencias de la Salud, Universidad Técnica de Ambato,
Ambato, Ecuador
[email protected]
3
Universidad Regional Autónoma de los Andes, Turismo,
Km 5 1/2 vía a Baños, Ambato, Ecuador
[email protected]
4
Universidad Técnica de Ambato, Ambato, Ecuador
[email protected]

Abstract. Social human behavior is not always clearly announced or defined by
consciousness; this emphasizes the tendency toward self-protection of the self.
Human behavior is manifested in actions and reactions, because it develops within
the family, social and educational framework, where ancestral values are acquired.
This research considers the need to change the mentality of human behavior in
teaching: to transform the teacher's method and thinking pedagogically, using the
face-to-face resources needed to make the class interactive, addressing the
different learning styles that exist within a classroom and thereby covering the
special educational needs that may exist within it. The Promethean ClassFlow
learning platform allows teachers to plan classes more efficiently and to make
classes a more interesting experience for their students.

Keywords: Learning · Routes · Teaching

1 Introduction

Human behavior is manifested in actions and reactions, because it develops within the
family, social and educational framework, where ancestral values, knowledge, skills,
aptitudes and customs are acquired. However, despite technological globalization,
education and learning methods have not advanced significantly; tradition still prevails
even though tools exist to innovate in the teaching process and to avoid poor learning
performance in students (Sorokin and Sotomayor 2016). In the province of El Oro,
given the complexity of teaching practice and pedagogy, innovation and new
approaches are needed to invigorate teaching with resources suited to the scope,
culture and idiosyncrasy of different societies. From this perspective, teaching practice
should develop competences framed in the ability to design permanent and meaningful
learning experiences, in which students are the central point of the teaching-learning
process, through the correct use of ICT and toward a digital culture that faces the new
challenges. At the CEPWOL Altamira Private Educational Unit of the city of Santa
Rosa, El Oro province, teachers of Basic General Education observe a lack of interest
in learning among students, which presents the need to innovate in the teaching-
learning process through learning routes that improve academic performance. Faced
with the difficulties of addressing social human behavior, this research is framed in
learning routes to improve performance, ensuring that the topics of greatest difficulty,
such as dyslexia, dyscalculia and hyperactivity, are taught technologically and achieve
meaningful learning.
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 137–141, 2020.
https://doi.org/10.1007/978-3-030-45697-9_13

2 State of the Art

Lacunza (2015), in her article on social skills and child prosocial behavior from
positive psychology, concludes from her results that "the identification of skills favors
socialization with peers and enables the promotion of healthy social skills and
prosocial behaviors, basic resources for the positive development of the child" (p. 6).
Yépez (2010), in his work on absence and its influence on the learning route of the first
year of basic education at the Alberto Albán Villamarín school in the community of
Noetanda, concludes that children who miss classes mostly do not catch up on the
tasks of the learning route and find it difficult to discuss the topics covered in it (p. 55).
Zambrano-Ríos (2017), in his research report on factors in the socio-affective behavior
of children aged 3 to 4, concludes that family problems are stressors that influence the
child's development, causing complications in sphincter control and in the child's
self-image, aspects that alter the child's behavior socially and affectively and also alter
family well-being (p. 56). Jácome Mayorga (2017), in his thesis on learning routes and
communication skills in children of the fourth and fifth years of basic education,
concludes that the lack of institutional self-management does not allow teachers to be
trained continuously and systematically in innovative learning strategies, so they only
follow the guidelines given by the Ministry of Education; the teachers themselves also
lack interest in innovating, so learning outcomes are not ideal and teachers' profiles are
not competitive at the local or provincial level, because they present many gaps
throughout their schooling (p. 142). Galarsi et al. (2011), in their article on behavior,
history and evolution, conclude that for most of human history man has considered
himself a superior being, completely different from animals; but considering the
contributions of Charles Darwin, who suggested that throughout evolution we have
maintained blood ties with other species, this relationship is admitted today, a century
later, by many thinkers. The information society requires students to learn through the
use of ICTs and, in turn, to be protagonists of their own learning. Since human beings
perform their activities both consciously and unconsciously, the aim is to relate
behavior to knowledge, so that students build their future through a critical paradigm,
because the educational reality seeks solutions to the current problem between learning
and social human behavior, in which values and personality are being lost. The
different studies show that the tools offered by ICTs can help, effectively and
efficiently, to find better learning routes for the construction of knowledge and
objective experiences, transforming students' thinking toward better human behavior
in today's society. Education today faces specific and cultural problems, which
basically refer to the need to use the most modern computer technologies in order to
meet quality standards and promote a digital culture representing the "Information
Age", and this has a decisive impact on the specific objectives of current education.

3 Methodology

ClassFlow is a collaborative (virtual, cloud-based) platform that increases students'
motivation through interactive and collaborative use. It can be used from any device
available to the student, inside or outside the classroom, and hosts the classroom
curriculum in digital form together with the different tasks, activities and assessments
(instant screen, questionnaires, lesson creation, activity creation, assessment creation)
that the teacher wants to use when teaching mathematics or any other subject, whether
at university or secondary level. Generally, students get bored easily with subjects,
but when they manipulate an object that keeps their five senses on what they are
learning they stay engaged; otherwise their behavior can shift to a negative attitude
such as hyperactivity, learning deficit, or performing activities unrelated to academic
learning. ClassFlow therefore supports and facilitates the learning of mathematics and
fosters social, cooperative and collaborative behavior. To begin the class, the teacher
opens the instant screen, which automatically displays an access code for students to
enter; the menu bar with the class design tools is displayed at the bottom. During the
activity there was an improvement in social human behavior: all students were aware
of what they were learning and shared the ongoing activity with each other. During the
verification process, the students noticed great improvement in their learning and,
above all, adopted a distinctly different attitude from the one they had at the beginning
under traditionalist education. ClassFlow thus influences the learning process, allowing
the student to focus on the academic activities of the mathematics course.

4 Experimentation

Research is part of human behavior in general, and knowledge has therefore been defined as a process in which a cognitive subject (the one who knows) relates to an object of knowledge (that which is known), resulting in a new mental product called knowledge. Thus the same term designates both the process and its result; that is, we call knowledge both the subjective operation that produces it and the product itself (Rodríguez 2011). This affects social human behavior in education as a performance indicator based on learning routes for students of Basic General
140 F. Chamba et al.

Education of the CEPWOL ALTAMIRA Private Education Unit in the city of Santa Rosa, province of El Oro. The chosen modality allows an approach to the study problem together with the actors of the educational community, through the collection of knowledge, experiences and information that parents, teachers and authorities have about the communication strategies executed, evaluated through quantified indicators and analyzed with the participation of the study group (see Fig. 1). The research was conducted on a sample of subjects representative of a larger population, carried out in the context of daily life using standardized interrogation procedures, in order to obtain quantitative measurements of a wide variety of objective and subjective characteristics of the population. The survey was aimed at students, authorities and teachers of the CEPWOL Altamira Private Educational Unit; its instrument was a questionnaire prepared with closed questions that capture the study variable. The questionnaire gathered information with open and closed questions established beforehand, always posed in the same order, formulated in advance and strictly standardized. The research instruments were subjected to validity and reliability criteria. Validity was established through the "expert judgment" technique, while reliability was assessed by applying a pilot test to a group of students and teachers with characteristics similar to the established sample, which made it possible to detect errors in the understanding of the questions and in the selection of answers, and to correct them before the final application. The internal consistency method based on Cronbach's alpha was used to estimate the reliability of the measuring instrument from a set of items measuring the same theoretical dimension, applying the SPSS software.
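The internal consistency computation can be illustrated outside SPSS as well. The following sketch computes Cronbach's alpha from a respondent-by-item score matrix; the pilot responses shown are hypothetical, not the study's data.

```python
# Minimal sketch of Cronbach's alpha for a set of questionnaire items.
# The pilot matrix below is hypothetical illustration data; the paper's
# own pilot test was analyzed in SPSS.

def cronbach_alpha(items):
    """items: list of respondents, each a list of item scores."""
    k = len(items[0])  # number of items
    def var(xs):       # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    item_vars = [var([r[j] for r in items]) for j in range(k)]
    total_var = var([sum(r) for r in items])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# hypothetical pilot responses: 5 respondents x 4 items
pilot = [
    [4, 4, 3, 4],
    [3, 3, 3, 2],
    [5, 4, 4, 5],
    [2, 2, 3, 2],
    [4, 5, 4, 4],
]
print(round(cronbach_alpha(pilot), 3))  # prints 0.916
```

Values above roughly 0.7 are conventionally read as acceptable internal consistency, which is the kind of threshold a pilot test like the one described would be checked against.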

Final Test (percent)

Response     Q1    Q2    Q3    Q4    Q5    Q6    Q7    Q8    Q9    Q10   Percent
Low (1)      2.4   3.3   0     4.1   4.9   0     0     9.8   13.8  0     100
Nothing (2)  0.8   0     0     0     0     0     4.1   0.8   0     0     100
Normal (3)   15.9  7.3   18.7  13    17.1  6.5   7.3   14.6  13    0     100
High (4)     80.5  89.4  81.7  82.9  78    93.5  88.6  74.8  73.2  100   100

Fig. 1. Histogram 1 test results


Performance Indicator Based on Learning Routes: Second Round 141

The verification survey (histogram) shows that significant scores favor the use of ClassFlow: as a result of its application, 80.5% report changes that favor learning, 89.4% report a positive change in students' social human behavior, and 81.70% indicate that the exchange of skills and strengths improves students' academic performance in a social, academic and cultural way.

5 Conclusions

Significant scores favor the use of ClassFlow: as a result of its application, 80.5% report changes that favor learning, 89.4% report a positive change in students' social human behavior, and 81.70% indicate that the exchange of skills and strengths improves students' academic performance in a social, academic and cultural way. 82.90% consider that the ClassFlow tool contributes to students' learning competencies, 93.50% that it contributes to a positive improvement of social human behavior in students, and 88.60% of students and teachers consider that the use of this resource helps them learn.

References
Lacunza, A.B.: Las habilidades sociales y el comportamiento prosocial infantil desde la psicología positiva. Pequén 12, 20 (2015). Retrieved 26 December 2017, from http://revistas.ubiobio.cl/index.php/RP/article/view/1831
Zambrano-Ríos, M.F.: Factores en el comportamiento socioafectivo de los niños de 3 a 4 años. Informe de Investigación, Universidad Técnica de Ambato, Facultad de Ciencias Humanas y de la Educación, Ambato (2017). Retrieved 26 December 2017, from http://repositorio.uta.edu.ec/jspui/handle/123456789/25880
Jácome Mayorga, M.E.: Las rutas de aprendizaje y las habilidades de comunicación en los niños de cuarto y quinto año de educación básica. Tesis, Universidad Técnica de Ambato, Facultad de Ciencias Humanas y de la Educación - Psicología Educativa, Ambato (2017). Retrieved 29 December 2017, from http://repositorio.uta.edu.ec/jspui/handle/123456789/26062
Galarsi, M.F., Medina, A., Ledezma, C.: Comportamiento, historia y evolución. Redalyc (24), 36 (2011). Retrieved 26 November 2017, from http://www.redalyc.org/pdf/184/18426920003.pdf
Rodríguez, J.M.: Métodos de investigación cualitativa. Revista de Investigación Silogismo 1(08), 108 (2011)
Yépez, A.A.: La inasistencia y su influencia en la ruta del aprendizaje del primer año de educación básica en la escuela Alberto Albán Villamarín de la comunidad de Noetanda. Tesis, Universidad Técnica de Ambato, Facultad de Ciencias Humanas y de la Educación, Ambato (2010). Retrieved 21 December 2017, from http://repositorio.uta.edu.ec/handle/123456789/3875
Sorokin, P., Sotomayor, M.A.: Ciencias Sociales, Humanas y de Comportamiento: dificultades regulatorias en países latinoamericanos. Revista de la Facultad de Ciencias Humanas de la Universidad Autónoma de Colombia 14(1), 23 (2016)
Evaluating the Acceptance
of Blended-Learning Tools: A Case Study
Using SlideWiki Presentation Rooms

Anne Martin¹, Bianca Bergande²(B), and Roy Meissner¹

¹ Faculty for Education, Leipzig University, Leipzig, Germany
{anne.martin,roy.meissner}@uni-leipzig.de
² Faculty of Information Management, Neu-Ulm University of Applied Sciences, Neu-Ulm, Germany
[email protected]

Abstract. As e-learning formats become increasingly important, technological assistance becomes a critical factor for their implementation. The tool SlideWiki Presentation Rooms tries to combine the needs and functions of live interaction with the advantages of e-learning. To evaluate the acceptance of the tool, a teaching experiment was conducted at a University of Applied Sciences. The results show that the actual usage of a new blended-learning tool is determined more by the perceived usefulness than by the ease of use, and they suggest further factors influencing students' acceptance of SlideWiki. They provide valuable insights for the successful implementation of blended-learning tools in higher education and need to be further examined in future studies.

Keywords: Higher education · Blended-learning · Teaching experiment · SlideWiki Presentation Rooms · Case study · Technology acceptance · E-learning

1 Introduction

As e-learning environments and blended-learning formats become more and more prominent, technological assistance and its usability have been shown to be an increasingly critical factor for their implementation. The effect of blended-learning scenarios on students has been researched in terms of their involvement, self-discipline and overall perceived effects on their learning experience regarding risks and benefits [1]. These include the ability to socialize, to directly interact with the lecturer, to operate technologies and to handle the tasks of self-directed learning [2]. One blended-learning tool in development is SlideWiki, including SlideWiki Presentation Rooms [3,4]. It tries to close the gap between online-only and presence-only teaching by combining the needs and functions of live interaction with the advantages of e-learning, such as self-directed learning and

c The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 142–151, 2020.
https://doi.org/10.1007/978-3-030-45697-9_14
chatting, as well as adaptation to the learner's individual needs [3]. These functions and possibilities aim to improve the perceived quality of the learning experience of its users [3]. However, the actual effect on the learning experience of its users has not been tested so far. To evaluate the acceptance and usage of the tool, a teaching experiment was conducted at a University of Applied Sciences.
The paper is structured as follows: in Sect. 2, the tool and the theoretical context are presented. These are used to explain the methods and data collection in Sect. 3, followed by the results in Sect. 4. The study concludes with a summary in Sect. 5 and showcases further potential for research addressing this topic.

2 Related Work

First, SlideWiki Presentation Rooms, which was used as a basis for this study, is described. Afterwards, possible assumptions on a blended-learning setting are reduced to a fitting set. Lastly, the technological facts and theoretical assumptions are combined to derive the main research question.

2.1 SlideWiki & SlideWiki Presentation Rooms

SlideWiki is a free, publicly accessible, open-source courseware authoring platform for the collaborative development of public and reusable slide-based learning content (OER) [5]. The platform is based on a crowdsourcing approach (cf. Wikipedia, OpenStreetMap, GitHub) and can be classified as a Social Learning Environment (SLE). Alongside providing the support and feedback mechanisms of traditional learning environments, it also offers the possibility to collaborate and communicate with teachers and learners, establishing a simultaneous network [6]. SlideWiki provides various components that allow social interaction with other participants, like a comment section and an embedded Twitter option, as well as interaction with content based on one's own knowledge and learning interests [4].
One aspect of SlideWiki is a function called Presentation Rooms (PR). In the structural sense, PR can be classified as a webinar solution and was first presented by Meissner et al. [3]. It allows interactive participation of an audience in blended-learning or e-learning environments. In PR, participants are able to directly interact with the displayed slides, e.g. by changing slides individually (independent of the lecturer) and by interacting with slide content, e.g. by operating an included third-party application. Furthermore, it allows participants to choose a preferred language for the displayed slides and to receive a live transcript of the lecturer's voice [3]. These are novel features for both lecturers and participants, as they allow interactive and individual ways of working with the slide-based material. Meissner et al. introduced the possibilities of PR comprehensively in their publication [3], which is why the following explanation focuses on a possible categorization of the application.
2.2 Assumptions on Knowledge, Learning and Learners in Blended-Learning Settings

In addition to the categorization in Subsect. 2.1, PR is construed as a digital application suitable for the blended-learning format, because its interactive character meets learners' demands for self-direction. This requires digital competence in dealing with the numerous technical possibilities of e-learning solutions used for the creation, preparation and sharing of knowledge content, which is seen as an important feature in the learning theory of connectivism [7]. Connectivism can be understood as a theory of learning in the digital age. It was introduced by Siemens and stems from the changes that digital technologies bring to learning with regard to diverse learning paths along the life span, diverse learning objects, the growing importance of informal learning, informal knowledge management, and much more [8]. In contrast to classical learning theories such as behaviorism, cognitivism and constructivism, connectivism refers to open and decentralized learning opportunities and thus also covers the area of informal learning. The main focus of this theory is the learner, who controls the learning process, understood as a networking process [2]. SlideWiki PR can be examined through the theory of connectivism, as the functions of PR support the same understanding of the learning process and of the role of learners in a blended-learning setting, namely:

– Learners are seen as individuals who actively produce knowledge in learning. This leads to a complementary change in the role of the teacher from a knowledge mediator towards a learning companion [9].
– Learning outcomes become visible and comprehensible in the form of concrete knowledge contributions in forums, blogs, etc., which promotes discussion of the content and creates a higher level of commitment.
– The active role of learners according to their own learning needs is promoted under the paradigm of active and self-determined learning [8].

Thus, the focus is on the individual with his or her learning needs and learning activities, which entails a constructivist understanding of learning [7]. Yet a purely constructivist perspective on learning with digital media is no longer sufficient. Learning is no longer viewed from the perspective of information processing (as in the paradigm of cognitivism), but is shaped by the dominance of the learning object in virtual space, with the learner networking with other learners (and also teachers) in a self-controlled way, adapting to his or her own learning needs while remaining connected with other users. The tool's added values are therefore identified through connectivist theory. The question is whether the benefit of these functions is also perceived by the users, who are likely used to other instruction methods and tools such as Moodle, seminars and lectures. How users accept SlideWiki PR as a teaching environment must therefore be explored.
2.3 How SlideWiki Presentation Rooms Is Perceived and Accepted by the Target Group

The basis for the success of a new technology and its use is its acceptance [10], which makes acceptance an appropriate criterion for the evaluation of PR as a new technology. As PR is a webinar solution, it enables large numbers of people to access content simultaneously and collaboratively, using innovative approaches that address several advantages and demands of learners and teachers. A proven model that defines factors influencing the acceptance of new technologies is Davis's technology acceptance model (TAM), which has been extensively tested empirically and is widely approved for measuring the acceptance of a new technology [11,12]. The development of this model was motivated by the lack of valid predictors of the factors influencing technology acceptance [13]. In its first form, the model identified two factors that allow a prediction of the actual use of new technologies: Perceived Usefulness (PU) and Perceived Ease of Use (PEOU). PU refers to the probability that the use of the new technology increases the added value for performance in the occupational field; PEOU refers to how easy the technology under assessment is to use [14]. In the TAM, the two factors are recognized as a cognitive reaction to the use of the technology and lead to an increased acceptance in attitude (consisting of the attitude towards use and the intention to use) and ultimately to actual use. This means that people who have a positive attitude towards the use of a technology are highly likely to actually use it [14]. Since the probability of acceptance and usage of SlideWiki PR is a central concern of this study, the TAM scale was chosen as an appropriate instrument to examine the following research question (RQ): How do users accept SlideWiki Presentation Rooms, based on the perceived usefulness and the perceived ease of use?

3 Methodology

A teaching experiment has been conducted in order to evaluate and assess the
use of SlideWiki PR.

3.1 Sample

The experiment was carried out in a 90-min lecture about artificial intelligence as part of a business information systems class held in late June 2019. A group of fifth-semester undergraduate students in a degree program featuring both economics and computer science elements was chosen as a fitting sample for this experiment: as fifth-semester students who are mostly German natives, they speak sufficient English, they might be interested in a digitized teaching environment including translation, and they are all already familiar with other tools such as Moodle.
3.2 Execution

The initially intended bring-your-own-device (BYOD) approach was abandoned in favor of a controlled environment in an official computer pool at the university with one distinct browser. The goal of these measures was to maximize comparability, minimize possible technical issues and focus on the different functions of the tool in use, even though the tool supports the BYOD approach in general [3]. A week before the experiment was conducted, the students were informed about the experiment and the use of SlideWiki PR via Moodle¹. They received a short introduction document that illustrated how to enter the Presentation Rooms session, choose a preferred language, and use the chat. Furthermore, a brief oral introduction to the experiment and the tool was given at the beginning of the lecture, consisting of the same information as the introduction document. These measures were taken to make sure all students had the same knowledge about the tool and were able to use and evaluate it correctly. The questionnaire featured an approved translation of the TAM from a German textbook [15] and contained the following statements, rated on a 7-point Likert scale, plus an additional open question (O1) [16]:

– I1. I would definitely continue to use SlideWiki if offered
– I2. I think others should also use SlideWiki
– I3. I plan to increase my use of SlideWiki over the coming year
– A1. I think it was a good idea to use SlideWiki for this task
– A2. It would be a lot better for me to use SlideWiki instead of manual methods
– A3. To use SlideWiki is a good idea
– P1. The use of SlideWiki allows me to complete tasks faster
– P2. The use of SlideWiki improved my productivity in completing tasks
– P3. The use of SlideWiki can increase my productivity in learning
– E1. My interaction with SlideWiki is clear and understandable
– E2. I find SlideWiki easy to use
– E3. It is easy to learn how to operate SlideWiki
– O1. What did you like best about SlideWiki? Please give a full sentence and explain your choice briefly! (Open question)
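The item grouping above (I, A, P, E) can be illustrated with a short sketch that aggregates one respondent's 7-point ratings into per-construct scores. The item IDs follow the questionnaire; the response values are hypothetical, not data from the study.

```python
# Sketch: averaging 7-point Likert items into the four TAM constructs.
# The respondent's ratings below are invented illustration values.

ITEM_GROUPS = {
    "intention":  ["I1", "I2", "I3"],
    "attitude":   ["A1", "A2", "A3"],
    "usefulness": ["P1", "P2", "P3"],
    "ease":       ["E1", "E2", "E3"],
}

def construct_scores(response):
    """response: dict mapping item id -> score on the 1..7 scale."""
    return {name: sum(response[i] for i in items) / len(items)
            for name, items in ITEM_GROUPS.items()}

# one hypothetical respondent
r = {"I1": 4, "I2": 4, "I3": 3, "A1": 4, "A2": 4, "A3": 5,
     "P1": 3, "P2": 3, "P3": 3, "E1": 6, "E2": 6, "E3": 6}
print(construct_scores(r))
# e.g. "ease" averages to 6.0 and "usefulness" to 3.0 for this respondent
```

Averaging items per construct in this way mirrors the pattern visible in the results below, where ease-of-use items score markedly higher than usefulness items.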

A factor analysis was completed for the original questionnaire, and reliability measures and a validity evaluation were determined, which form the basis for the reliability of these results [16]. The questionnaires were handed out after the lecture, collected and scanned using the EvaSys evaluation system².

3.3 Analytical Methods

As the number of participants was 41, the mean value was chosen as the central criterion, with the median and the standard deviation taken into account as measures of the tendency towards the mean. Tendencies towards positive ("I totally agree") and negative ("I fully disagree") statements were detected by summing the percentages in two, or sometimes three, distinctive subcategories, with the median as a critical limit [17]. The conclusions derived from these figures were supplemented by the information provided by the students in the open question section, which was adopted uncoded.

¹ Moodle is a Learning Management System, see: https://moodle.de/.
² EvaSys evaluation system: https://en.evasys.de/main/add-ons-services/system-security-performance.html.
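As an illustration of these analytical steps, the sketch below computes mean, median, standard deviation, and one possible agree/disagree split (treating 4 as the neutral midpoint of the 7-point scale) for a set of hypothetical item scores; it is not a reproduction of the study's raw data.

```python
# Sketch of the descriptive analysis: mean, median, standard deviation,
# and the percentage agreeing (> 4) vs disagreeing (< 4) on a 7-point
# scale with 4 as the neutral midpoint. Scores are hypothetical.
import statistics

def summarize(scores):
    agree = sum(s > 4 for s in scores) / len(scores) * 100
    disagree = sum(s < 4 for s in scores) / len(scores) * 100
    return {
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "stdev": statistics.stdev(scores),   # sample standard deviation
        "agree_pct": agree,
        "disagree_pct": disagree,
    }

scores = [6, 5, 7, 4, 6, 2, 5, 6, 3, 6]
print(summarize(scores))
# mean 5.0, median 5.5, 70% agree, 20% disagree for these values
```

Note that this split leaves the neutral answers (exactly 4) out of both percentages; how neutral responses are folded into "agree" or "disagree" is exactly the kind of decision behind the two- or three-subcategory summation described above.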

4 Results
A total of 41 participants answered the questions from Subsect. 3.2. A sample larger than 30 participants is usually recognized as leading to representative results that can be transferred to a more general population [18]. In general, the students answered in favour of the tool and liked the idea of the independent slide use as well as the chat and the simple usage. Despite an announcement, a written introduction document and a brief oral introduction before the experiment, the intended benefit of the tool was not clear to all of the participants, perhaps because it interfered with their habitual workflow, as the open answer section indicates. Table 1 provides an aggregated overview of the collected results.
Perceived Usefulness: The vast majority disagrees with the statement that the tool allows them to complete tasks faster. The students are uncertain about the improvement of their productivity in task completion, with 30,8% who slightly disagree and 30,8% who are uncertain. However, the slightly negative tendency is confirmed in the following question about the perceived increase in productivity in learning, with 47,5% who disagree, 40% who are neutral and only 12,5% who agree.
Ease of Use: The perceived ease of use is the most positive aspect of this evaluation. A majority of 72,5% of the students found the interaction with the tool easy and understandable, 80% agree that SlideWiki is easy to use (50% even fully agree), and 87,5% of the students found that the use of SlideWiki is easy to learn.
Attitude Towards Use: Question A1 divides the sample into two camps, with a small majority agreeing that the use of SlideWiki for this task was a good idea. When it comes to the preferred mode of working, the students would clearly prefer manual methods over SlideWiki. Yet the students think that the general usage of the tool is a good idea, as 67,50% agree.
Intention to Use: The general acceptance of the tool is slightly positive, with a concentration of agreement to use the tool in the future if offered. When asked whether others should use SlideWiki as well, most participants neither agreed nor disagreed, which can be interpreted as a rather neutral position. The third question of the item set provides a clearer picture: the students do not feel that they would increase their usage in the future.
Table 1. The statements represent the learners' position on the questionnaire and are numbered by their item group, e.g. I = intention to use; A = attitude towards use; P = perceived usefulness; E = ease of use. The results of the evaluation are based on a 7-point Likert scale, where 7 corresponds to "I totally agree" and 1 corresponds to "I fully disagree".

Statement                                             Agree (%)  Disagree (%)   Average  Median  Deviation
I1. Continue to use it                                53,9       46,1           4,3      4       2,1
I2. Others should also use it                         56,4       43,5           4,1      4       1,9
I3. Plan to increase the use in the coming year       20,5       79,5           3,5      3       1,7
A1. A good idea to use the tool for this task         57,5       42,5           4,3      4       2,1
A2. Better to use it instead of manual methods        32,5       67,5           3,6      4       1,9
A3. To use it is a good idea                          67,5       32,5           4,4      4       2
P1. The use allows the user to complete tasks faster  22,5       77,5           3,3      3       1,6
P2. It improves productivity in completing tasks      12,5       47,5 or 87,5*  3,4      4       1,5
P3. The use can increase productivity in learning     12,5       87,5           3,4      4       1,5
E1. The interaction is clear and understandable       85         15             5,2      6       1,7
E2. The tool is easy to use                           80         20             5,8      6,5     1,7
E3. It is easy to learn how to operate the tool       87         13             5,6      6       1,7

Open Question: Since the researchers had their own idea about the possible benefit of the tool, the question what students liked best about SlideWiki Presentation Rooms was asked to capture their personal opinion on this subject matter. While only 18 out of 41 students took the time to reply, their answers reveal an interesting perspective that is in line with the rest of the results. The ease of use, the opportunity to ask questions in the chat, and the synchronized flow of presentation slides were each mentioned positively twice. An interesting point of critique was that students were frustrated that they could not take notes directly in the material, which probably stems from their offline and PDF workflow. This indicates that the students expected a personal and persistent environment, which the tool currently does not provide. Three participants mentioned that they were not satisfied that they could not use their own device, which was a result of the individual network restrictions at the university where this experiment was conducted (see Sect. 4.1). An unexpected result is the detected uncertainty about the actual benefit of the tool.
While one participant understood the value especially for online lectures, five had trouble seeing the overall added value of the tool. These statements provide an indication of the mixed results above, where students tended to have an either neutral or slightly negative attitude towards SlideWiki.

4.1 Limitations
SlideWiki PR was used once with a sample of 41 students in one lecture. Part of this sample probably did not read the introduction to the tool, as shown by their doubt about the purpose of the tool overall, so the reliability of their statements and general judgment is disputable. Furthermore, the 7-point Likert scale allows a tendency towards central scores, which can be seen in the results and increases the difficulty of drawing valid conclusions from the data [17]. Combined, these factors weaken the overall validity of the study. The presented data should therefore be interpreted as a single case study with limited significance. Nonetheless, it provides an insight into students' perception of and adaptation to e-learning environments. Another aspect that needs to be considered critically is the abandonment of the originally intended bring-your-own-device (BYOD) approach after a pretest. This was decided because of technical issues caused by the restrictively adjusted wireless network. Thus, the environment was replaced by a controlled one in an official computer pool at the university. Students were asked to use a certain browser for maximum comparability, even though the tool supports the BYOD approach and various browsers in general [3]. These adjustments represent a difference from the intended use and must thus be regarded as a limitation of this study.

5 Summary and Future Work


A teaching experiment was conducted at a university of applied sciences with a sample of 41 students. After the experiment, the participants answered a questionnaire asking for perceived usefulness and ease of use, as well as their opinion, in order to derive the acceptance of the examined tool. The open question provided insights into the students' opinions about the benefit of SlideWiki PR and emphasized the perceived usefulness. The students appreciated the advantages associated with PR, like the possibility of asking questions, and also the simplicity of using the application, which corresponds to the perceived ease of use. Furthermore, they attributed potential to the tool, but also viewed it critically; for example, they voiced the desire to take personal notes directly on the slides. Thus, the main research question must be answered as follows: the ease of use is regarded as significant, but the added value for one's own learning process is currently not recognized. The results imply that a more guided introduction to the tool, its intended use and its benefits is important for its successful integration into the learning routine and its acceptance by the students as self-directed learners in a networked learning process.
5.1 Future Work


A higher number of participants should be used in future studies to make the results more reliable and robust to deviations. The results also show uncertainty about the benefit of the tool, which might be decreased by a step-by-step introductory class. It would also be interesting to use the tool for a whole term and evaluate it at the beginning and at the end, or to compare such results to those of the single-time use presented in this study. Another field for future exploration are questions about possible differences between user groups such as students, professionals, different genders, or ages, to further examine their needs and interests. The BYOD approach should also be researched in future studies: the tool's compatibility with different networks and safety configurations, which prevented the initially planned BYOD approach, is a major limitation and might have affected the results negatively. Finally, a note-taking feature, e.g. implemented as a personal recording of the session in a reusable format, might be promising, as students mentioned that they would like to take notes and SlideWiki focuses on higher education.

Acknowledgements. This work was supported by the European Union’s Horizon


2020 program No 688095; the German Federal Ministry of Education and Research
for the EVELIN project under grant No 01PL17022E; the German Federal Ministry
of Education and Research for the tech4comp project under grant No 16DHB2102.
The authors thank Professor Philipp Brune for establishing their connection and for his support of this study by including the tool in one of his lectures in summer term 2019. The authors are responsible for the content of this publication.

Author Statement. All authors contributed to this work equally.

Data Deposition & Supplemental Online Material. SlideWiki itself is open-source and the source code is hosted at https://github.com/slidewiki/. SlideWiki is officially hosted at https://slidewiki.org and includes the same feature set as the institutional instance used in this study.

References
1. López-Pérez, M.V., Pérez-López, M.C., Rodríguez-Ariza, L.: Blended learning in
higher education: students’ perceptions and their relation to outcomes. Comput.
Educ. 56(3), 818–826 (2011)
2. Drummer, J., Hambach, S., Kienle, A., Lucke, U., Martens, A., Müller, W., Rens-
ing, C., Schroeder, U., Schwill, A., Spannagel, C., et al.: Forschungsherausforderung
des E-Learnings. In: Rohland, H., Kienle, A., Friedrich, S. (eds.) DeLFI 2011 - Die
9. e-Learning Fachtagung Informatik, pp. 197–208. Gesellschaft für Informatik e.V,
Bonn (2011)
3. Meissner, R., Junghanns, K., Martin, M.: A decentralized and remote controlled
webinar approach, utilizing client-side capabilities: to increase participant limits
and reduce operating costs. In: Proceedings of the 14th International Conference
on Web Information Systems and Technologies - Volume 1: WEBIST, pp. 153–160.
INSTICC, SciTePress (2018)
4. Elias, M., James, A., Ruckhaus, E., Suarez-Figueroa, M.C., de Graaf, K.A., Khalili,
A., Wulff, B.M., Lohmann, S., Auer, S.: SlideWiki - towards a collaborative and
accessible slide presentations. In: EC–TEL 2018, 13th European Conference on
Technology Enhanced Learning. Practitioner Proceedings, Leeds, UK. Fraunhofer
(2018)
5. Mikroyannidis, A.: Collaborative authoring of open courseware with slidewiki: a
case study in open education. In: EDULEARN 2018 Proceedings, vol. 1, pp. 2000–
2007 (2018)
6. Raspopovic, M., Cvetanovic, S., Medan, I., Ljubojevic, D.: The effects of integrat-
ing social learning environment with online learning. Int. Rev. Res. Open Distrib.
Learn. 18, 141–160 (2017)
7. Kergel, D., Heidkamp, B.: Digitalisierung der Lehre – Chancen für eBologna, pp.
145–160. Springer Fachmedien Wiesbaden, Wiesbaden (2018)
8. Siemens, G.: Connectivism: a learning theory for the digital age. Int. J. Instr.
Technol. Distance Learn. 2(1), 3–10 (2005)
9. Dräger, J., Müller-Eiselt, R.: Die digitale Bildungsrevolution - Der radikale Wandel
des Lernens und wie wir ihn gestalten können, 4th edn. DVA, München (2015)
10. Prilla, M., Nolte, A.: Fostering self-direction in participatory process design. In:
Proceedings of the 11th Biennial Participatory Design Conference, PDC 2010, pp.
227–230. ACM, New York (2010)
11. King, W.R., He, J.: A meta-analysis of the technology acceptance model. Inf.
Manag. 43(6), 740–755 (2006)
12. Schepers, J., Wetzels, M.: A meta-analysis of the technology acceptance model:
investigating subjective norm and moderation effects. Inf. Manag. 44, 90–103
(2007)
13. Birken, T.: IT-basierte Innovation als Implementationsproblem. Evolution und
Grenzen des Technikakzeptanzmodell-Paradigmas, alternative Forschungsansätze
und Anknüpfungspunkte für eine praxistheoretische Perspektive auf Innovation-
sprozesse. ISF München (2014)
14. Davis, F.D., Bagozzi, R.P., Warshaw, P.R.: User acceptance of computer technol-
ogy: a comparison of two theoretical models. Manag. Sci. 35(8), 982–1003 (1989)
15. Wilhelm, D.B.: Pre-Test eines Modells zur Erklärung der C.9 Nutzerakzeptanz von
web-basierten “sozialen” Unternehmensanwendungen. In: GeNeMe 2009 Gemein-
schaften in Neuen Medien, TU Dresden, 01./02.10.2009, Virtuelle Organisation
und Neue Medien 2009, pp. 203–214. TU Dresden (2009)
16. Venkatesh, V., Davis, F.: A theoretical extension of the technology acceptance
model: four longitudinal field studies. Manag. Sci. 46, 186–204 (2000)
17. Nadler, J.T., Weston, R., Voyles, E.C.: Stuck in the middle: the use and interpre-
tation of mid-points in items on questionnaires. J. Gen. Psychol. 142(2), 71–89
(2015)
18. Schöneck, N.M., Voß, W.: Das Forschungsprojekt: Planung, Durchführung und
Auswertung einer quantitativen Studie. Springer, Wiesbaden (2015)
Adaptivity: A Continual Adaptive Online
Knowledge Assessment System

Miran Zlatović and Igor Balaban(&)

Faculty of Organization and Informatics, University of Zagreb, Varaždin, Croatia


{miran.zlatovic,igor.balaban}@foi.hr

Abstract. The main goal of this paper is to provide insight into the implementation of a model for continual adaptive online knowledge assessment through Adaptivity, a web-based application. Adaptivity enables a continual and cumulative knowledge assessment process, which comprises a sequence of at least two (but preferably more) interconnected tests carried out over a reasonably long period of time (e.g. one semester). It also provides personalized post-assessment feedback, based on each student’s current results, to guide each student in preparing for the upcoming tests. In this paper, we describe the adaptation model, present the design of Adaptivity, and report the results of testing the proposed model.

Keywords: Online knowledge assessment · Adaptive knowledge assessment · Continual knowledge assessment · Improving classroom teaching · Post-secondary education

1 Introduction

Adaptive online education is well represented in current scientific and professional research. The main guiding principle of most research in this area is to provide a complete e-learning system which is (i) capable of selecting and providing the appropriate learning content for each individual, in order to (ii) improve the effects of each individual’s education (see for example [1–3]).
When analyzing research in the scope of adaptive online knowledge assessment, it can be noted that most efforts focus on studying various aspects of adaptivity within a single knowledge assessment, usually within self-assessment and/or formative assessment [4–8]. However, as suggested by the Bologna Process, teachers should be able to continuously monitor and evaluate students’ progress and to adapt their teaching to the needs of different groups of students accordingly.
With respect to these requirements, this paper does not consider modelling an entire adaptive learning system but focuses only on the model and an example of an adaptive online knowledge assessment system (hereafter referred to as Adaptivity). Adaptivity is designed to guide the individual towards continuous improvement in the achievement of learning goals, by announcing and applying types of assessment appropriate for the learning content being assessed.

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 152–161, 2020.
https://doi.org/10.1007/978-3-030-45697-9_15
Adaptivity: A Continual Adaptive Online Knowledge Assessment System 153

2 Research Background

The literature review revealed far fewer examples of research related to adaptive knowledge assessment that spans a series of assessments (i.e. continuous/continual adaptive knowledge assessment). Raman and Nedungadi [9] describe continuous formative evaluation within the Amrita Learning ALS, where multiple assessments are carried out in an adaptive way; but since each assessment covers different learning topics, it cannot be classified as fully continual adaptive assessment, where at least a portion of a subsequent assessment adaptively depends upon the results of the previous assessment(s).
Grundspenkis and Anohina [10] describe an adaptive learning and assessment
system where concept maps are used as a more machine-friendly replacement for
essays. Course contents are introduced gradually in time, through multiple stages.
Adaptive knowledge assessments take place between stages, but although these
assessments encompass contents from all available stages (similarity with our
approach), the adaptivity is still limited to a single assessment. Also, there is no
evidence that an assessment that takes place during later stages takes into consideration
the results from the assessment conducted during earlier stages.
There are also examples of adaptive and continuous assessment within commercial e-learning platforms – e.g. Khan Academy. Hu [11] describes the Academy’s approaches to assessing students’ mastery of a topic. They use a different proficiency model, which applies logistic regression to select the next task based on the user’s previously solved tasks and current proficiency level. Despite being adaptive over a series of assessments (tasks), these approaches also do not systematically include parts of previous topics during subsequent assessments.
In the broader field of ALSs and intelligent tutoring systems (ITSs), there is a common distinction between micro- and macro-adaptation. VanLehn [12] suggests that, at the ITS level, macro-adaptation focuses on global task selection, while micro-adaptation is concerned with in-task interactions. Within an ITS, knowledge assessment is usually placed on the micro-level and its results are used to update learner (user) models, upon which the rest of the macro-adaptation is derived [13, 14].
Considering the above-mentioned findings, continual knowledge assessment systems that include elements of adaptivity within a series of connected assessments have not yet been sufficiently investigated. This research therefore aims to provide additional insights into such systems.

3 Adaptivity: Model and Design

Although the main aim of this paper is to propose a design for an online knowledge assessment system, we first give a short overview of the underlying model (the Adaptivity Model) on which the online system was built.
154 M. Zlatović and I. Balaban

3.1 Adaptivity Model


The basic layout of the proposed Adaptivity model is shown in Fig. 1. The cognitive level is a label assigned to a learning goal, according to Bloom’s Taxonomy [15]. It is used to classify learning goals with regard to the cognitive level assessed by a particular learning goal.

[Figure 1 is a diagram: the system elements (cognitive levels, learning objects, learning goals, questions, and levels of learning goals achievement) feed the system activities (test creation – with goals in the initial phase and goals in the adaptive phase – test solving, test evaluation, and feedback towards students and teachers), with arrows indicating which data each activity uses and which activities follow one another.]

Fig. 1. Basic elements of the proposed Adaptivity model

The learning objects represent the thematic units of learning content, to which
learning goals are connected.
The questions element represents the test questions database. Questions are assigned to learning goals. Multiple question types are supported: multiple-choice questions with a single (SC) or multiple (MC) correct answers, matching questions (MATCH), fill-in-the-blanks questions (FILL) and free-answer essay-type (ESSAY) questions. Each question is also assigned a qualitative label indicating its difficulty in the context of assessing the related learning goal [16]: difficulty level “1” (DL1) represents an easy question, DL2 a question of medium difficulty, and DL3 a difficult question.
The test creation element is a central part of the system and takes into consider-
ation all the other main elements of the system except for feedback.
The learning goals achievement element is calculated during the test evaluation activity. It represents a quantitative indicator of an individual’s success in achieving a learning goal [16]. It is expressed on a percentage scale, with thresholds set to mimic traditional grading systems: 0–49.99% (Fail, F or 1), 50–62.49% (Sufficient, D or 2), 62.5–74.99% (Good, C or 3), 75–87.49% (Very good, B or 4), and 87.5–100% (Excellent, A or 5).
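As a minimal sketch of this interval scale (an illustration in Python; the function name and the returned format are ours, not part of the application):

```python
def grade(achievement_pct: float) -> tuple[str, str, int]:
    """Map a learning-goal achievement percentage (0-100) to the
    descriptive label, letter grade and numeric grade of the
    Adaptivity model's interval scale."""
    if achievement_pct < 50:
        return ("Fail", "F", 1)
    elif achievement_pct < 62.5:
        return ("Sufficient", "D", 2)
    elif achievement_pct < 75:
        return ("Good", "C", 3)
    elif achievement_pct < 87.5:
        return ("Very good", "B", 4)
    else:
        return ("Excellent", "A", 5)

print(grade(61.0))   # ('Sufficient', 'D', 2)
print(grade(87.5))   # ('Excellent', 'A', 5)
```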
The feedback towards the students visualizes the individual achievement levels for the learning goals included in the assessment and provides personalized suggestions describing what type and difficulty of questions will predominantly be used in the following adaptive iteration, during the repeated assessment of old learning content. The feedback towards the teachers shows which questions are the most difficult to solve, etc.

3.2 Flow of Continual Adaptive Online Knowledge Assessment


Since the assessment is conducted in a continual manner, the first iteration in the assessment cycle is non-adaptive, as illustrated in Fig. 2. In this iteration all students receive an identical assessment structure, i.e. it is the teacher alone who decides: (i) which learning objects and learning goals will be assessed, (ii) the difficulty of the questions used to assess particular goals and (iii) the number of questions (of the desired type and difficulty) included in the assessment. Depending on the teacher’s intention, in the initial phase of the assessment of a learning goal all students can be given:
1. An identical set of fixed questions, included manually in the assessment according to the above-mentioned criteria, or
2. Randomly selected questions, according to the above-mentioned criteria, or
3. A mixture of fixed and randomly selected questions.

[Figure 2 is a diagram of three successive assessments: the 1st assessment covers the 1st set of learning goals (course topics), LG11–LG1n, in the initial phase; the 2nd assessment covers the 2nd set, LG21–LG2n, in the initial phase and repeats LG11–LG1n in the adaptive phase; the 3rd assessment covers the 3rd set, LG31–LG3n, in the initial phase and repeats the 1st and 2nd sets in the adaptive phase.]

Fig. 2. Flow of the continual and cumulative adaptive knowledge assessment
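The three initial-phase options listed above could be sketched as follows (a hypothetical illustration; the function, parameter names and data shapes are ours, not taken from the application):

```python
import random

def initial_phase_questions(goal_pool, fixed_ids, n_random, mode):
    """Select the initial-phase questions for one learning goal.
    goal_pool: all question ids assigned to the goal;
    fixed_ids: questions the teacher included manually;
    mode: 'fixed', 'random' or 'mixture' (the three teacher options)."""
    if mode == "fixed":
        return list(fixed_ids)                      # identical for all students
    remaining = [q for q in goal_pool if q not in fixed_ids]
    drawn = random.sample(remaining, n_random)      # differs per student
    return (list(fixed_ids) + drawn) if mode == "mixture" else drawn

pool = ["Q1", "Q2", "Q3", "Q4", "Q5"]
print(initial_phase_questions(pool, ["Q1"], 2, "mixture"))
```

A per-student test instance would call such a selection once per assessed learning goal.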

By analyzing individual assessment results from the first iteration, it is possible to personalize each student’s question structure for the following iteration. The second and every subsequent iteration of the assessment includes:
• First (initial) assessment of new learning objects – these objects enter the initial assessment phase (black squares in Fig. 2), as described in the previous paragraphs, and
• Repeated assessment of learning objects which were included in the previous iteration – these objects enter the adaptive assessment phase (light grey squares in Fig. 2), during which the system automatically selects the questions (number, type and difficulty) based on built-in adaptivity rules (R1–R5), which consider the student’s previous levels of learning goal achievement for that object (for details see [17]):
(A) Rules R1–R3 select the difficulty of questions based on the previous iteration (R1 if the student failed to reach a specific LG, R2 if the student was sufficient or good, and R3 if the student was very good or excellent);
(B) Rule R4 decreases the number of questions used during the adaptive phase, to avoid the inevitable question inflation caused by the repeated assessment of all learning goals from all previous iterations;
(C) Rule R5 increases the number of questions used for individuals with poor achievement (a modification of R4), but not beyond the total number of questions determined by R4.
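Under these descriptions, the rule logic can be sketched roughly as follows. The concrete difficulty mixes and reduction amounts are illustrative assumptions (the published parameters are in [17]), and the R5 cap is simplified to a per-goal cap rather than a test-wide total:

```python
def difficulty_for_goal(prev_grade: int) -> str:
    """Rules R1-R3: choose the predominant question difficulty for a
    learning goal in the adaptive phase from the grade (1-5) achieved
    for that goal in the previous iteration. The mapping to DL1-DL3
    is an illustrative assumption."""
    if prev_grade == 1:            # R1: goal not reached -> easier questions
        return "DL1"
    if prev_grade in (2, 3):       # R2: sufficient or good -> medium
        return "DL2"
    return "DL3"                   # R3: very good or excellent -> difficult

def questions_for_goal(base_n: int, iteration: int, prev_grade: int) -> int:
    """Rule R4 shrinks the per-goal question count in later iterations to
    limit question inflation (a linear decrease is assumed here); rule R5
    adds a question back for poor achievers, capped for illustration."""
    n_r4 = max(1, base_n - (iteration - 1))    # R4 (assumed schedule)
    if prev_grade == 1:                        # R5: poor achievement
        return min(base_n, n_r4 + 1)
    return n_r4
```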

3.3 Design of the Adaptivity Web Application


Adaptivity, the web application for continual adaptive online knowledge assessment, was developed based on the proposed Adaptivity model. The application is built on the Microsoft ASP.NET platform (Windows Server, MS SQL Server and ASP.NET) and consists of several core modules, outlined in Fig. 3 and further described in the following sections.

[Figure 3 is a diagram of the application’s core modules around a shared database: the User Management Module, the Questions and Answers Module, the Learning Objects and Learning Goals Design Module, the Adaptive Online Assessment Structure Design Module, the Individual Test Instances Generation Module and the Test Instances Evaluation Module, each exposed through a web interface to teachers and/or students.]

Fig. 3. Basic design of Adaptivity web application

The User Management Module


This module is built upon the standard Membership Framework (MF) component of the ASP.NET framework. The MF functionality is further customized to support external OpenLDAP user authentication and additional user attributes, as required by the institutional ICT infrastructure. The application supports role-based user management with two roles: students and teachers.

The Questions and Answers Module


This module is used to maintain the database of test questions and answers. It supports the five question types described in Sect. 3.1. For “fill-in the blanks” questions, the system also enables the definition of synonyms for each term; this option is implemented to increase the reliability of automated scoring for this question type. The teacher can manually review and change the scores for fill-in questions, should students use an unforeseen but correct synonym.
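Synonym-aware automated scoring of a fill-in answer could look roughly like this (a sketch assuming simple case- and whitespace-normalization; the function name and data shapes are ours):

```python
def score_fill_in(answer: str, correct_term: str, synonyms: set[str]) -> bool:
    """Automated scoring for one fill-in-the-blanks term: the answer is
    accepted if it matches the expected term or any teacher-defined
    synonym, after normalizing case and surrounding whitespace."""
    def normalize(s: str) -> str:
        return " ".join(s.lower().split())

    accepted = {normalize(correct_term)} | {normalize(s) for s in synonyms}
    return normalize(answer) in accepted

print(score_fill_in("  RAM ", "main memory", {"RAM", "primary memory"}))  # True
print(score_fill_in("disk", "main memory", {"RAM"}))                      # False
```

Answers rejected here would still be open to the teacher’s manual review described above.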
For each question, the teacher must also define the points to be awarded for a correct answer and the maximum time (in minutes) within which the student must answer it (the TimeToSolve attribute).
The individual nature of each student’s test in the adaptive phase (where a different number, type and difficulty of questions can be used) made it necessary to introduce the above-mentioned TimeToSolve attribute at the question level. Because of these individual differences, it is not possible to set one common duration for solving the entire test that would apply to all students. Therefore, the teacher estimates the time per question, and the total time to solve the test is later calculated per student as the sum of the TimeToSolve values of the questions selected for that student.
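The per-student duration is then a straightforward sum; a minimal sketch, with assumed field names:

```python
def total_test_minutes(selected_questions) -> int:
    """Total allowed duration of one student's test instance: the sum of
    the TimeToSolve values (in minutes) of the questions selected for
    that student. The dict field names are assumed for illustration."""
    return sum(q["time_to_solve"] for q in selected_questions)

test_instance = [
    {"id": "Q7", "time_to_solve": 2},
    {"id": "Q12", "time_to_solve": 5},
    {"id": "Q31", "time_to_solve": 3},
]
print(total_test_minutes(test_instance))  # 10
```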
The Learning Objects and Learning Goals Design Module
This module lays the foundation for conducting the granular assessment at the level of
individual learning goals. The teacher creates all the learning objects and all the
learning goals. Since the system allows for one learning object to be assessed by using
one or more learning goals, the teacher must also assign learning goals to the learning
objects. The teacher then assigns the questions from the questions database to the
learning goals.
The Adaptive Online Assessment Structure Design Module
The actual knowledge assessments are created in the form of assessment chains (sequences of N iterations, see Fig. 2 in Sect. 3.2). If the teacher has already created all the learning objects, goals and the Q&A database, these are the steps to create a new assessment:
1. Decide whether a new chain of assessments will be started or an adaptive sequel to the last assessment will be created in an existing chain of assessments.
2. Select the questions used to assess only the new learning objects and goals (which are in the initial, first phase of an assessment) – either as a fixed set of manually selected questions (given to all students) or as a randomly selected set of N questions from that goal’s question pool (students don’t get identical questions).

The Individual Test Instances Generation Module


This module creates the individualized instances of assessments, personalized for each student. The test itself is created at the very moment the student starts solving it, based on the built-in adaptive rules (see Sect. 3.2).
First, the questions for the LGs (learning goals) which are in the initial phase of the assessment are included (statically, based on the teacher’s specification, see Sect. 3.2). Then the system dynamically applies the adaptive rules to select the questions for the LGs which are currently in the adaptive phase of the assessment (based on the student’s prior levels of achievement for those LGs).
These steps are performed automatically by the system, per student, in real-time and
without teachers’ intervention.
The Test Instances Evaluation Module
The answers are automatically evaluated upon test submission, except for essay questions, where manual scoring is required. Students immediately get the partial results related to the portion of the test that could be automatically evaluated. If there were no essay questions in the test, the levels of achievement for all the assessed goals are also determined immediately and the student receives complete feedback about his/her success.
During the evaluation phase, Adaptivity uses a pre-defined interval scale with achievement thresholds, which mimics the usual academic grading system. The achievement level of each learning goal is calculated as the percentage of total points awarded out of the maximum points available for all the questions used to assess that goal.
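This per-goal calculation can be sketched as follows (an illustration; the data shape, a list of awarded/maximum point pairs per question, is assumed):

```python
def achievement_level(scored_questions) -> float:
    """Achievement level of one learning goal: points awarded as a
    percentage of the maximum points of all questions assessing it.
    Each item is an assumed (points_awarded, max_points) pair."""
    awarded = sum(p for p, _ in scored_questions)
    maximum = sum(m for _, m in scored_questions)
    return 100.0 * awarded / maximum if maximum else 0.0

# Three questions assessing one goal: 2/2, 1/3 and 3/4 points.
print(achievement_level([(2, 2), (1, 3), (3, 4)]))  # ~66.67, i.e. "Good"
```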
When all the achievement levels have been calculated, the student can see the complete report of his/her success. In the context of adaptive assessment, the most important part of the report is the detailed breakdown of success at the level of individual learning goals. The achievement levels are displayed alongside each of the assessed goals. At the end of the report, there are direct remarks describing what type and difficulty of questions will be used in the next iteration, during the repeated assessment of these learning goals. This information should encourage the student to adjust his/her learning strategy for the next assessment [18], in order to (i) improve knowledge levels for those goals which had poor achievement levels or (ii) maintain or improve high(er) levels of knowledge for those goals which had satisfactory or great achievement levels.

4 Testing the Adaptivity Model

Adaptivity was used to test the effectiveness of the underlying Adaptivity model. The research involved all information systems and technology students who regularly attended classes in the undergraduate course Y held at institution X. The total population of students, i.e. a convenience sample, was divided into two groups. The experimental group (E) used Adaptivity for all assessments prescribed by the curriculum of course Y. The control group (C) was given assessments in traditional paper-and-pencil form, consisting of several essay-type questions.
Both groups of students shared the same learning contents (the same learning goals) in every assessment. Also, both groups were given the cumulative type of assessment (see Fig. 2, Sect. 3.2), where each subsequent test included new learning topics along with the old ones. This should ensure that the different type of assessment process (i.e. Adaptivity vs. paper-and-pencil) was the major and most significant difference between the two groups.
To enable the comparison of points between the two groups, all the results were converted into percentages. The classic written tests used in the C-group always allowed for a maximum of 21 points, while within Adaptivity the maximum points available varied from student to student, because in the adaptive stages students no longer received the same number of questions per learning goal.
Table 1 shows that students in the E-group achieved higher average results (in percentage) in all three tests, with an increase of at least 11.05% (in the second test) and up to a maximum of 19.07% (in the third test). Overall, the E-group achieved on average 15.89% better results than the C-group.

Table 1. Descriptive statistics of the assessment results for the control (N = 104) and experimental (N = 78) groups.

        Group   N     Avg. score (in %)   Std. dev.
Test1   C       104   45.42               20.365
        E       78    59.29               13.396
Test2   C       104   46.10               20.912
        E       78    57.15               14.997
Test3   C       104   36.21               23.854
        E       78    55.28               14.450
Total   C       104   42.58               18.385
        E       78    58.47               12.881

The statistical significance of the observed differences was checked using a t-test with two independent samples (student groups): one that used the Adaptivity application and one that did not. The significance for all four observed variables (Table 2) is below the threshold of 0.01, which indicates that the difference in means between the groups is statistically significant (p < 0.01). Therefore, the experimental group achieved significantly better results (15.89%) than the control group.

Table 2. Results of the t-test (independent samples) for the assessment results (control vs. experimental group).

           Levene test       T-test
           F        Sig.     t        df        Sig. (2-tail)   Means diff.
Test1(a)   15.062   0.000    −5.534   177.226   0.000           −13.878
Test2(a)   7.210    0.008    −4.149   179.677   0.000           −11.046
Test3(a)   25.938   0.000    −6.681   173.037   0.000           −19.071
Total(a)   12.891   0.000    −6.855   179.242   0.000           −15.896
(a) Inequality of variances is assumed (based on Levene’s test)
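As a consistency check, the t statistics and Welch–Satterthwaite degrees of freedom in Table 2 can be approximately recomputed from the summary statistics in Table 1 (this is the standard Welch unequal-variances formula, not code from the study; small deviations stem from Table 1’s rounded means and standard deviations):

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t-test from summary statistics: returns the t statistic
    and the Welch-Satterthwaite degrees of freedom."""
    se1, se2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(se1 + se2)
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df

# Test1 row of Table 1: C (N=104) vs E (N=78).
t, df = welch_t(45.42, 20.365, 104, 59.29, 13.396, 78)
print(round(t, 2), round(df, 2))  # close to Table 2: t = -5.534, df = 177.226
```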

Prior marks of the students from the prerequisite course Z were also analyzed, to avoid comparing two groups of students whose academic capabilities might be drastically unequal. The analysis of the grades showed that there is no significant difference between the average marks of the C-group (2.183) and the E-group (2.179).

5 Conclusion

The literature review revealed that adaptive e-assessment systems are mostly based on a single test; adaptivity is thus applied within a single test, considering the student’s answer after each question. In addition, most of the e-assessment types mentioned in previous research are usually self-assessment or peer assessment, sometimes mixed with traditional (scheduled) formative tests in a classroom.
However, similarities between the existing single-test adaptive e-assessment systems and the Adaptivity model proposed in this paper are also evident: using Bloom’s taxonomy and ontologies to create tests [4], generating tests based on students’ level of knowledge [19], and testing feedback effects with multiple-tier tests [20].
Despite the similarities, there are several main differences that reflect the novelty of our approach:
1. The Adaptivity model involves adaptation of every subsequent test considering the results and learning goals from previous iterations (tests). Therefore, we consider adaptation between tests rather than within a single test, which means we are dealing with continual e-assessment.
2. The model considers the realization of the learning goals throughout multiple tests and adapts the types of questions accordingly.
3. The feedback given to the student is carefully crafted to facilitate the desired learning strategies [18], to improve the student’s success considering all iterations (tests).
In this paper we demonstrated that the use of Adaptivity helped students achieve better results at the end of the semester. However, the practical implementation of Adaptivity was adjusted to fit real academic practice within blended online education and was piloted within such a course. The assessment process was adjusted to fit one specific form of continuous monitoring of students’ activities in the context of higher-education classes. Various formal obstacles may limit the application of this model of knowledge assessment in different types of institutions. Therefore, caution is advised when applying it to environments that practice fully online education or do not use continual assessments.

References
1. Graf, S., Kinshuk: Advanced adaptivity in learning management systems by considering
learning styles. In: WI-IAT 2009, IEEE/WIC/ACM International Joint Conferences on Web
Intelligence and Intelligent Agent Technologies 2009, Milan, Italy, vol. 3, pp. 235–238
(2009)
2. Hafidi, M., Bensebaa, T., Trigano, P.: Developing adaptive intelligent tutoring system based
on item response theory and metrics. Int. J. Adv. Sci. Technol. 43, 1–14 (2012)
3. Ahuja, N.J., Sille, R.: A critical review of development of intelligent tutoring systems:
retrospect, present and prospect. Int. J. Comput. Sci. Issues 10(2), 39–48 (2013)
4. Ying, M.H., Yang, H.L.: Computer-aided generation of item banks based on ontology and
bloom’s taxonomy. In: Li, F., et al. (eds.) Advances in Web Based Learning - ICWL 2008.
LNCS, vol. 5145, pp. 157–166. Springer, Heidelberg (2008)
5. Huang, Y.M., Lin, Y.T., Cheng, S.C.: An adaptive testing system for supporting versatile
educational assessment. Comput. Educ. 52(1), 53–67 (2009)
6. Chrysafiadi, K., Virvou, M.: Create dynamically adaptive test on the fly using fuzzy logic.
In: 2018 9th International Conference on Information, Intelligence, Systems and Applica-
tions (IISA), pp. 1–8. IEEE (2018)
7. Snytyuk, V., Suprun, O.: Adaptive technology for students’ knowledge assessment as a
prerequisite for effective education process management. In: ICTERI, pp. 346–356 (2018)
8. Mangaroska, K., Vesin, B., Giannakos, M.: Elo-rating method: towards adaptive assessment
in e-learning. In: 2019 IEEE 19th International Conference on Advanced Learning
Technologies (ICALT), vol. 2161, pp. 380–382. IEEE (2019)
9. Raman, R., Nedungadi, P.: Adaptive learning methodologies to support reforms in
continuous formative evaluation. In: 2010 International Conference on Educational and
Information Technology, vol. 2, pp. V2–429. IEEE (2010)
10. Grundspenkis, J., Anohina, A.: Evolution of the concept map based adaptive knowledge
assessment system: implementation and evaluation results. J. Riga Techn. Univ. 38, 13–24
(2009)
11. Hu, D.: How Khan Academy is using Machine Learning to Assess Student Mastery (2011).
http://david-hu.com/2011/11/02/how-khan-academy-is-using-machine-learning-to-assess-
student-mastery.html. Accessed 22 Feb 2019
12. VanLehn, K.: The behavior of tutoring systems. Int. J. Artif. Intell. Educ. 16(3), 227–265
(2006)
13. Rus, V., Baggett, W., Gire, E., Franceschetti, D., Conley, M., Graesser, A.: Towards learner
models based on learning progressions (LPs) in DeepTutor. In: Sottilare, R.A., et al. (eds.)
Design Recommendations for Intelligent Tutoring Systems: Volume 1 – Learner Modeling,
pp. 183–192. Army Research Laboratory, Orlando (2013)
14. Chrysafiadi, K., Troussas, C., Virvou, M.: A framework for creating automated online
adaptive tests using multiple-criteria decision analysis. In: 2018 IEEE International
Conference on Systems, Man, and Cybernetics (SMC), pp. 226–231. IEEE (2018)
15. Bloom, B.S., Engelhart, M.D., Furst, E.J., Hill, W., Krathwohl, D.R.: Taxonomy of
Educational Objectives, The Classification of Educational Goals, Handbook I: Cognitive
Domain. McKay Press, Midland (1956)
16. Hatzilygeroudis, I., Koutsojannis, C., Papachristou, N.: Adding adaptive assessment
capabilities to an e-learning system. In: SMAP 2006, First International Workshop on
Semantic Media Adaptation and Personalization, Athens, Greece, pp. 68–73 (2006)
17. Zlatović, M., Balaban, I.: Personalizing questions using adaptive online knowledge
assessment. In: eLearning 2015-6th International Conference on e-Learning, Belgrade,
pp. 185–190 (2015)
18. Zlatović, M., Balaban, I., Kermek, D.: Using online assessments to stimulate learning
strategies and achievement of learning goals. Comput. Educ. 91, 32–45 (2015)
19. Conejo, R., Guzmán, E., Trella, M.: The SIETTE automatic assessment environment. Int.
J. Artif. Intell. Educ. 26(1), 270–292 (2016)
20. Maier, U., Wolf, N., Randler, C.: Effects of a computer-assisted formative assessment
intervention based on multiple-tier diagnostic items and different feedback types. Comput.
Educ. 95(1), 85–98 (2016)
The First Programming Language
and Freshman Year in Computer Science:
Characterization and Tips for Better
Decision Making

Sónia Rolland Sobral(&)

REMIT, Universidade Portucalense, Porto, Portugal


[email protected]

Abstract. The ability to program is the “visible” competency to acquire in an introductory unit in computer science. However, before a student is able to write a program, he needs to understand the problem: before formalizing, the student must be able to think, to solve and to define. At an early stage of learning there are no significant differences between programming languages.
The discussion about the first programming language continues: there will probably never be a consensus among academics. The Association for Computing Machinery (ACM) and Institute of Electrical and Electronics Engineers (IEEE) computer science curriculum recommendations have not clearly defined which programming language to adopt: it is the course directors and teachers who must make this choice, consciously and not merely by following trends.
This article presents a set of items that should be considered when choosing a programming language for the first programming unit in higher-education computer science courses.

Keywords: Programming languages · Undergraduate studies · Introduction to programming

1 Introduction

The ability to program is the "visible" skill to be acquired in an introductory unit in
computer science. Programming can be considered an art [1], a science [2], a discipline
[3] or even the science of abstraction [4]. However, using a programming language is
no more than a way for the programmer to communicate instructions to the computer.
Before designing a program, the student must understand the problem and know how
to use the tools needed to solve it with the machine, such as methods for writing
rigorous specifications and solutions that can be implemented on the computer. For
this, the student will also have to learn one or more programming languages and
paradigms in order to use programming notions and systematize the use of data
structures and algorithms to solve different categories of problems [5].

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 162–174, 2020.
https://doi.org/10.1007/978-3-030-45697-9_16

Students often have the perception that the focus is on learning the syntax of the
programming language, leading them to concentrate on implementation activities
rather than activities such as planning, designing, or testing [6].
The art of programming involves four steps [7]:
(a) To Think: the conceptualization and analysis phase, in which problems are divided
into small, easily intelligible processes or tasks forming a modular structure whose
organization follows a descending, Top-Down programming logic [8];
(b) To Solve: translate the Top-Down structure into an algorithm [1] that expresses the
solution rules in pseudocode;
(c) To Define: using variables and data structures, characterize the data model to be
used in the algorithm;
(d) To Formalize: translate the algorithm into a programming language and implement
and execute it on the computer.
Then comes the most important phase, the true moment of truth: does the program
run, is it error-free, and does it give the correct result? And how can one be sure that
the result is "the" correct solution, or at least "probably" the correct one?
Table 1 shows how ten of the best-known programming languages write the famous
"Hello, World!".

Table 1. "Hello World!" in ten different programming languages.

Prog. Language Write Hello World
C printf("Hello World!");
C# Console.WriteLine("Hello World!");
C++ cout << "Hello World!";
COBOL display "Hello world!".
Fortran print *, "Hello world!"
Java System.out.println("Hello World!");
JavaScript document.write("Hello world!");
Pascal writeln('Hello World!');
Python print("Hello, World!")
Visual Basic.NET Console.WriteLine("Hello world!")

Each of the ten programming languages presented in the previous table has a
different notation, yet they are all quite similar for a basic program like "Hello World!".
Some say that programming is very difficult [9, 10] while for others it may be easy
[11].
Success is achieved through a good deal of study, research, planning, persistence
and preferably a passion for the activity.
This article is divided into five parts: this introduction; the second part, Programming
languages: concept and characterization; the third, Evolution of programming
languages in undergraduate computer science studies; the fourth, Choosing the initial
programming language; and the last part, conclusions and future work.

2 Programming Languages: Concept and Characterization

A programming language is a system that allows interaction between man and
machine, being "understood" by both. It is a formal language that specifies a set of
instructions and rules. Programming languages are the medium of expression in the art
of computer programming. Program writing must be succinct and clear, because pro-
grams are meant to be read, modified, and maintained throughout their lives: a good
programming language should help others to read programs and to understand how
they work [12]. A program is a set of instructions that make up a solution after being
coded in a programming language [13].
There are several reasons why thousands of high-level programming languages
exist and new ones continue to emerge [14]:
– Evolution: the late 1960s and early 1970s saw a revolution in "structured pro-
gramming," in which the GoTo-based flow control of languages such as FORTRAN,
COBOL, and Basic gave way to while loops and case (switch) statements. In the late
1980s, Algol, Pascal and Ada began to give way to object-oriented languages like
Smalltalk, C++ and Eiffel. And so on.
– Special purposes: some programming languages are designed for specific purposes.
C is good for low-level systems programming. Prolog is good for reasoning about
logical relationships between data. Each can be successfully used for a wide range
of tasks, but the emphasis is clearly on the specialty.
– Personal preference: different people like different things. Some people love C
while others hate it, for example.
According to the Stack Overflow Annual Developer Survey [15], with over 90,000
respondents from over 170 countries, in 2019 the most widely used programming
language was JavaScript (Table 2).

Table 2. Top 15 programming languages most used in 2019 [15].

PL %
JavaScript 67.8%
HTML/CSS 63.5%
SQL 54.4%
Python 41.7%
Java 41.1%
Bash/Shell/PowerShell 36.6%
C# 31.0%
PHP 26.4%
C++ 23.5%
TypeScript 21.2%
C 20.6%
Ruby 8.4%
Go 8.2%
Assembly 6.7%
Swift 6.6%

In September 2019, the TIOBE Programming Community index [16], an indicator of
the popularity of programming languages, featured Java as the most popular (Table 3),
followed by C and Python.

Table 3. Top 7 in the indicator of popularity of programming languages, TIOBE [16].

PL Ratings
Java 16.66%
C 15.21%
Python 9.87%
C++ 5.64%
C# 3.40%
Visual Basic .NET 3.29%
JavaScript 2.13%

The technology of the most visited websites (Table 4), according to Wikipedia¹ [17],
also varies in the back-end languages used; JavaScript, however, is almost always used
on the front end.

3 Evolution of Programming Languages in Undergraduate Computer Science Studies

Computer science became a recognized academic field in October 1962 with the cre-
ation of Purdue University’s first department [18]. The first curriculum studies
appeared in March 1968, when the Association for Computing Machinery
(ACM) published an innovative and necessary document, Curriculum 68: Recom-
mendations for academic programs in computer science [19], with early indications of
curriculum models for programs in computer science and computer engineering.
Prerequisites, descriptions, detailed sketches, and annotated bibliographies were

¹ Wikipedia is not a reliable source of information because it has collaborative editing features!

Table 4. The technology used in the most searched websites [17].


Web Site Back-end Language
Amazon.com Java, C++, Perl
Bing C++, C#
eBay.com Java, JavaScript, Scala
Facebook.com Hack, PHP (HHVM), Python, C++, Java, Erlang, D, XHP, Haskell
Google.com C, C++, Go, Java, Python
Linkedin.com Java, JavaScript, Scala
MSN.com C#
Pinterest Python (Django), Erlang
Twitter.com C++, Java, Scala, Ruby
Wikipedia.org PHP, Hack
WordPress.com PHP
Yahoo PHP
YouTube.com C, C++, Python, Java, Go

included for each of these courses. As the initial unit, it presented B1. Introduction to
Computing (2-2-3)², in which an algorithmic language was proposed, recommending
that only one language be used, or two "in order to demonstrate the wide diversity of the
computer languages available"; "Because of its elegance and novelty, SNOBOL can be
used quite effectively for this purpose."
With the emergence of many new courses and departments, ACM published a new
report, Curriculum’78: recommendations for the undergraduate program in computer
science [20], updating Curriculum 68. It presented for the first time the denomination
CS1: Computer Programming I (2-2-3): “The emphasis of the course is on the tech-
niques of algorithm development and programming with style. Neither esoteric features
of a programming language nor other aspects of computers should be allowed to
interfere with that purpose.”
Despite the importance of Curriculum'78 there was much discussion, particularly
regarding the CS1–CS2 sequence. In 1984 a new report was published, "Recommended
curriculum for CS1, 1984" [21], to detail a first computer science course that
emphasizes programming methodology and problem solving. This report refers to
Pascal, PL/1 and Ada: "These features are important for many reasons. For
example, a student cannot reasonably practice procedural and data abstraction without
using a programming language that supports a wide variety of structured control fea-
tures and data structures”. They said that “Although FORTRAN and BASIC are widely
used, we do not regard either of these languages as suitable for CS1” and ALGOL
“does satisfy the requirements but is omitted from our list of recommended languages
simply because it is no longer widely used or supported.”

² (2-2-3): two hours of lectures and two hours of laboratory per week for a total of three semester hours of credit.

In 1991 [22], the IEEE (Institute of Electrical and Electronics Engineers) and the ACM
joined forces for a new document. It broke with some of the concepts of previous
documents, presenting a set of individual knowledge units, each corresponding to a
topic that should be addressed at some point in the undergraduate
program. In this way, institutions have considerable flexibility in setting up course
structures that meet their particular needs.
In 2001 a new document was published [23]. It questioned the programming-first
approach of previous documents, since early programming approaches may lead
students to believe that writing a program is the only viable approach to solving
problems with a computer, and a focus on programming alone reinforces the common
misperception that "computer science" equals programming. They said: "In fact,
the problems of the programming-first approach can be exacerbated in the objects-first
model because many of the languages used for object-oriented programming in
industry—particularly C++, but to a certain extent Java as well—are significantly more
complex than classical languages. Unless instructors take special care to introduce the
material in a way that limits this complexity, such details can easily overwhelm
introductory students”.
In 2008 a new report was presented, "Computer Science Curriculum 2008: An
Interim Revision of CS 2001" [24]; security is strongly emphasized, with minor
revisions to the 2001 document. Curriculum'2008 reinforces the idea that "Computer
science professionals frequently use different programming languages for different
purposes and must be able to learn new languages over their careers as the field
evolves. As a result, students must recognize the benefits of learning and applying new
programming languages. It is also important for students to recognize that the choice of
programming paradigm can significantly influence the way one thinks about problems
and expresses solutions of these problems. To this end, we believe that all students
must learn to program in more than one paradigm”.
When referring to languages and paradigms, the "Computer Science Curricula
2013: Curriculum Guidelines for Undergraduate Degree Programs in Computer Sci-
ence" [25] says that the choice of programming language seems to depend on the
chosen paradigm, and "There does, however, appear to be a growing trend toward
“safer” or more managed languages (for example, moving from C to Java) as well as
the use of more dynamic languages, such as Python or JavaScript.” “Visual pro-
gramming languages, such as Alice and Scratch, have also become popular choices to
provide a “syntax-light” introduction to programming; these are often (although not
exclusively) used with non-majors or at the start of an introductory course”. And “some
introductory course sequences choose to provide a presentation of alternative pro-
gramming paradigms, such as scripting vs. procedural programming or functional vs.
object-oriented programming, to give students a greater appreciation of the diverse
perspectives in programming, to avoid language-feature fixation, and to disabuse them
of the notion that there is a single “correct” or “best” programming language”.
It is clear that curriculum recommendations do not indicate which programming
language to adopt. However, it is always said that they should have the simplest
possible usability and syntax for better learning. Language choice has always been a
matter of concern to educators [26–30].

FORTRAN was selected as the high-level language for the first introductory courses,
especially those linked to engineering departments. The less widely used COBOL was
adopted by departments more closely linked to information systems [31]. At that time
one could not speak of methodology: everything was just programming. The
emergence of BASIC in 1964 [32] led some departments to use this language for
introductory students. In 1972 almost all computer science degree programs used
ALGOL, FORTRAN or LISP, while most data processing programs used COBOL. In
Britain, BASIC was also important. In the late 1960s, some departments tried various
languages like PL/I [33].
With Dijkstra's manifesto [34], structured programming began to be discussed [35,
36]. With the emergence of the Pascal language [37], the choice seemed to become
almost consensual [31]: a language written almost expressly for the purpose of learning
to program, with a very friendly development environment [38], and obviously helped
by the proliferation of personal computers and the availability of Pascal compilers [39].
Pascal's decline began in the late 1980s and early 1990s, with object-oriented pro-
gramming, and also because reusing Pascal code is difficult and because Pascal is not a
"real world" language [39]. McCauley and Manaris [40] report that as a first language
Pascal was used by 36% and C++ by 32% of programs in 1995–1996, but 22%
intended to switch to C++, C, Ada or Java. There are several studies that present the
evolution of the languages adopted in initial programming curricular units [41, 42] and
even lists of programming languages taught in various courses [43].
In Portugal [44], in the 2016–2017 school year, the most common first-year pro-
gramming language sequence in the 46 courses analyzed was C (48%), followed by Java
(22%), C and Haskell (9%), C and Java (4%), and Scheme and Java (4%). There were
also residual sequences: Excel and C; Python; Python, HTML and Java; Python and
Java; Scheme and C++; and XML and Java. Regarding the ten Portuguese first cycle (or
integrated master's degree) courses in Computer Engineering considered most signif-
icant [45], the most common sequences were only Java or Python and C (both with
30%), C (20%), and Python and Java or Haskell and C (both with 10%).
According to the document “An Analysis of Introductory Programming Courses at
UK Universities” [46]:
– 73.8% use only one programming language; 21% reported using two.
– The most widely used language is Java (46%), followed by the "C family" (C, C++
and C#) (23.6%) and Python (13.2%). JavaScript and Haskell are much less
adopted.
– The reason given by 82.7% of those who use Java was that it is object-oriented,
while 72.7% of those using Python cite its pedagogical benefits.
According to the document “Introductory Programming Courses in Australasia in
2016” [46] referring to the Universities of Australia and New Zealand:
– 48 courses studied: 15 used Java, 15 Python, 8 C, 5 C#, 2 Visual Basic and 2
Processing. The remaining ten use another programming language.
– The reasons given for choosing Python and Java are quite different: pedagogical
benefits for Python (67%), availability/cost (53%) and platform independence
(40%). The reasons given for using Java are industry-relevant (92%), object-
oriented (86%), and platform independence (62%).
"What language? - The choice of an introductory programming language" [47], a
study of 496 four-year courses in the United States, reports that Java is used by
41.94%, Python by 26.45%, C++ by 19.35%, C by 4.52%, C# by 0.65% and another
language by 7.10%. The reasons given for the choice were: programming language
features 26.19%, ease of learning 18.81%, job opportunities for students 14.76%,
popularity in academia 13.10%, institutional tradition 8.57%, choice of advisory board
5.95%, and availability of teachers or scheduling restrictions 5%.
A 2016 study [48] analysed 218 colleges and 143 universities in 35 European
countries, indicating that the most commonly used programming language was C
(30.6%), followed by C++ (21.9%) and Java (20.7%).
A study [10] of 152 CS1 units from a number of different countries concludes
that Java is by far the most common CS1 language, used in 74 (49%) of the 152
programs. The second most frequent is Python, with 36 (24%); C++ comes in 30 (20%),
followed by C in 8 (5%), with the most obvious change being the rise of Python, which
"probably occurred at the expense of Java and C++".
Today, with few exceptions, the academy follows the “real world” and the “C
family” (C, C++, C#), Python, Java, and JavaScript are undoubtedly the programming
languages adopted in introductory programming units.

4 Choosing the Initial Programming Language

In 2004, Eric Roberts [49] observed that the languages, paradigms, and tools used to
teach computer science had become increasingly complex, adding pressure to cover
more material in an already overcrowded area. The problem of complexity is exacerbated
by the fact that languages and tools change rapidly, leading to profound instability in the
way computer science is taught. Roberts argued that Java would be the way forward:
"we must take responsibility for breaking this cycle of rapid obsolescence by developing
a stable and effective collection of Java-based learning applications that meet the needs
of the science education community".
Dijkstra [50] wrote about the importance of the chosen programming language:
“the tools we are trying to use and the language or notation we are using to express or
record our thoughts, are the major factors determining what we can think or express at
all! The analysis of the influence that programming languages have on the thinking
habits of its users, and the recognition that, by now, brainpower is by far our scarcest
resource, they together give us a new collection of yardsticks for comparing the relative
merits of various programming languages.”
When selecting the first programming language for introductory programming
courses, it is important to consider whether it is suitable for teaching and learning. Over
time various pseudo-code languages have been created in search of the perfect teaching
language but no definitive solution has been found [51].
The document "Introductory Programming Subject in European Higher Education"
[48] discusses the need to teach introductory programming using educational
programming languages. But in the past these languages have been discontinued:
Pascal being the most visible case.
The programming language chosen for introductory programming courses often
seems like a religious or football issue. In the reflection "The Programming Language
Wars" [52] it is even said that "Programming language wars are a major social
problem causing serious problems in our discipline", leading to "massively duplicating
efforts" and "reinventing the wheel constantly". Choosing the best programming lan-
guage is often an emotional issue, leading to major debates [53], but for Guerreiro [54]
"It is up to us to have an open, exploratory attitude and at the same time not dog-
matically accept what those who make the most noise say. In fact, I think we should
even pass this on to students too, to help them develop their critical thinking, and to be
able, sooner or later, to choose the languages and tools that can best respond to their
needs".
In fact, two of the most important points are pedagogical issues and student
preparation for the world of work. Parker and Davey [33] define them as pedagogical
and pragmatic: industry acceptance and market penetration, as well as the
employability of graduates.
Keep in mind that "small programming" needs to be mastered before "large pro-
gramming" [55], since traditionally it is only "in the third or fourth year [that students]
are faced with the problems that arise in the design of large programs." Collberg [55]
said that choosing the initial language is not an easy task: it must weigh factors such as
simplicity, expressiveness, suitability for tasks, availability of accessible resources, and
reliable compilers.
Programming languages are the fundamental basis of programming, but trends
change dramatically over time. Professionals will not use the same programming
language, or even the same programming model, for their entire professional career. In
addition, well-informed language choices can make a huge difference in programmer
productivity and program quality. Therefore, it is crucial that students master the
essential concepts of programming languages, so that they can choose and use lan-
guages based on a deep understanding of the abstractions they express and their ability
to solve programming problems [56].
Choosing the initial programming language to adopt should take into account
several points: Course objectives, Teacher preferences, available implementations, and
relationships with other course units, as well as the “real world”: students are often
more motivated to study a familiar language that is known to be requested by
employers [57].
Howatt [58] uses an evaluation method for programming languages based on several
items: language design and implementation (accuracy and speed), human factors
(usability and ease of use), software engineering (portability, reliability and reuse) and
application domain (suitability for specific applications).
The paradigm chosen can be very important [59], unless one adopts the approach of
"exposing students to all major paradigms through the use of a multiparadigmatic
language", without attempting to identify the "correct paradigm" [60].
The document "A Formal Language Selection Process" [61] presents a choice design
based on a weighted multicriteria method, identifying evaluation criteria such as:
Reasonable Financial Cost, Academic/Student Version Availability, Academic
Acceptance, Textbook Availability, Lifecycle Stage, Industry Acceptance, Marketing
(regional and national), Student/Academic/Full System Requirements, Operating
System Dependency, Proprietary/Open Source, Development Environment, Debugging
Facilities, Ease of Learning the Fundamentals, Secure Code, Features for Subsequent
Advanced Courses, More or Less Complicated Programming, Web Development
Support, Teaching Support, Object-Oriented Support, Support Availability, Instructor
and Staff Teaching, and Expected Level of New Students.
Mannila and de Raadt [30] compare multiple languages using various
inclusion/exclusion criteria, such as: be suitable for teaching, be interactive and fast,
promote correct writing, allow programming "in the small", provide a consistent
development environment, have a good user community, be open source with good
support, be free, have good teaching material, not be used only for educational
purposes, and be reliable and efficient.
Several attempts have been made in the past to classify programming languages [41].
There are numerous comparisons between the most commonly used languages, such as
Python vs C++ [62], Python vs C [63], Java vs Python [64], and C++ vs Java [65]. Any
of the three or four most commonly used programming languages is free, well
supported, has a large user community, and is reliable and efficient.
Ease of learning can be discussed: C has a more complicated syntax than Python.
The major differences are the use of pointers (C only), parameter passing by reference
and by value (C only), the programming paradigm (procedural in C, object-oriented in
the others), and being compiled or interpreted (C compiled, Python and Java
interpreted).

5 Conclusions

A programming language is used to materialize the solution to a problem. A program
should only be written after finding the best solution.
There are numerous programming languages, adopted for reasons of evolution,
purpose of use, or even personal taste.
The choice of programming language for introductory teaching must keep pace with
evolution, but because the course has a propaedeutic character, the choice must meet
several requirements, namely pedagogical ones and acceptance by the outside world.
As future work we will compare the three programming languages currently most
used in CS1 curricular units, comparing their simplicity, IDEs, debuggers and other
features identified in this article.
There isn't, and probably never will be, consensus as to which language should be
chosen to introduce the student to the world of computer science.
The first programming language of a future computer science professional is just
the beginning of a long walk.

References
1. Knuth, D.: The Art of Computer Programming. Addison-Wesley, Reading (1968)
2. Gries, D.: The Science of Programming. Springer, New York (1981)
3. Dijkstra, E.W.: A Discipline of Programming. Prentice Hall, Englewood Cliffs (1976)
4. Aho, A., Ullman, J.D.: Foundations of Computer Science. Principles of Computer Science
Series, C edn. Freeman, W. H. (1994)
5. Sobral, S.R.: B-learning em disciplinas introdutórias de programação. Universidade do
Minho, Guimarães (2008)
6. McCracken, M., Almstrum, V., Diaz, D., Guzdial, M., Hagan, D., Kolikant, Y.B.-D., Laxer,
C., Thomas, L., Utting, I., Wilusz, T.: A multi-national, multi-institutional study of
assessment of programming skills of first-year CS students. In: ITiCSE on Innovation and
Technology in Computer Science Education (2001)
7. Sobral, S.R., Pimenta, P.: O ensino da programação: exercitar a distancia para combate às
dificuldades. In: 4ª Conferência Ibérica de Sistemas e Tecnologias de Informação (2009)
8. Lima, J.R.: Programação de computadores, Porto Editora, Porto (1991)
9. Bergin, S., Reilly, R.: Programming: factors that influence SuccessSusan. In: Proceedings of
the 36th SIGCSE Technical Symposium on Computer Science Education (2005)
10. Becker, B.A., Fitzpatrick, T.: What do CS1 syllabi reveal about our expectations of
introductory programming students? In: 50th ACM Technical Symposium on Computer
Science Education (2019)
11. Luxton-Reilly, A.: Learning to program is easy. In: ACM Conference on Innovation and
Technology in Computer Science Education (2016)
12. Mitchell, J.C.: Concepts in Programming Languages. Cambridge University Press,
Cambridge (2003)
13. Sprankle, M.: Problem Solving and Programming Concepts, 9 edn. Pearson, London (2011)
14. Scott, M.L.: Programming Language Pragmatics, 3rd edn. Elsevier, Amsterdam (2009)
15. Stackoverflow.com: Stackoverflow (2019). https://insights.stackoverflow.com/survey/2019
16. TIOBE Software BV: TIOBE, Set (2019). https://www.tiobe.com/tiobe-index/
17. Wikipedia: Programming languages used in most popular websites, Setembro (2019). https://
en.wikipedia.org/wiki/Programming_languages_used_in_most_popular_websites
18. Rice, J.R., Rosen, S.: History of the Computer Sciences Department at Purdue University.
Department of Computer Science, Purdue University (1990)
19. Atchison, W.F., Conte, S.D., Hamblen, J.W., Hull, T.E., Keenan, T.A., Kehl, W.B.,
McCluskey, E.J., Navarro, S.O., Rheinboldt, W.C., Schweppe, E.J., Viavant, W., Young Jr.,
D.M.: Curriculum 68: recommendations for academic programs in computer science: a
report of the ACM curriculum committee on computer science. Commun. ACM 11(3), 151–
197 (1968)
20. Austing, R.H., Barnes, B.H., Bonnette, D.T., Engel, G.L., Stokes, G.: Curriculum’78:
recommendations for the undergraduate program in computer science—a report of the ACM
curriculum committee on computer science. Commun. ACM 22(3), 147–166 (1979)
21. Koffman, E.B., Miller, P.L., Wardle, C.E.: Recommended curriculum for CS1, 1984.
Commun. ACM 27(10), 998–1001 (1984)
22. Tucker, A.B., ACM/IEEE-CS Joint Curriculum Task Force: Computing curricula 1991:
report of the ACM/IEEE-CS Joint Curriculum Task Force, p. 154. ACM Press (1990)
23. The Joint Task Force IEEE and ACM: CC2001 Computer Science, Final Report (2001)
24. Cassel, L., Clements, A., Davies, G., Guzdial, M., McCauley, R.: Computer Science
Curriculum 2008: An Interim Revision of CS 2001. ACM (2008)
25. ACM and IEEE Joint Task Force: Computer Science Curricula 2013. ACM and the IEEE
Computer Society (2013)
26. Smith, C., Rickman, J.: Selecting languages for pedagogical tools in the computer science
curriculum. In: Proceedings of the Sixth SIGCSE Technical Symposium on Computer
Science Education (1976)
27. Wexelblat, R.L.: First programming language: consequences (panel discussion) (1979)
28. Tharp, A.L.: Selecting the “right” programming language. In: SIGCSE 1982 Technical
Symposium on Computer Science Education, Indianapolis, Indiana, USA (1982)
29. Duke, R., Salzman, E., Burmeister, J., Poon, J., Murray, L.: Teaching programming to
beginners - choosing the language is just the first step. In: ACSE 2000 Proceedings of the
Australasian Conference on Computing Education (2000)
30. Mannila, L., de Raadt, M.: An objective comparison of languages for teaching introductory
programming. In: 6th Baltic Sea Conference on Computing Education Research: Koli
Calling 2006 (2006)
31. Giangrande Jr., E.: CS1 programming language options. J. Comput. Sci. Coll. 22(3), 153–
160 (2007)
32. Kemeny, J.G., Kurtz, T.E.: BASIC - A Manual for BASIC, the Elementary Algebraic
Language. Dartmouth College (1964)
33. Parker, K., Davey, B.: The history of computer language selection. In: IFIP Advances in
Information and Communication Technology, pp. 166–179 (2012)
34. Dijkstra, E.W.: Go to statement considered harmful. Commun. ACM 11(3), 147–148 (1968)
35. Knuth, D.: Structured programming with go to statements. Comput. Surv. 6(4), 261–301
(1974)
36. Dahl, O., Dijkstra, E., Hoare, C.: Structured Programming. Academic Press Ltd., London
(1972)
37. Wirth, N.: The programming language Pascal. In: Pioneers and Their Contributions to
Software Engineering. Springer (1971)
38. Gupta, D.: What is a good first programming language? Crossroads ACM Mag. Stud. 10(4),
7 (2004)
39. Levy, S.: Computer language usage in CS1: survey results. ACM SIGCSE Bull. 7(3), 21–26
(1995)
40. McCauley, R., Manaris, B.: Computer science degree programs: what do they look like? A
report on the annual survey of accredited programs. ACM SIGCSE Bull. 30(1), 15–19
(1998)
41. Farooq, M.S., Khan, S.A., Ahmad, F., Islam, S., Abid, A.: An evaluation framework and
comparative analysis of the widely used first programming languages. PLoS ONE (2014)
42. Sobral, S.R.: 30 years of CS1: programming languages evolution. In: 12th Annual
International Conference of Education, Research and Innovation (2019)
43. Siegfried, R.M., Siegfried, J., Alexandro, G.: A longitudinal analysis of the Reid list of first
programming languages. Inf. Syst. Educ. J. 10(4), 47–54 (2016)
44. Sobral, S.R.: Bachelor’s and master’s degrees integrated in Portugal in the area of
computing: a global vision with emphasis on programming UCS and programming
languages used. In: 11th Annual International Conference of Education, Research and
Innovation (2018)
45. Sobral, S.R.: Introduction to programming: portrait of higher education in computer science
in Portugal. In: 11th International Conference on Education and New Learning Technologies
(2019)
46. Murphy, E., Crick, T., Davenport, J.H.: An analysis of introductory programming courses at
UK universities. Art Sci. Eng. Programm. 1(2) (2017)
174 S. R. Sobral

47. Ezenwoye, O.: What language? - the choice of an introductory programming language. In:
48th Frontiers in Education Conference, FIE 2018 (2018)
48. Aleksić, V., Ivanović, M.: Introductory programming subject in European higher education.
Inform. Educ. 15(2), 163–182 (2016)
49. Roberts, E.: The dream of a common language: the search for simplicity and stability in
computer science education. In: 35th SIGCSE Technical Symposium on Computer Science
Education (2004)
50. Dijkstra, E.W.: The humble programmer. Commun. ACM 15(10), 859–866 (1972)
51. Laakso, M., Kaila, E., Rajala, T., Salakoski, T.: Define and visualize your first programming
language. In: 8th IEEE International Conference on Advanced Learning (2008)
52. Stefik, A., Hanenberg, S.: The programming language wars: questions and responsibilities
for the programming language community. In: 2014 ACM International Symposium on New
Ideas, New Paradigms, and Reflections on Programming & Software (2014)
53. Goosen, L.: A brief history of choosing first programming languages. In: History of
Computing and Education 3 (2008)
54. Guerreiro, P.: A mesma velha questão: como ensinar Programação? In: Quinto Congreso
Iberoamericano de Educación Superior (1986)
55. Collberg, C.S.: Data structures, algorithms, and software engineering. In: Software
Engineering Education - SEI Conference 1989 (1989)
56. Bruce, K., Freund, S.N., Harper, R., Larus, J., Leavens, G.: What a programming languages
curriculum should include. In: SIGPLAN Workshop on Undergraduate Programming
Language Curricula (2008)
57. King, K.N.: The evolution of the programming languages course. ACM SIGCSE Bull. 24
(1), 213–219 (1992)
58. Howatt, J.: A project-based approach to programming language evaluation. ACM SIGPLAN
Not. 30(7), 37–40 (1995)
59. Luker, P.A.: Never mind the language, what about the paradigm? In: Twentieth SIGCSE
Technical Symposium on Computer Science Education (1989)
60. Budd, T.A., Pandey, R.K.: Never mind the paradigm, what about multiparadigm languages?
ACM SIGCSE Bull. 27(2), 25–30 (1995)
61. Parker, K.R., Chao, J.T., Ottaway, T.A., Chang, J.: A formal language selection process.
J. Inf. Technol. Educ. 5(1), 133–151 (2006)
62. Alzahrani, N., Vahid, F., Edgcomb, A., Nguyen, K., Lysecky, R.: Python versus C++: an
analysis of student struggle on small coding exercises in introductory programming courses.
In: 49th ACM Technical Symposium on Computer Science Education (2018)
63. Wainer, J., Xavier, E.: A controlled experiment on Python vs C for an introductory
programming course: students’ outcomes. ACM Trans. Comput. Educ. 18(3), 1–16 (2018)
64. McMaster, K., Sambasivam, S., Rague, R., Wolthuis, S.: Java vs. Python coverage of
introductory programming concepts: a textbook analysis. Inf. Syst. Educ. J. 15(3), 4–13 (2017)
65. Farag, W., Ali, S., Deb, D.: Does language choice influence the effectiveness of online
introductory programming courses? In: 14th Annual ACM SIGITE Conference on
Information Technology Education (2013)
66. Koffman, E.B., Stemple, D., Wardle, C.E.: Recommended curriculum for CS2, 1984: a
report of the ACM curriculum task force for CS2. Commun. ACM 28(8), 815–818 (1985)
67. The Joint Task Force for Computing Curricula 2005: Computing Curricula 2005: The
Overview Report. ACM (2005)
68. The Joint Task Force on Computing Curricula: Curriculum Guidelines for Undergraduate
Degree Programs in Software Engineering. ACM (2004)
69. The Joint Task Force on Computing Curricula: SE2004: Curriculum Guidelines for
Undergraduate Degree Programs in Software Engineering. ACM (2004)
Design of a Network Learning System
for the Usage of Surgical Instruments

Ting-Kai Hwang¹, Bih-Huang Jin²(&), and Su-Chiu Wang³

¹ Department of Journalism, Ming Chuan University, Taipei, Taiwan
[email protected]
² Department of Business Administration, Tunghai University, Taichung City, Taiwan
[email protected]
³ Nursing Department, Taichung Veterans General Hospital, Taichung City, Taiwan

Abstract. To improve nursing clinical staff training, this study combines the
methods of e-learning with situated simulation and follows the six steps of the
design science research process to construct a surgical instruments network
learning system. Lean management concepts are used to restructure the number
of surgical instruments and establish a basic equipment package, eliminating the
excess wastes of overproduction, transportation, motion and extra-processing.
Then, the surgical instrument images are linked to the relational database to
establish the network learning system. To evaluate the effectiveness of the
system, the learning outcomes and the satisfaction investigation of subjective
learning were measured. The results show that it can improve the learning
effectiveness of operating room new nursing staff, and enhance professional
knowledge.

Keywords: Surgical instrument · Network learning · Learning effectiveness

1 Introduction

With the development of information technology and the popularity of the network
environment, many companies use e-learning technology and emphasize learner-
centered teaching to diversify staff training. In addition, the innovative
teaching mode of situated simulation is adopted to speed up staff
members’ competency development. Bertoncelj pointed out that competency refers to
the capability required in a specific workplace position, comprising personal
knowledge, technology, ability and attitude, which together allow a person to meet
the responsibilities and performance expectations of the job [1]. In the medical
industry especially, the relevant personnel are required to have a high degree of
expertise, so whether competency is fully functional is very important for hospital
management. In addition, nursing staff are an important human resource for the hospital.
Competency management of nursing staff therefore becomes one of the major tasks for
medical institution operators. With the innovation of surgical instruments and in the
face of complicated equipment and procedure during the surgical operations, how to

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 175–181, 2020.
https://doi.org/10.1007/978-3-030-45697-9_17
176 T.-K. Hwang et al.

properly operate each instrument and make use of its main functions is the learning focus
for scrubbing nurses, and the correctness of instrument preparation directly affects the
operation process and patient safety. New nursing staff working as scrubbing nurses are
often under great pressure: when they are not familiar with the surgical instruments, pass
an instrument incorrectly or are insufficiently prepared, they frequently draw blame and
complaints from the surgeon, which harms the teamwork atmosphere and may even force
the operation time to be extended. New nursing staff in the operating room thus endure a
lot of frustration and may lose the intention to continue working in the position.
This study adopts design science approach to develop a surgical instrument network
learning system, which includes empathy to identify problems, defining the objectives
of the solution, design and development, prototype display, test evaluation, communi-
cation and correction. Based on in-depth interviews and comments in symposium, the
needs are collected. Then, we modularize surgical instruments and build the database.
Through the situated simulation of interactive learning, the new nursing staff in the
operating room can learn effectively in the most economical way. This can strengthen the
cultivation of nursing talents, enhance learning effectiveness and working efficiency,
and maintain the quality of the surgery that the patient should obtain.

2 Literature

To achieve the lean management of surgical instruments, this study adopts design
science approach to develop a network learning system. Therefore, we review the
related literature on the design science approach, network learning theory and lean
management as follows.

2.1 Design Science Approach


The main task of design science research is to identify design problems, develop
solutions and evaluate them [6]. In recent years, many scholars in the field of infor-
mation technology began to advocate design science and put forward the method to
verify the validity of design science and its research value. Design science therefore
becomes a model for information technology research. However, due to the lack of well
established study process architecture, its promotion was hindered. Peffers et al.
described six steps for the design science approach: (1) identifying problems,
(2) defining the objectives of the solution, (3) design and development, (4) prototype
display, (5) test evaluation, and (6) communication and correction [8]. This
methodology combines principles, practices and frameworks, and it meets three
objectives: (1) fitting the proposed theory in the literature, (2) providing standard
processes for design science research, and (3) providing mental models of design
science research performance and evaluation.

2.2 Network Learning Theory


The benefits of network learning (or e-learning) have been generally recognized.
Rosenberg defined network learning as using the internet to transfer various solutions
Design of a Network Learning System 177

that can strengthen the knowledge and improve performance [9]. The research of
Bryant et al. showed that the effect of e-learning is better than traditional learning [2].
Pedaste et al. indicated that in the web-based virtual learning environment, users can
explore an actual learning process [7]. Through appropriate learning arrangement, users
can even cultivate the ability of critical thinking and problem solving. Many studies
explore the willingness and behavior of using e-learning; their results show that a
learning system that is easier to use, whose use by others is easy to observe, and that
offers the opportunity to learn and try first enhances the willingness to use it [3, 4].

2.3 Lean Management


Womack et al. defined lean production as comprising five steps: defining customer value,
defining the value stream, establishing an uninterrupted workflow, pull production, and
the pursuit of perfection [10]. The most important thing is creating a corporate culture in
which everyone pursues continuous improvement. Taiichi Ohno of Toyota company
explained that Toyota pays attention to the operation time from the customer order to
the collection payment of the customer. The operation time is shortened by removing
the wastes without value creation, which includes seven waste items mentioned by
Taiichi Ohno and the eighth waste item complemented by Liker [5]. The waste items in
medical institutions are described as follows.
1. Overproduction: The instruments are more than needed. The staff has to spend extra
hours to clean and disinfect the returned and unused surgical instruments.
2. Waiting: The patient is waiting for the next process. For example, because the nursing
staff is not familiar with the surgical instruments or passes an instrument incorrectly,
the surgery operation time is forced to extend.
3. Motion: The patient or instruments must be moved from one place to another for a
specific process, and then moved back to the origin. For example, when the
equipment is insufficient, the patient has to be moved to another operating room.
4. Extra-processing: It means some of the processes performed are not required.
5. Inventory: The excessive inventory of medical items will result in expired waste.
Space can also be thought of as the concept of inventory. To accommodate patients
waiting for surgery, enough space in the waiting room is required.
6. Transportation: It means the medical personnel make unnecessary movements.
7. Defects: This refers to the waste of rework that arises when the treatment is not
done well the first time.
8. Non-utilized talent: When staff do not participate or their opinions are ignored,
good use is not made of staff time, ideas and skills, and the staff lose improving
and learning opportunities.
The wastes described above induce the extension of lead time (waiting time and
operation time). Therefore, by eliminating waste and improving the efficiency of
surgical instrument usage, patients’ waiting time for surgery can be further reduced,
and continuous improvement activities make instrument management ever more complete.

3 Research Methodology
3.1 Research Procedure
This study adopts design science approach. The procedure includes six steps as
follows.
1. Empathy to identify problems
The research problems should be clearly defined first with the illustration of the
solution value. In the current related research on surgical instruments, most of the
focus is on the establishment of instrument management standards and sterilization
tracking systems. There is less research exploring the instrument learning needs for
the operating room scrubbing nurses. In surgery, scrubbing nurses often face a
lack of instrument-related education, and there is no unified artifact from which
nursing staff can learn.
2. Define the objectives of the solution
Following the defined problem, the relevant knowledge is applied to assess the
availability of the solution and infer the goal. The goal of this study is to develop a
surgical instrument network learning system. Therefore, it is needed to establish the
correspondence between the name of surgical instrument and the name of surgical
operation. In addition, the data required for subsequent system development should
also be considered.
3. Design and development
At this stage, the expected function of the output artifact and its conceptual
framework must be determined. Then, the actual output artifact is constructed. This
study integrates the existing research concepts and methods to provide a surgical
instrument network learning system suitable for operating room nursing staff.
4. Prototype display
At this stage, it is necessary to show how to use the output artifact of the third step
to solve the problems on learning.
5. Test evaluation
The objectives of the solution defined in the second step must be compared with the
system produced in the fourth step. It observes and measures whether the output
proposed in the third step can help to solve the problem well. At the end of this
stage, researchers can decide whether to go back to the third step to enhance output
efficiency and give the subsequent improvement suggestions.
This study selected system learning satisfaction and self-learning performance
satisfaction as the evaluation criteria to measure whether the system is feasible.
Finally, the method and implementation process of the research are discussed and
explained with the informatics nurses who are responsible for the system
development.
6. Communication and correction
After the research is proposed, communication with other related people or
professionals about the research issues, the importance, practicality and novelty of
the output, and the rigor and efficiency of the design is essential. The design
prototype is thereby improved.

3.2 Research Structure


The development of the research structure is based on the research motivation and
purpose as shown in Fig. 1.
1. Lean instrument management process: The optimization of surgical instrument
management process is proceeded.
2. Establishing surgical instruments network learning system: Through needs confir-
mation, the effective learning method is obtained. Then, network learning system
development and learning effectiveness evaluation are conducted.

Fig. 1. Research structure

4 Design of Surgical Instrument Network Learning System

In accordance with the six steps of design science and the structure of this study, the
design and development of the surgical instruments network learning system are
described as follows.

4.1 Needs Survey of Surgical Instrument Network Learning System Design
The design and development concept of this system is based on the learning needs of
new staff in the operating room. The teaching suggestions of nursing preceptors are
included as well. The in-depth interviews with semi-structured questionnaires are
conducted. In addition, the qualitative data in the operating room new staff symposium
of a medical center and questionnaire survey with open questions on the working
environment adaptation of new staff are collected for two years. With applied empathy,
the problem is identified and the needs survey is made to obtain the interview results of
nursing preceptors and new nursing staff, respectively.

4.2 Network Learning System Structure and Database Establishment


The surgical instrument network learning system is developed using JAVA program-
ming language and the Oracle database system. As shown in Fig. 2, when the learner logs
in to the web page, there are two categories: basic instruments and department
instruments. Department instruments are divided according to the surgical
specialty. After the selection of the surgical specialty, the learner can then choose the
instrument area, teaching video area, test area or satisfaction area. For the instrument
area, there are instrument pictures of the whole surgical instrument package. The
learner can first test whether he/she knows the name of the instrument and its function.
Then, when the mouse pointer is moved to the picture, it will display name and
function of the instrument. To watch the video of important surgery process, the learner
can click on the teaching video area. If the learner chooses test area, it will play a video
simulating surgical scenario, and then ask the learner to choose the suitable instrument.
The score can be queried after the test. For the satisfaction area, learners can fill in the
satisfaction questionnaire.

Fig. 2. Website structure
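As a concrete illustration, the linkage between instrument pictures and their names and functions described above can be sketched as a small in-memory model. This is a hypothetical simplification: the actual system stores these data in an Oracle relational database, and every class, field and instrument name below is an illustrative assumption, not the system's real schema.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative in-memory model of the instrument catalogue described in Sect. 4.2.
// Class, field and method names are assumptions, not the system's actual schema.
class Instrument {
    final String name;       // instrument name shown to the learner
    final String function;   // main function of the instrument
    final String imagePath;  // path of the instrument picture on the web page

    Instrument(String name, String function, String imagePath) {
        this.name = name;
        this.function = function;
        this.imagePath = imagePath;
    }
}

class InstrumentCatalogue {
    // surgical specialty -> instruments of that specialty's surgical package
    private final Map<String, List<Instrument>> bySpecialty = new HashMap<>();

    void addPackage(String specialty, List<Instrument> instruments) {
        bySpecialty.put(specialty, instruments);
    }

    // Mimics what the web page displays when the mouse pointer is moved
    // onto an instrument picture: the instrument's name and function.
    String tooltipFor(String specialty, int index) {
        Instrument i = bySpecialty.get(specialty).get(index);
        return i.name + ": " + i.function;
    }
}
```

Hovering over the first picture of a hypothetical "General Surgery" package would then display that instrument's name followed by its function.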

4.3 Effectiveness Evaluation of the Surgical Instrument Learning


This study conducted a satisfaction survey on the web content of the surgical instru-
ment network learning system. The self-made questionnaire, whose validity was
evaluated by two nursing directors and three attending surgeons, contains ten
satisfaction-related questions: four on web design content, three on online web
learning, two on self-learning effectiveness and one on overall appeal. Scoring is
based on a five-point Likert scale. From the answers of 64 new operating room
nursing staff employed for less than six months, the preliminary analysis shows
satisfaction of up to 100% on web design
content, online web learning and overall appeal. The satisfaction is also up to 99.2% on
self-learning effectiveness. This indicates that the established system meets the
learning patterns of new nursing staff.
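The reported percentages are proportions of respondents answering an item favourably on the five-point Likert scale. The computation can be sketched as follows; note that treating scores of 4 and 5 as "satisfied" is an assumed convention for illustration, since the paper does not state its exact scoring rule.

```java
// Percentage of Likert responses (1-5) counted as satisfied.
// Treating scores >= 4 as "satisfied" is an assumed convention,
// not a rule taken from the paper.
class LikertSummary {
    static double satisfactionPercent(int[] scores) {
        int satisfied = 0;
        for (int s : scores) {
            if (s >= 4) {
                satisfied++;
            }
        }
        return 100.0 * satisfied / scores.length;
    }
}
```

For instance, 3 satisfied answers out of 4 would yield 75.0%.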

5 Conclusions

The purpose of this study is to design and develop a surgical instrument network
learning system to optimize the surgical device management process and improve the
satisfaction of surgical instrument network learning. The system design is divided into
four blocks, which are instrument area, teaching video area, test area and satisfaction
survey area. The survey results show that the study objects are satisfied with the
enhancement of the instrument knowledge and ability to apply what they have learned
to the work. There is also a positive correlation between perceived learning
effectiveness and overall appeal. Basically, the attitudes of network learning system
users are affected by their subjective perception of the system. Therefore, when
planning a network learning system, a simple and clear design is important, with an
interface offering clear guidance to increase ease of use.

References
1. Bertoncelj, A.: Manager’s competencies framework: a study of conative component.
Ekonomska Istrazivanja 23(4), 91–101 (2010)
2. Bryant, K., Campbell, J., Kerr, D.: Impact of web based flexible learning on academic
performance in information systems. J. Inf. Syst. Educ. 14, 41–50 (2003)
3. Hsu, C.-N.: Combining innovation diffusion theory with technology acceptance model to
investigate business employees’ behavioral intentions to use e-learning system. Master’s
thesis. National Central University, Taoyuan City, Taiwan (2010)
4. Hsu, S.-C., Liu, C.-F., Weng, R.-H., Chen, C.-J.: Factors influencing nurses’ intentions
toward the use of mobile electronic medical records. Comput. Inform. Nurs. 31, 124–132
(2013)
5. Liker, J.K.: The Toyota Way: 14 Management Principles from the World’s Greatest
Manufacturer. McGraw Hill, New York (2004)
6. March, S., Smith, G.: Design and natural science research on information technology. Decis.
Support Syst. 15, 251–266 (1995)
7. Pedaste, M., Sarapuu, T.: Developing an effective support system for inquiry learning in a
Web-based environment. J. Comput. Assist. Learn. 22(1), 47–62 (2006)
8. Peffers, K., Tuunanen, T., Rothenberger, M.A., Chatterjee, S.: A design science research
methodology for information systems research. J. Manag. Inf. Syst. 24(3), 45–77 (2007)
9. Rosenberg, M.J.: E-learning: Strategies for Delivering Knowledge in the Digital Age.
McGraw-Hill, New York (2001)
10. Womack, J.P., Jones, D.T.: Banish Waste and Create Wealth in your Corporation. Free
Press, New York (2003)
CS1 and CS2 Curriculum Recommendations:
Learning from the Past to Try
not to Rediscover the Wheel Again

Sónia Rolland Sobral(&)

REMIT, Universidade Portucalense, Porto, Portugal


[email protected]

Abstract. Initial programming curricular units are of great importance to com-
puter courses. There has been very important work with curriculum recommen-
dations, notably those from Association for Computing Machinery (ACM) and
later ACM in conjunction with the Institute of Electrical and Electronics Engineers
(IEEE): so far almost twenty curriculum recommendations have been published.
Computing is a constantly evolving area, as is society and the way new
generations learn. Why are so many recommendations needed? And what are
the developments in these recommendations?
This article lists initial course units that are suggested or used as examples in
each of the curriculum recommendation reports, both initial and generic, and
those that address a specific area, namely Computer Engineering (CE), Com-
puter Science (CS), Information Systems (IS), Information Technology (IT), and
Software Engineering (SE).
This study is of great importance for those who have the responsibility to
design and redesign curricula as it points out a number of different paths, namely
in relation to the already mentioned distinction by areas but also the distinction
that is made by university size, previous knowledge of the students, and also
duration of studies, among other variables. Knowing history makes it possible to
understand the present and even make better choices for the future.

Keywords: CS1 · CS2 · Curriculum recommendations

1 Introduction

The initial programming curricular units are of great importance to the academic life
and professional future of a computer science student. The content of
these units, the objectives, the programming languages, and the way everything is
learned and taught needs a lot of thought to be successful.
Since the 1960s, efforts have been made to produce curriculum recommendations for
universities around the world: the work of the Association for Computing Machinery
(ACM), later in conjunction with the Institute of Electrical and Electronics Engineers
(IEEE), is of huge relevance to the computer science field. The lessons of those studies
should be learned, even when they seem outdated or no longer make sense.
Of course, these documents are closely linked to the emergence of new programming
languages and paradigms, but they all have in common the need to give directions to the

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 182–191, 2020.
https://doi.org/10.1007/978-3-030-45697-9_18
CS1 and CS2 Curriculum Recommendations 183

course directors and teachers of these curricular units. Evolution is very important, but
it is even more important not to follow trends merely for the sake of modernity or
because another university has made that change.
At the beginning of this century, a distinction was made between Computer
Engineering (CE), Computer Science (CS), Information Systems (IS), Information
Technology (IT), and Software Engineering (SE). The newer curriculum reports thus
came to focus on a specific area, setting out the core and elective areas for each of the
specific cases.
This article gives a description of each of the curriculum recommendation docu-
ments and identifies key points for the initial programming units. A reflection is made
on the path travelled, the present and the clues for the coming years.

2 The Curriculum Recommendations Reports

The first curriculum studies for undergraduate studies in Computer Science appeared in
March 1968, when the Association for Computing Machinery (ACM) published an
innovative and necessary document, Curriculum 68: Recommendations for academic
programs in computer science [1], with early indications of curriculum models for
programs in computer science and computer engineering.
With the emergence of many new courses and departments, ACM published a new
report, Curriculum’78: recommendations for the undergraduate program in computer
science [2], updating Curriculum 68.
Despite the importance of Curriculum’78 there has been much discussion, particularly
regarding the sequence CS1 and CS2¹. In 1984 a new report was published,
“Recommended curriculum for CS1, 1984” [3], and in 1985 a further document
appeared, “Recommended curriculum for CS2, 1984” [4].
In 1991 [5] IEEE (Institute of Electrical and Electronics Engineers) and ACM
joined for a new document (this report does not contain a single prescription of courses
for all undergraduate programs, but a collection of subject matter modules called
knowledge units).
In 2005, the Computing Curriculum 2005: The Overview Report [6], covering
undergraduate degree programs in Computer Engineering, Computer Science, Infor-
mation Systems, Information Technology, and Software Engineering; provides under-
graduate curriculum guidelines for five defined sub-disciplines of computing:
(1) Computer Engineering (Curriculum Guidelines for Undergraduate Degree Pro-
grams in Computer Engineering 2004 ([7] and then 2016 [8]).
(2) Computer Science (Computing Curricula 2001 [9], then in 2008 [10] and 2013 [11])
(3) Information Systems (Association for Computing Machinery (ACM), Association
for Information Systems (AIS) and Association of Information Technology
Professionals (AITP) published Model Curriculum and Guidelines for Under-
graduate Degree Programs in Information Systems 1997 [12], 2002 [13] and
2010 [14]).
(4) Information Technology (Curriculum Guidelines for Undergraduate Degree Pro-
grams in Information Technology 2008 [15] and 2017 [16])
(5) Software Engineering (Curriculum Guidelines for Undergraduate Degree Pro-
grams in Software Engineering 2004 [17] and 2014 [18]).

¹ The terms CS1 and CS2 have been used since 1978 [2] to designate the first two courses in the
introductory sequence of a computer science program: introduction to programming courses as
CS1 and basic data structures courses as CS2; that is, Computer Programming I (CS1) as the
initial unit, prerequisite for Computer Programming II (CS2).
Since Curriculum’68 [1] there have been six new general reports and two or three more
for each of the five specific areas defined in 2005 by CC2005. Why are there so many
recommendation reports? The massification of the World Wide Web, laptops, mobile
phones, object-oriented paradigms and the emphasis on security, among others, but also
pedagogical issues such as computer-mediated distance learning and the new world of
collaborative tools. These changes require curriculum changes: because everything changes.

3 The Freshman Computer Science Curricular Units

3.1 From 1968 Until the Division by Areas


Curriculum 68 [1] defines a “classification of subject areas contained in computer
science and twenty-two courses in these areas.” Prerequisites, catalog descriptions,
detailed sketches, and annotated bibliographies were included for each of these courses.
The course has an initial structure as you can see in the following Fig. 1. Two courses
are proposed for a first year: B1. Introduction to Computing and B2. Computers and
Programming, both with the 2-2-3 model: two hours of lectures and two hours of
laboratory per week for a total of three semester hours of credit.

Fig. 1. Core courses of the proposed undergraduate program [1]
Curriculum’78 [2] introduced a new core structure, as shown in the following
Fig. 2. Computer Programming I (CS1) is the initial unit, prerequisite for Computer
Programming II (CS2), which is prerequisite for Introduction to Computer Systems
(CS3), Introduction to Computer Organization (CS4), and Introduction to File Pro-
cessing (CS5). CS1 and CS2 have the same model as the initial course units of the
previous report: 2-2-3.

Fig. 2. Computer science core curriculum [2]

Curriculum’91 [5] has made a change from previous recommendations: this report
does not contain a single prescription of courses for all undergraduate programs. Each
knowledge unit corresponds to a topic that must be covered at some point during the
undergraduate curriculum. It contains “a set of curricular and pedagogical considera-
tions that govern the mapping of the common requirements and advanced/supplemental
material into a complete undergraduate degree program” and a collection of subject
matter modules called knowledge units that comprise the common requirements for all
undergraduate programs. Each individual institution has the flexibility to assemble the
knowledge units into course structures that fit its particular needs. The reason given
is that “a curriculum for a particular program depends on many factors, such as the
purpose of the program, the strengths of the faculty, the backgrounds and goals of the
students, instructional support resources, infrastructure support and, where desired,
accreditation criteria. Each curriculum will be site-specific, shaped by those responsible
for the program who must consider factors such as institutional goals, opportunities and
constraints, local resources, and the preparation of the students”. The appendix of the
full report contains the 12 sample curricula, showing how the knowledge units can be
combined to form courses and programs (see Table 1): Implementations A to L: A
Program in Computer Engineering, in Computer Engineering (Breadth-First), in
Computer Engineering (Minimal Number of Credit-Hours), in Computer Science, in
Computer Science (Breadth-First), in Computer Science (Theoretical Emphasis), in
Computer Science (Software Engineering Emphasis), a Liberal Arts Program in
Computer Science (Breadth-First), a Program in Computer Science and Engineering, a
Liberal Arts Program in Computer Science, a Liberal Arts Program in Computer
Science (Breadth-First) and a Program in Computer Science (Theoretical Emphasis).
The first nine sample curricula have “as a common goal the preparation of graduates for
entry into the computing profession” and the last three have other goals than preparation
for entry into the profession, such as “preparation for a lifetime of learning, breadth of
education or preparation for graduate study”.

Table 1. Freshman Year, CS, each of 12 sample curricula.

   | Program                                                          | 1st Semester                                    | 2nd Semester
A  | Program in Computer Engineering                                  | Introduction to Computing I                     | Introduction to Computing II
B  | Program in Computer Engineering (Breadth-First)                  | Prob Solving, Abstraction, Programs & Computers | Data Struct & Lg Softw Syst
C  | Program in Computer Engineering (Minimal Number of Credit-Hours) | Intro to Prob Solv w/Comput                     | Intro to Software Engr
D  | Program in Computer Science                                      | Introduction to Computing I                     | Introduction to Computing II
E  | Program in Computer Science (Breadth-First)                      | Prob Solving, Abstraction, Programs & Computers | Data Struct & Lg Softw Syst
F  | Program in Computer Science (Theoretical Emphasis)               | Computing I                                     | Computing II
G  | Program in Computer Science (Software Engineering Emphasis)      | Introduction to Software Engineering            | Software Methodology
H  | Liberal Arts Program in Computer Science (Breadth-First)         | Prob Solving, Abstraction, Programs & Computers | Data Struct & Lg Softw Syst
I  | Program in Computer Science and Engineering                      | Introduction to Computing I                     | Introduction to Computing II
J  | Liberal Arts Program in Computer Science                         | Fundamentals of Computing I                     | Fundamentals of Computing II
K  | Liberal Arts Program in Computer Science (Breadth-First)         | Prob Solving, Abstraction, Programs & Computers | Data Struct & Lg Softw Syst
L  | Program in Computer Science (Theoretical Emphasis)               | Introduction to CS I                            | Introduction to CS II

3.2 Computer Engineering


CECurriculum2004 (Curriculum Guidelines for Undergraduate Degree Programs in
Computer Engineering 2004) [7] defined four different examples: Computer Science
Department, Electrical & Computer Engineering Department, Joint Computer Science
and Electrical Engineering Departments, and United Kingdom; each one has different
curricular units for the first year, as we can see in Table 2.
CS1 and CS2 Curriculum Recommendations 187

Table 2. Computer Engineering, first year curricular units, CS, 2004.

  | Department model                                              | 1st Semester                          | 2nd Semester
A | Computer Science Department                                   | CSCA101 Computer Science I            | CSCA102 Computer Science II
B | Electrical & Computer Engineering Department                  | CSCB101 Programming & Prob. Solving I | CSCB102 Programming & Prob. Solving II
C | Joint Computer Science and Electrical Engineering Departments | CSCC102 Programming I                 | CSCC103 Programming II, MTH 101
D | United Kingdom                                                | SWED101 Programming Basics            | SWED102 Programming Fundamentals

CECurriculum2016 (Joint Task Group on Computer Engineering Curricula [8])
defined five different models: Four-Year Model, Administered by Computer Science,
Administered jointly by CS and EE, Administered in China, and Bologna-3 Model
(Table 3).

Table 3. Computer Engineering, first year curricular units, CS, 2016.

Model                             | 1st Semester                                 | 2nd Semester
Four-Year Model                   | CSCA101 Introduction to Computer Programming | CSCA102 Intermediate Computer Programming
Administered by Computer Science  | CSCB101 Computer Science I                   | CSCB102 Computer Science II
Administered jointly by CS and EE | CSCC101 Programming Fundamentals I           |
Administered in China             | CSTD 101 Fundamentals of Programming         | CSTD 201 Fundamentals of Object-oriented Programming
Bologna-3 Model                   | SWEE101 Programming Basics                   | SWEE102 Programming Fundamentals

3.3 Computer Science


CC2001 [9] presented six different implementation strategies for introductory courses:
imperative first, objects first, functional first, breadth first, algorithms first and
hardware first. It suggested moving the CS1–CS2 sequence to three units (101, 102
and 103) or, where that is not feasible, to the two-course sequence 111 and 112, as we
can see in Table 4.

Table 4. Three- and two-course sequences for each implementation strategy, CC2001.

Imperative first
  Three-course sequence: CS101I Programming Fundamentals; CS102I The Object-Oriented Paradigm; CS103I Data Structures and Algorithms
  Two-course sequence: CS111I Introduction to Programming; CS112I Data Abstraction
Objects first
  Three-course sequence: CS101O Introduction to Object-Oriented Programming; CS102O Objects and Data Abstraction; CS103O Algorithms and Data Structures
  Two-course sequence: CS111O Object-Oriented Programming; CS112O Object-Oriented Design and Methodology
Functional first
  Two-course sequence: CS111F Introduction to Functional Programming; CS112F Objects and Algorithms
Breadth first
  A one-semester course (CS100B) that serves as a prerequisite, or a preliminary implementation of a breadth-first introductory sequence (CS101B/102B/103B) that seeks to accomplish in three semesters what has proven to be so difficult in two
Algorithms first
  Two-course sequence: CS111A Introduction to Algorithms and Applications; CS112A Programming Methodology
Hardware first
  Two-course sequence: CS111H Introduction to the Computer; CS112H Object-Oriented Programming Techniques

CS2008 [10] “only” updates the CS 2001 Body of Knowledge, adding commentary and
advice in the accompanying text.
The CS2013 report [11] includes examples of courses from a variety of universities
and colleges to illustrate how topics in the Knowledge Areas may be covered and
combined in diverse ways. It has a separate chapter discussing introductory courses,
identifying the factors and tradeoffs that must be considered when deciding what
should be covered early in a curriculum: design dimensions, pathways through
introductory courses, programming focus, programming paradigm and choice of
language, software development practices, parallel processing, platform, and mapping
to the Body of Knowledge. The examples of initial courses include: CS1101
Introduction to Program Design (WPI, Worcester, MA); COS 126 General Computer
Science (Princeton University, NJ); the background course CS 106A Programming
Methodology (Stanford University); the sequence CS 115 Introduction to Computer
Programming and CS 215 Introduction to Program Design, Abstraction and Problem
Solving (Bluegrass Community and Technical College); and CSCI 134 Introduction to
Computer Science and CSCI 136 Data Structures and Advanced Programming
(Williams College).

3.4 Information Systems


For Information Systems, the Association for Computing Machinery (ACM), the
Association for Information Systems (AIS) and the Association of Information
Technology Professionals (AITP) published the Model Curriculum and Guidelines for
Undergraduate Degree Programs in Information Systems in 1997 [12], 2002 [13] and
2010 [14]. ISCurriculum’97 presented IS’97.5 – Programming, Data, File and Object
Structures. ISCurriculum2002 presented IS 2002.5 – Programming, Data, File and
Object Structures, updating IS’97.5. IS2010 removed application development from
the prescribed core: “Application development can still be offered in most IS
programs. By offering application development as an elective the IS 2010 model
curriculum increases its reach into nonbusiness IS programs while also creating
flexibility for curricula that choose to include an application development course. The
programs that want to go even further and include a sequence of programming courses
can choose from approaches introduced either in the Computer Science or in the
Information Technology curriculum volumes (CS 2008 or IT 2008, respectively)”.

3.5 Information Technology


The Information Technology guidelines (Curriculum Guidelines for Undergraduate
Degree Programs in Information Technology 2008 [15] and 2017 [16]) each presented
a single course: Programming Fundamentals and ITE-SWF Software Fundamentals,
respectively.

3.6 Software Engineering


The Software Engineering guidelines (Curriculum Guidelines for Undergraduate
Degree Programs in Software Engineering 2004 [7]) presented two model curricula:
starting software engineering in the first year and starting it in the second year, as we
can see in Table 5.

Table 5. Introductory Computing Sequence, start software engineering in first or second year,
SE2004.

Model                                                  | 1st Semester                                             | 2nd Semester
A: Start software engineering in first year            | SE101 Introduction to Software Engineering and Computing | SE102 Software Engineering and Computing II
B: Introduction to software engineering in second year | CS101I Programming Fundamentals                          | CS102I The Object-Oriented Paradigm

SE2014 [18] presented two examples:

Example 1: Mississippi State University (semester type)
CSE 1284 Introduction to Computer Programming
CSE 1384 Intermediate Computer Programming

Example 2: Rose-Hulman Institute of Technology (trimester type)
CSSE 120 Introduction to Software Development
CSSE 220 Object-Oriented Software Development
CSSE 132 Introduction to Computer Systems Design

4 Conclusion

Since Curriculum’68 there have been six new general reports and two or three more for
each of the five specific areas defined in 2005 by CC2005: Computer Engineering,
Computer Science, Information Systems, Information Technology and Software
Engineering. While the initial curriculum included a lot of information about unit
content (including programming languages and even an annotated bibliography), the
per-area curricula are no longer as complete.
The first curriculum recommendations were prescriptive, while the current recom-
mendations are modular: nowadays it is more important to show examples of curricula
that can be considered successful and to implement them according to several factors,
namely the length of the degrees, the students’ knowledge from secondary education,
and whether the university can impose prerequisites. In this way each university
adopts the recommendations, fitting them to its own reality.
The objectives of a degree program (and the intended career opportunities for its
undergraduate students) are crucial for curriculum design. In the case of this article,
looking only at the initial programming courses, we see the differences between areas:
the IS2010 model curriculum removed application development from the prescribed
core, while IT2008 and IT2017 each presented a single course.
In this context, and after listing the introductory programming curricular units for
the first year of undergraduate degrees, it would be interesting to survey the current
curricula of the best universities in the world and to see what is being done on each
continent, in each country and in each area.

References
1. Atchison, W.F., Conte, S.D., Hamblen, J.W., Hull, T.E., Keenan, T.A., Kehl, W.B.,
McCluskey, E.J., Navarro, S.O., Rheinboldt, W.C., Schweppe, E.J., Viavant, W., Young Jr.,
D.M.: Curriculum 68: recommendations for academic programs in computer science: a
report of the ACM curriculum committee on computer science. Commun. ACM 11(3), 151–
197 (1968)
2. Austing, R.H., Barnes, B.H., Bonnette, D.T., Engel, G.L., Stokes, G.: Curriculum ‘78:
recommendations for the undergraduate program in computer science—a report of the ACM
curriculum committee on computer science. Commun. ACM 22(3), 147–166 (1979)
3. Koffman, E.B., Miller, P.L., Wardle, C.E.: Recommended curriculum for CS1. Commun.
ACM 27(10), 998–1001 (1984)
4. Koffman, E.B., Stemple, D., Wardle, C.E.: Recommended curriculum for CS2, 1984: a
report of the ACM curriculum task force for CS2. Commun. ACM 28(8), 815–818 (1985)

5. Tucker, A.B.: ACM/IEEE-CS Joint Curriculum Task Force. Computing curricula 1991:
report of the ACM/IEEE-CS Joint Curriculum Task Force, p. 154. ACM Press (1990)
6. The Joint Task Force for Computing Curricula 2005, “Computing Curricula 2005: The
Overview Report”. ACM (2005)
7. The Joint Task Force on Computing Curricula, “Curriculum Guidelines for Undergraduate
Degree Programs in Software Engineering”. ACM (2004)
8. Joint Task Group on Computer Engineering Curricula, “CE2016: Computer Engineering
Curricula 2016”. ACM (2016)
9. The Joint Task Force IEEE and ACM, “CC2001 Computer Science, Final Report” (2001)
10. Cassel, L., Clements, A., Davies, G., Guzdial, M., McCauley, R.: Computer Science
Curriculum 2008: An Interim Revision of CS 2001. ACM (2008)
11. Task force ACM e IEEE, “Computer Science Curricula 2013,” ACM and the IEEE
Computer Society (2013)
12. Davis, G.B., Gorgone, J.T., Couger, J.D., Feinstein, D.L., Longenecker Jr., H.E.: Model
Curriculum and Guidelines for Undergraduate Degree Programs in Information Systems.
ACM (1997)
13. Gorgone, J.T., Davis, G.B., Valacich, J.S., Topi, H., Feinstein, D.L., Longenecker Jr., H.E.:
IS2002: Curriculum Guidelines for Undergraduate Degree Programs in Information
Systems. ACM (2002)
14. Topi, H., Valacich, J.S., Wright, R.T., Kaiser, K.M., Nunamaker Jr., J., Sipior, J.C., de
Vreede, G.: IS2010 Curriculum Update: Curriculum Guidelines for Undergraduate Degree
Programs in Information Systems. ACM (2010)
15. Lunt, B.M., Ekstrom, J.J., Gorka, S., Hislop, G., Kamali, R., Lawson, E., LeBlanc, R.,
Miller, J., Reichgelt, H.: IT2008: Computing Curricula Information Technology Volume.
ACM (2008)
16. Task Group on Information Technology Curricula, “IT2017: Curriculum Guidelines for
Baccalaureate Degree Programs in Information Technology”. ACM (2017)
17. The Joint Task Force on Computing Curricula, “SE2004: Curriculum Guidelines for
Undergraduate Degree Programs in Software Engineering”. ACM (2004)
18. Joint Task Force on Computing Curricula, “SE2014: Curriculum Guidelines for Under-
graduate Degree Programs in Software Engineering”. ACM (2014)
19. Koffman, E.B., Stemple, D., Wardle, C.E.: Recommended curriculum for CS2, 1984.
Commun. ACM 28(8), 815–818 (1985)
On the Role of Python
in Programming-Related Courses
for Computer Science and Engineering
Academic Education

Costin Bădică1(B), Amelia Bădică2, Mirjana Ivanović3, Ionuţ Dorinel Murareţu1,
Daniela Popescu2, and Cristinel Ungureanu1

1 Department of Computers and Information Technology, University of Craiova,
Bvd. Decebal 107, Craiova, Romania
[email protected]
2 Faculty of Economics and Business Administration, University of Craiova,
Str. A.I. Cuza 13, Craiova, Romania
3 Faculty of Sciences, University of Novi Sad, Novi Sad, Serbia

Abstract. In this paper we report our approach and experiences
concerning the introduction of the Python programming language in
programming-related academic curricula. Firstly we motivate our choice
of and approach to Python. Then
we discuss the results obtained in two courses that we taught to computer
science and engineering students, both with a strong focus on developing
students’ practical programming skills: Algorithm Design and Artificial
Intelligence. We report our approach and findings, including identified
difficulties and obtained results, as well as proposed future improvements.

Keywords: Computer programming · Python · Computer science and
engineering education

1 Introduction
Computer literacy, including computer programming, has received much
stronger attention in primary, secondary, and academic education during
the last decade, as compared to previous decades. This trend spans virtually all
the application domains of science and engineering. While in the past regarded
as a specific computer science and engineering skill, computer programming is
now seen as playing a major role in teaching many academic disciplines with
a computation-oriented focus in engineering, as well as in the natural and social
sciences. At the same time, the list of programming languages has expanded a lot,
with new languages emerging to address the recent needs of developers on
the various devices, platforms, and networks in use nowadays.
These contexts and trends raise new challenges for computer science and engi-
neering teachers and educators in designing novel methods and approaches of
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 192–202, 2020.
https://doi.org/10.1007/978-3-030-45697-9_19
On the Role of Python 193

teaching programming-related academic disciplines, supported by appropriately


chosen languages and tools (Python in particular) [12]. Their overall goal is to
better align with the current requirements and challenges set by the augmented
digital environment, as well as to meet the high quality and standards required
by traditional computer science and engineering education [11]. Sharing educa-
tional experiences regarding the use of modern information technologies and
programming languages enables knowledge dissemination and reuse, and in our
opinion this is extremely important, taking into account the rapid advances in
these fields [1].
Recent years have witnessed remarkable growth in the use of Python in
industry and academia. Moreover, this trend can be observed not only in
computer science and engineering education, but also in other computationally
focused disciplines of science and engineering such as economics, physics, and
electrical engineering. Consequently, we proceeded to update the computer
science and engineering curricula at the University of Craiova by including
Python topics in various contexts. Similarly, the Faculty of Sciences at the
University of Novi Sad introduced several elective courses based on Python
(Data Structures and Algorithms, Social Networks Mining). Students from
different study programs can select these courses if they prefer to learn and
use Python programming. As the courses were only recently introduced,
students have not yet had the opportunity to select them.
Many related efforts are reported in the literature. An early work is [7], which
presents a comparison of C, MATLAB, and Python for teaching programming
in engineering. Experiences from object-oriented programming courses taught
in three institutions in different countries are presented and compared in [10].
The considered courses were based on several programming languages, including
Java and Python. A closely related approach combining Python with Logo for
computer science education is introduced in [8]. We have also noticed a huge
interest in providing students with online interactive learning resources for
Python programming [3,15].
The aim of this paper is to present the approach and experiences concern-
ing the introduction of Python in programming-related computer science and
engineering curricula at the University of Craiova. These experiences could be
useful to other universities and faculties adopting similar changes; for example,
at the Faculty of Sciences in Novi Sad, some further improvements could be
initiated.
The paper is structured as follows. In Sect. 2 we motivate the choice and
approach regarding the use of Python programming language. Then we intro-
duce the context of the work that is centered around educational experiences of
using Python programming language for teaching Algorithm Design and Artifi-
cial Intelligence classes at the University of Craiova. The courses are delivered
during the first half of the students’ curricula, in first year second semester and
second year second semester. Both courses have a strong focus on developing stu-
dents’ practical programming skills. In Sect. 3 we introduce our approach and
then we discuss difficulties encountered and results obtained in these courses
194 C. Bădică et al.

respectively. In Sect. 4 we present conclusions and outline possible paths of
future development.

2 Context of the Work


2.1 Python Programming Language
Python [14] is an interpreted, multi-paradigm, high-level, general-purpose pro-
gramming language proposed in the early 1990s with a clear emphasis on
improving code readability. With the emergence of Python 2.0 in 2000, the
academic community’s interest in Python grew rapidly, and one can find many
scientific papers and technical reports supporting the suitability of Python as
a first programming language in computer programming education (see the
early work [7] and the references cited there). On the other hand, there is a lot
of debate in computing education regarding the right choice of the first and
second taught programming languages [9,16].
Python is interpreted rather than compiled, unlike C for example. It is multi-
paradigm, supporting imperative, object-oriented and declarative functional
programming. These features make it an ideal candidate for complementing C
(an imperative, compiled, lower-level systems programming language) when
introducing the basics of algorithm design and development. Python is close
to pseudocode and it greatly supports fast prototyping and algorithm testing
through a variety of tools. Last but not least, Python benefits from a rich and
growing set of packages and libraries that support programming tasks in a
wide range of application domains (see for example [13], which introduces a
Python package for supporting physics laboratories).
There are several relatively independent perspectives from which one can nowa-
days judge the popularity of a programming language, taking into account
diverse factors such as business adoption, job opportunities, and educational
support. Moreover, there is currently a trend in industry and academia of
estimating the (possibly future) popularity of programming languages using
empirical data analysis, producing “language rankings”. Such more or less
objective rankings differ in purpose, in the methodology employed to obtain
them, and in the data sources on which the analysis is based.
To support our decision to adopt Python in our courses, we considered
Python’s popularity as reflected by the IEEE Top Programming Languages
and Project Euler, two initiatives that differ in scope and purpose and that we
consider highly relevant to our goal. While the IEEE Top Programming Lan-
guages has a very broad scope and aims to provide general and scientifically
sound rankings of programming languages based on rigorous data analysis
metrics, Project Euler has a very narrow scope, aiming to provide a solid
training environment for people interested in mastering problem solving at the
border of mathematics and computer science, using computer programming.
The IEEE Top Programming Languages [4] is a data-powered interactive
application that has been developed and updated since 2014. The application
is promoted by the IEEE Spectrum online magazine and is released yearly
with updated data extracted from many relevant sources: Google Search, Google
Trends, Twitter, GitHub, Stack Overflow, Reddit, Hacker News, CareerBuilder,
IEEE Job Site, and IEEE Xplore. The selection of data sources has very broad
coverage, including both academically and industrially relevant repositories. The
application is configured to support four default rankings (IEEE Spectrum,
Trending, Jobs, Open). The ranking is adjustable by manually configuring the
weights of the included data sources.

Table 1. Python yearly rankings by IEEE Top Programming Languages

Year(s)    | Spectrum | Trending | Job | Open
2019       | 1        | 1        | 1   | 1
2017–2018  | 1        | 1        | 1   | 1
2016       | 3        | 3        | 3   | 2
2015       | 1        | 1        | 1   | 1
2014       | 4        | 4        | 4   | 4
Table 1 presents Python’s yearly rankings in the IEEE Top Programming
Languages. The table clearly shows that Python has held the highest rank
during the last three years, independently of the ranking criteria.
Project Euler serves a community of users interested in mastering problem
solving at the border of computer science and mathematics using algorithms
and computer programming. The community thus has both educational and
training goals as well as intellectually challenging ones.
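To give a flavor of the entry-level problems, Project Euler's well-known Problem 1 asks for the sum of all natural numbers below 1000 that are multiples of 3 or 5; the one-line solution below is our own illustration, and also shows the declarative style Python enables:

```python
# Project Euler, Problem 1: sum of the multiples of 3 or 5 below 1000.
total = sum(n for n in range(1000) if n % 3 == 0 or n % 5 == 0)
print(total)  # 233168
```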
Currently there are 687 problems in Project Euler and their number is con-
stantly growing. The community has 957,235 registered members who have
solved at least one problem, using 100 programming languages [6]. Table 2
presents the five programming languages with the largest numbers of users in
Project Euler, showing that Python holds the top position.

Table 2. The top five numbers of persons using a given programming language in
Project Euler, among 957,235 registered members actively using 100 programming
languages (values recorded on November 9, 2019)

Language | Python | C/C++ | Java  | C#    | Haskell
Members  | 55080  | 43966 | 29132 | 13705 | 6924

This simple analysis, based on sound quantitative figures spanning both the
academic and business worlds, clearly motivates our choice to augment
programming-related courses in computer science and engineering curricula to
also cover Python.

2.2 Programming-Related Courses

We briefly introduce content and methodology of two courses that we teach to


computer science and engineering students at the University of Craiova: Algo-
rithm Design (AD hereafter) and Artificial Intelligence (AI hereafter). While
these courses have distinct goals, aiming to endow students with different learn-
ing outcomes, they both share a strong focus on developing students’ practical
programming skills. Both courses are taught in the first half of the four-year
Bachelor program, thus providing students with the solid base required for the
second half of the program.

Algorithm Design Course. The AD course is held in the second semester


of the first year of study in computer science and engineering curricula. The
focus of AD course is on sound development of algorithmic problem solving,
including analysis and design, as well as robust implementation and evaluation
using contemporary programming languages (C and Python in our case). The
learning objectives of AD course are as follows:

– LO1: Principles of algorithm analysis;


– LO2: Principles of data abstraction and modular programming;
– LO3: Fundamental algorithms and methods of algorithm design;
– LO4: Practical experience in programming small-scale experiments involving
implementation, testing and evaluation of algorithms.

The topics that we cover during our 14-week AD course are as follows: intro-
duction to the analysis and design of algorithms, divide and conquer, correctness
and testing of algorithms, sorting algorithms, abstract data types and lists, stacks
and queues, graphs and trees, dynamic programming, greedy algorithms, back-
tracking, and an introduction to NP-completeness. A similar mandatory course
is also part of the curriculum at the Faculty of Sciences in Novi Sad. It is based
on the Java programming language, as it immediately follows the “Introduction
to Programming in Java” course. However, having in mind the popularity of
Python in local ICT companies as well, we decided to offer a similar course in
Python as an elective. Some basic elements of Python are planned to be
presented to the students at the beginning of that course.
Practical work is focused on implementation and experiments with algo-
rithms using Standard C and Python programming languages. Students are
encouraged to use online resources [2] that we find appropriate for mastering
introductory programming.
Students benefit from being previously exposed to the basics of computer pro-
gramming using Standard C during their first semester. One of our goals is to
guide students during the AD course in using C for the efficient implementation
of algorithms. Moreover, we motivate the selection of Python as a second imple-
mentation language and then briefly expose students to Python, with a focus on
its basic elements, including program structure, functions, modules and
higher-level data structures (lists and dictionaries). As students were already
exposed to the basics of computer programming during the first semester, one
aspect that we encourage and emphasize is self-learning.
It is worth noting the key aspects that we emphasize when teaching the AD
course:

– Correctly introduce low-level aspects of data structures: pointers and linked
structures;
– Accurately evaluate the performance of different ways of encoding the same
solution/algorithm, for example imperative versus declarative programming;
– Present two different programming languages (C and Python), outlining the
differences between low-level and high-level constructs, also in relation to the
previous two items.
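The second of these points can be demonstrated in class with a small timing experiment. The sketch below is our own illustration (the function names are hypothetical, not from the course materials): it times an imperative loop against a declarative formulation of the same computation using the standard timeit module, after first checking that both produce the same result.

```python
import timeit

def sum_squares_imperative(n):
    # Explicit loop with an accumulator variable.
    total = 0
    for i in range(n):
        total += i * i
    return total

def sum_squares_declarative(n):
    # Generator expression consumed by the built-in sum().
    return sum(i * i for i in range(n))

n = 100_000
assert sum_squares_imperative(n) == sum_squares_declarative(n)

t_imp = timeit.timeit(lambda: sum_squares_imperative(n), number=20)
t_dec = timeit.timeit(lambda: sum_squares_declarative(n), number=20)
print(f"imperative: {t_imp:.3f}s  declarative: {t_dec:.3f}s")
```

The absolute numbers depend on the machine and interpreter; the pedagogical point is that students measure rather than guess which encoding is faster.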

Artificial Intelligence Course. It is worth observing that nowadays AI is
a very popular topic in computer science and engineering. Moreover, currently
established AI techniques have a wide impact in virtually every area of science
and engineering. Topics like machine learning and data science are nowadays
very trendy and highly marketed, with their relative popularity overtaking other
areas of AI (like the Semantic Web) or of Computer Science in general (like
Software Engineering), as measured with the help of the Google Trends
search-term analysis tool.
So, based on our own experience of more than 15 years in the field, as well as
by observing the tremendous growth of the machine learning and data science
fields, we postulate two tendencies in teaching introductory AI topics. The first
approach, let us call it “classic AI”, is focused on the traditional topics as found
in classic AI textbooks that are centered around reasoning and knowledge rep-
resentation. With this approach, machine learning is possibly (not necessarily!)
only marginally touched at the end of the course, after the broad coverage of
the previous two topics, or in-depth covered by a separate course, usually taught
after the introductory AI course. The second approach, let us call it “emergent
AI”, is focused on the trendy topics of machine learning, pattern recognition and
data science right from start. Moreover, this approach is often complemented
with elements of applied computational statistics in science and engineering,
thus being more appropriate for general science and engineering audience.
The AI course at the University of Craiova is held in the second semester of
the second year of study in computer science and engineering curricula. Its focus
is on understanding and experimenting with classic AI algorithms addressing
the traditional AI topics of knowledge representation and reasoning, thus being
a “classic AI” course. Its learning objectives are as follows:
– LO1: Problem solving using AI methods and algorithms;
– LO2: Declarative thinking by mastering knowledge representation and rea-
soning;
– LO3: Practical experience of logic-based AI programming;
– LO4: Practical experience of implementing AI algorithms.
The topics that we cover during our 14-week AI course are as follows: logic-
based knowledge representation and reasoning, problem solving using blind and
informed search algorithms, constraint satisfaction, probabilistic reasoning and


Bayesian networks, semantic networks for knowledge representation, planning,
introduction to machine learning (classification and regression).
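As a concrete example of the scale of practical work involved, a blind (breadth-first) search, one of the first algorithms in the search topic, fits in a few lines of Python. The following is our own minimal sketch over an explicitly given graph, not a course artifact:

```python
from collections import deque

def bfs_path(graph, start, goal):
    """Blind (breadth-first) search: return a shortest path from start to
    goal in an unweighted graph given as an adjacency dict, or None."""
    frontier = deque([[start]])   # queue of partial paths
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(path + [neighbor])
    return None

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
print(bfs_path(graph, "A", "E"))  # ['A', 'B', 'D', 'E']
```

Informed search variants (e.g. A*) can then be obtained by replacing the FIFO queue with a priority queue, which makes the family of search algorithms easy to compare in laboratory sessions.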
Students benefit from being previously exposed to the basics of computer pro-
gramming, algorithms and data structures, and object-oriented programming
using C/C++, Python, and Java during their first year and the first semester
of their second year. On the one hand, we motivate students to master
declarative logic-based thinking using the Prolog language as a programming
tool, highlighting the benefits of this endeavour. On the other hand, we aim at
providing students with the knowledge necessary to understand and implement
various AI problem solving algorithms for the topics taught in the course, using
diverse programming languages. So, with respect to this practical direction of
the course, we do not impose a specific implementation language; students have
the freedom to use a language of their choice, either new or introduced in a
previous course.

3 Results and Discussions


In this section we present and discuss the results obtained by carrying out the
AD and AI courses during the academic year 2018–2019. The courses were
delivered to the students using a blended approach, mixing face-to-face lectures
and laboratory meetings, homework assignments, as well as online content
sharing and educational task management using Google Classroom. The aim of
our analysis is to qualitatively and quantitatively evaluate the appropriateness
and effectiveness of using Python for practical programming tasks in both
courses, in compliance with the courses’ goals.

3.1 Results for AD Course


Here we have in mind two aspects: i) difficulties encountered when correctly introducing the elements of Python programming during the lectures, in compliance with the general course focus on algorithm design; ii) quantitative figures regarding students' adoption of Python in their practical work.
With respect to the first point, we identified the following educational issues and proposed the following solutions:
– Python already provides a variety of high-level container data structures, such as lists. A problem raising many students' questions is the confusion between “Python lists” and “linked lists”. Linked-list algorithms were introduced following textbook [5]. Our approach was to present their explicit C and Python implementations side by side, following identical algorithms and highlighting similarities. Then we discussed Python lists separately, highlighting differences, as well as the many other features of this high-level structure. Finally, we addressed the comprehensive perspective of abstract data types, as a general methodology for approaching the understanding and design of high-level data structures;
On the Role of Python 199

– Issues of aliasing and of shallow versus deep copying of Python compound objects were better explained using pointer diagrams. This was easier to understand after students had first been exposed to the low-level details of pointers and references, which in our opinion are better introduced using C examples. This closed the gap between high-level Python structures and the low-level details needed to correctly understand their implementation;
– Python is a multi-paradigm, high-level language; therefore the same solution can be realized in different ways, raising issues such as verbosity versus conciseness or readability versus efficiency. For example, using the high-level Python features of list and set comprehension can result in very compact representations of some algorithms. But such different solutions do not have the same efficiency. Very concise solutions can often be read as mathematical specifications, while at the same time being very inefficient in terms of running time. We addressed this issue by discussing solution features such as readability, conciseness, and theoretical time complexity, as well as by experimental evaluation of running time.
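The readability-versus-efficiency trade-off in the last point can be made concrete with a small, generic example (not taken from the actual course materials): both functions below compute the primes less than n, but the one-line comprehension is roughly quadratic, while the sieve of Eratosthenes is far cheaper.

```python
def primes_concise(n):
    # Reads like a mathematical specification, but tests every
    # divisor of every candidate: roughly O(n^2) time.
    return [p for p in range(2, n)
            if all(p % d for d in range(2, p))]

def primes_sieve(n):
    # Classic sieve of Eratosthenes: much more efficient,
    # but noticeably more verbose.
    is_prime = [True] * n
    is_prime[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, n, p):
                is_prime[multiple] = False
    return [p for p in range(n) if is_prime[p]]
```

Timing both versions on inputs in the tens of thousands, for example with the `timeit` module, makes the asymptotic gap directly visible to students.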

The second aspect concerns students' practical work during the AD course. This was organized as 2 lab assignments (LA1 and LA2) and 1 course assignment (CA), as follows:

– An assignment consisted of programming algorithms using C and possibly Python (see below). The students had to prepare non-trivial test cases and use them to experiment with the code. Finally, students had to produce technical reports describing their achievements;
– For the lab assignments, the C implementation was compulsory while the Python implementation was optional. For the course assignment, both C and Python implementations were compulsory.

From a total number of 169 students enrolled in the AD course, there were 143 submissions of LA1 (21 done also with Python), 138 submissions of LA2 (10 done also with Python), and 138 submissions of CA (90 done also with Python). Some of the results obtained for the AD assignments are presented in Table 3. Note that the use of Python in LA1 and LA2 was optional and the use of C was mandatory, while the use of both languages was mandatory in CA. This explains why the highest Python usage figures were obtained in CA.

Table 3. Assignment results in AD course. Each assignment was graded with a number
of points from 0 to 15.

        Python               C
        0–4  5–9  10–15      0–4  5–9  10–15
LA1      3    6    12         1   16   126
LA2      1    7     2         3   55    80
CA       2   18    70         3   24   101

3.2 Results for AI Course


The focus of our discussion for this course is on students' practical work. The course lectures address AI algorithms and methods, while the practical work consists of programming tasks. It was divided into 2 lab assignments and 1 course assignment (homework), as follows:

– The first lab assignment (LA1) required the use of Prolog (compulsory) to
solve logic-based reasoning and representation problems;
– The second lab assignment (LA2) required the use of Python (compulsory)
to solve an AI-based algorithmic problem;
– For the course assignment (CA), students were allowed to use a programming language of their own choice, provided they motivated their decision. They were provided with an initial list of possible options;
– Assignment tasks required students to approach AI problems and algorithms
addressing AI methods discussed during the lectures. For each solution, they
had to prepare non-trivial test cases and use them to experiment with the
code. Finally, they had to describe their achievements in a technical report.

From a total number of 194 students enrolled in the AI course, there were 140 submissions of LA2 (all done with Python) and 165 submissions of CA (121 done with Python). Some of the results obtained for the AI assignments are presented in Table 4. Results of LA1 are not shown, as LA1 involved the use of Prolog and is not relevant here. Also, the use of Python was mandatory for LA2, thus explaining the figures shown in the table. Finally, in CA, the choice of the programming language was left to the students, and we can easily notice that the highest figures were obtained with Python by students at all performance levels (low, average, and high).

Table 4. Assignment results in AI course. Each assignment was graded with a number
of points from 0 to 100.

            LA2                      CA
            0–49  50–79  80–100     0–49  50–79  80–100
Python       30    80     30         24    22     75
C/C++         −     −      −         15    11      4
Java          −     −      −          2    10      1
JavaScript    −     −      −          0     1      0

4 Conclusions and Future Work


A few conclusions can be drawn from our recently gathered experience of using the Python programming language in the AD and AI courses for computer science and engineering students, as well as from preparing the “Data Structures” and “Social Networks Mining” courses at the Faculty of Sciences, Novi Sad.
Firstly, the hints given to 1st year students during the AD course regarding self-instruction with Python were very useful. Students were advised to self-train using Python by solving simple algorithmic problems on Project Euler, and this strategy turned out to be very effective. In conclusion, students needed only minimal exposure to Python before being able to approach their Python programming tasks, in particular their course assignment. For example, most 1st year students responded well to Python adaptation, with about 75% obtaining good results on the course assignment.
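To give an idea of the scale of such warm-up tasks, the first Project Euler problem (summing the natural numbers below 1000 that are multiples of 3 or 5) reduces to a one-liner; the function name and signature here are ours, for illustration:

```python
def euler1(limit=1000):
    # Project Euler, problem 1: sum of all natural numbers
    # below `limit` that are multiples of 3 or 5.
    return sum(n for n in range(limit) if n % 3 == 0 or n % 5 == 0)
```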
Secondly, it can easily be noticed that 2nd year students did better with Python programming, which is a somewhat obvious conclusion. On the one hand, they gathered additional Python knowledge from the Object-Oriented Programming course; on the other hand, they also benefited from the very light introduction to Python that we gave during the 2017–2018 AD course (results not shown here).
Thirdly, comparing our results of the AI course from 2017–2018 (results not shown here) with the most recent ones from 2018–2019, we noticed an obvious growth in students' interest in Python. Note that students of the 2017–2018 AI course did not benefit from our introduction to Python in the previous year, 2016–2017, as this initiative only emerged in 2017–2018. So their achievements were mainly based on self-training.
As a future plan to increase and improve the use of Python in our programming-related courses, we intend to dedicate at least one lab session to teaching Python during the next AD course in 2019–2020, as well as to better motivate students to use Python for their lab assignments too, by providing them with a grade bonus. We will also expand our results by evaluating the impact of the 2018–2019 AD course regarding Python programming on this year's adoption of Python during the AI course (2019–2020).
Finally, based on the positive results obtained by introducing Python in 2 courses at the University of Craiova, as well as by proposing new Python-based elective courses at the Faculty of Sciences, Novi Sad, we should definitely try to motivate students to select these elective courses in the near future. One possibility is to deliver a crash seminar on the Python programming language, presenting just the essential elements of the language to introduce it to students.

References
1. Bădică, C., Vidaković, M., Ilie, S., Ivanović, M., Vidaković, J.: Role of agent middleware in teaching distributed systems and agent technologies. J. Comput. Inf. Technol. 27(1) (2019). http://cit.fer.hr/index.php/CIT/article/view/4464
2. Becheru, A., Bădică, C.: Online resources for teaching programming to first year students. In: Vlada, M., Albeanu, G., Adascalitei, A., Popovici, M. (eds.) Proceedings of the 11th International Conference on Virtual Learning, pp. 138–144. Bucharest University Press (2016). http://c3.icvl.eu/papers2016/icvl/documente/pdf/section1/section1 paper17.pdf
3. Brusilovsky, P., Malmi, L., Hosseini, R., Guerra, J., Sirkiä, T., Pollari-Malmi, K.: An integrated practice system for learning programming in Python: design and evaluation. Res. Pract. Technol. Enhanc. Learn. 13(1) (2018). https://doi.org/10.1186/s41039-018-0085-9
4. Cass, S.: The Top Programming Languages 2019. IEEE Spectrum, 06 September 2019. https://spectrum.ieee.org/computing/software/the-top-programming-languages-2019
5. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 3rd edn. The MIT Press, Cambridge (2009)
6. Project Euler. https://projecteuler.net/
7. Fangohr, H.: A comparison of C, MATLAB, and Python as teaching languages in engineering. In: Proceedings 4th International Conference on Computational Science – ICCS 2004 (Part IV). Lecture Notes in Computer Science, vol. 3039, pp. 1210–1217. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-25944-2_157
8. Hromkovič, J., Kohn, T., Komm, D., Serafini, G.: Combining the power of Python with the simplicity of Logo for a sustainable computer science education. In: Brodnik, A., Tort, F. (eds.) Informatics in Schools: Improvement of Informatics Knowledge and Perception, ISSEP 2016. Lecture Notes in Computer Science, vol. 9973, pp. 155–166. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46747-4_13
9. Ivanović, M., Budimac, Z., Radovanović, M., Savić, M.: Does the choice of the first programming language influence students' grades? In: Proceedings of the 16th International Conference on Computer Systems and Technologies, CompSysTech 2015, pp. 305–312. ACM (2015). https://doi.org/10.1145/2812428.2812448
10. Ivanović, M., Xinogalos, S., Pitner, T., Savić, M.: Technology enhanced learning in programming courses – international perspective. EAIT 22(6), 2981–3003 (2017). https://doi.org/10.1007/s10639-016-9565-y
11. Joint Task Force on Computing Curricula, Association for Computing Machinery (ACM) and IEEE Computer Society: Computer Science Curricula 2013: Curriculum Guidelines for Undergraduate Degree Programs in Computer Science. ACM and IEEE Computer Society, 20 December 2013. https://doi.org/10.1145/2534860
12. Klimeková, E., Tomcsányiová, M.: Case study on the process of teachers transitioning to teaching programming in Python. In: Informatics in Schools, Fundamentals of Computer Science and Software Engineering, ISSEP 2018. Lecture Notes in Computer Science, vol. 11169, pp. 216–227. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-02750-6_17
13. Martin, R.D., Cai, Q., Garrow, T., Kapahi, C.: QExpy: a Python-3 module to support undergraduate physics laboratories. SoftwareX 10, 100273 (2019). https://doi.org/10.1016/j.softx.2019.100273
14. Python. https://www.python.org/
15. Vergnaud, A., Fasquel, J.-B., Autrique, L.: Python based internet tools in control education. IFAC-PapersOnLine 48(29), 43–48 (2015). https://doi.org/10.1016/j.ifacol.2015.11.211
16. Xinogalos, S., Pitner, T., Ivanović, M., Savić, M.: Students' perspective on the first programming language: C-like or Pascal-like languages? EAIT 23(1), 287–302 (2018). https://doi.org/10.1007/s1063
Validating the Shared Understanding Construction in Computer Supported Collaborative Work in a Problem-Solving Activity

Vanessa Agredo-Delgado1,2, Pablo H. Ruiz1,2, Alicia Mon3, Cesar A. Collazos2, Fernando Moreira4,5(&), and Habib M. Fardoun6

1 Corporación Universitaria Comfacauca - Unicomfacauca, Popayán, Colombia
{vagredo,pruiz}@unicomfacauca.edu.co
2 Universidad del Cauca, Popayán, Colombia
[email protected]
3 Universidad Nacional de La Matanza, Buenos Aires, Argentina
[email protected]
4 REMIT, IJP Universidade Portucalense, Porto, Portugal
[email protected]
5 IEETA, Universidade de Aveiro, Aveiro, Portugal
6 King Abdulaziz University, Jeddah, Saudi Arabia
[email protected]

Abstract. Computer-Supported Collaborative Work is a multidisciplinary research field at the core of our society, forged with difficulties and benefits; however, one of its main problems is that the success of collaboration is difficult to achieve and probably impossible to guarantee or even predict. Given that collaboration is a coordinated, synchronized activity, the result of a continuous attempt to construct and maintain a shared conception of a problem, it can be inferred that for collaboration to occur there must be a shared understanding of the problem being solved. For this reason, the shared understanding of the task is an important determinant of the performance of groups. This paper therefore presents an initial proposal of a process for the construction of shared understanding in a problem-solving activity; specifically, it shows the validation of the feasibility and usefulness of the process in this construction. For the validation, an experiment was carried out with students from two Latin American universities that verified the construction of shared understanding through the proposed process, confirming the experiment's hypotheses about its feasibility and utility; in this way, aspects to improve were discovered, regarding the high cognitive load that the process generates and the need to monitor and assist this shared understanding.

Keywords: Computer supported collaborative work · Shared understanding · Problem-solving activity · Process · Improvement

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 203–214, 2020.
https://doi.org/10.1007/978-3-030-45697-9_20
204 V. Agredo-Delgado et al.

1 Introduction

Computer-Supported Collaborative Work (CSCW) addresses how collaborative activities and their coordination can be supported by means of computer systems [1]. However, working collaboratively is not an easy task; one of its main problems is that collaboration is hard to achieve and probably impossible to guarantee or even predict [2]. Rummel and Spada [3] argue that collaboration does not occur as easily as one may expect; there are factors that affect its achievement [4]. Therefore, to guarantee effective collaboration and consequently improve collaborative work, a deeper approach must be taken to ensure collaboration through the analysis of external factors [5]. That is why CSCW provides individuals and organizations with support for group collaboration and task orientation in distributed or networked settings [6]. Looking for this task orientation, CSCW can also be divided into 3 phases: Pre-Process, Process, and Post-Process [7]. Using these phases, research has been conducted to improve collaboration among group members, but in the learning context [8, 9], and there have also been approaches to improve different aspects of collaborative work [10, 11]. These works have in common that they pay special attention to the processes followed and to the software tools that help communication and interaction among team members, but the critical cognitive aspects that ensure that the team works effectively and efficiently toward a common objective are frequently absent [12]. One of these cognitive processes is shared understanding, whose existence among all involved actors in the collaborative work process is known to be one requisite for its successful implementation [13]. Analyzing the need to improve collaborative work and the benefits that shared understanding brings, we saw the possibility of incorporating it as part of the Process phase, since groups who are engaged must have some knowledge and understanding in common, which functions as a joint reference base for working productively [14]. Shared understanding refers to the degree to which people concur on the interpretation of concepts, share a perspective (mutual agreement), or can act in a coordinated manner [15]. It is also an important determinant of performance and a challenge in heterogeneous groups [14], since group members might be using the same words for different concepts, or different words for the same concepts, without noticing [16]; these differences can interfere with the productivity of collaborative work if they are not clarified early [17]. For this reason, this paper focuses on validating the feasibility and usefulness of an initial proposal of a process for the construction of shared understanding in a problem-solving activity, through the execution of an experiment with students from two Latin American universities, with the purpose of investigating whether, with the proposed process, shared understanding can be constructed in a problem-solving activity.
This paper is structured as follows: Sect. 2 contains the description of the initial process proposal, Sect. 3 describes related work, Sect. 4 describes the experiment, its results, and their analysis, and Sect. 5 presents conclusions and future work.
Validating the Shared Understanding Construction in CSCW 205

2 Process Proposal Description

An initial process proposal is presented that contains phases, stages, activities, and steps that allow executing collaborative work in problem-solving activities and thus achieving shared understanding. For developing the collaboration process, we followed the collaboration engineering design approach [18], which addresses the challenge of designing and deploying collaborative work practices for high-value recurring tasks and transferring them to practitioners to execute without the ongoing support of a professional collaboration expert [16]. To model the process, we use conventions based on the elements proposed by SPEM 2.0 [19].
According to our proposed process, computer-supported collaborative work is divided into 3 phases, Pre-Process, Process, and Post-Process, which were taken from Collazos's work [7] and were improved and adapted to collaborative work. The first phase, Pre-Process, begins with the design and specification of the activity; in the Process phase, the collaboration activity is executed to achieve the objectives based on the interaction among group members. At the end of the activity, in the Post-Process phase, the activity coordinator performs an individual and collective review to verify the achievement of the proposed objective.
For the first, Pre-Process, phase, its activities were updated; each of them was assigned the respective description, the responsible person, and the inputs and outputs of the activity. This paper focuses mainly on the Process phase, since it is here where the collaborative work interactions take place and where we can obtain shared understanding. For this phase, four stages were defined (see Fig. 1), each one with activities, steps, roles, inputs, and outputs.

Fig. 1. Stages of the process phase

The Organization stage aims to be a stage in which the coordinator organizes all the necessary elements to start the activity.
The Shared Understanding stage seeks to get the group members to agree on the problem to be solved in the activity before starting its development. This stage is formed by (see Fig. 2): the Tacit Pre-Understanding activity, which underlies people's ability to understand individually; the Construction activity, which happens when one of the group members inserts meaning by describing the problematic situation, while the fellow teammates actively listen and try to grasp the given explanation; the Collaborative Construction activity, a mutual activity of building meaning by refining, building on, or modifying the original offer; and finally the Constructive Conflict activity, in which the differences of interpretation between group members are treated through arguments and clarifications. These last three activities are based on the group cognition research, from the learning sciences and organizational sciences, of Van den Bossche et al. [15], who examined a model of team learning behaviors that we adapted in our research for use in collaborative work.

Fig. 2. Shared understanding activities

Considering these activities, we defined for each one a series of tasks that allow achieving its objective. The detail of the process is only defined up to the Shared Understanding stage; it is intended, in the next stages of the research, to continue improving, refining, and detailing the whole proposed process, so that later it can be completely validated.

3 Related Work

Research has focused on the measurement of shared understanding, but not on its construction; here are some examples. Smart [20] used a cultural model, where the nodes represent concepts and their linkages reflect the community's beliefs. Rosenman et al. [21] worked with interprofessional emergency medical teams, where they measured shared understanding through team perception and a team leader effectiveness measure. White et al. [22] describe a range of techniques, such as the use of concept maps, relational diagrams, and word association tests. Sieck et al. [23] determined that the similarity of mental models might provide a measure of shared understanding. Bates et al. [24] developed and validated the Patient Knowledge Assessment tool, a questionnaire that measured shared clinical understanding of pediatric cardiology patients.
On the other hand, there are works about collaborative problem solving (CPS), such as the following. Edem [25] examines the occurrences of the target group of CPS activities, as well as individual contributions. Roschelle et al. [26] focus on the processes involved in the collaboration, where they concluded that the students used language and action to overcome impasses in shared understanding and to coordinate their activity. Barron [27] identified 3 dimensions in the interactive processes within the group: the mutuality of exchanges, the achievement of joint attentional engagement, and the alignment of goals. Häkkinen et al. [28] present their pedagogical framework for twenty-first-century learning practices, among which are collaborative problem-solving skills and strategic learning skills. Graesser et al. [29] developed an assessment of students' CPS skills and knowledge, by crossing three major CPS competencies with four problem-solving processes. The CPS competencies are (1) establishing and maintaining shared understanding, (2) taking appropriate action, and (3) establishing and maintaining team organization.

4 Experiment

Context. The experiment was conducted in a university environment in which participated: 45 last-semester students of the Universidad de la Matanza - UM (Argentina), with a high level of experience in the activity topic, to whom the proposed process was applied; and 15 third-year students of the Universidad Nacional de la Plata - UP (Argentina), with an intermediate level of experience in the topic, to whom the proposed process was not applied.
Objective. To inquire about the feasibility and utility of the proposed initial process for the construction of shared understanding in a problem-solving activity. For this, the research question is: how feasible and useful is the proposed process? This study has one unit of analysis, the academic context, where a problem-solving activity about process lines was carried out.
Hypothesis. Considering the objective, the following hypotheses are evaluated:
• The proposed initial process is feasible for the construction of shared understanding in a problem-solving activity.
• The proposed initial process is useful for achieving the objectives of the problem-solving activity.
In order to refine the previous hypotheses, the following specific hypotheses were raised (see Table 1):

Table 1. Specific hypotheses

Feasibility
H1.1. The descriptions made by participants about what they should do in the activity improve with the use of the process.
H1.2. The participants understand and agree with the descriptions, made by their other groupmates, of what should be done in the activity.
H1.3. The use of the process improves the homogeneous understanding within the group and reduces each participant's discrepancy with respect to what they should do in the activity, as defined by the others.
H1.4. The use of the process improves the results of the activities of the Shared Understanding stage.

Utility
H2.1. The use of the process improves the quality of the final results obtained when performing the problem-solving activity.
H2.2. The number of questions asked to the activity coordinator decreases with the use of the process.
H2.3. The use of the process improves the participants' perception of satisfaction with the achievement of the objectives set by the activity carried out.
H2.4. The use of the process improves the participants' perception of satisfaction with the process elements and with the activity outcome.

4.1 Experiment Design


As planning for the experiment development, we have: The Pre-Process phase with
each activity, a duration of 1 h and 25 min, and having as support instruments: A
software tool, which provides a step by step through forms, that finally generate a pdf
with the design and definition of necessary elements. the Process phase with each
activity, a duration of 2 h and 45 min, having as support instruments: Software tool for
group formation, basing heterogeneity in the learning styles, the formats for: to write
the understanding about the problem, to write the questions or disagreements, to
classify the understanding of the other members, to classify their own understanding,
the group writes the under-standing where everyone agrees, to solve the problem and
survey format with 24 questions.
It is important to clarify that, the support instruments used were subjected to several
revisions in which two members of the IDIS research group of the Universidad del
Cauca and a member of the GIS group of the Universidad de la Matanza participated.
In addition to that, we first conducted a focus group with two experts on group work
and collaboration engineering before they should be implemented in practice.
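The grouping tool itself is not described in detail; one common way to obtain heterogeneity over learning styles, sketched here purely as an illustration under our own assumptions (the roster and style labels are invented, not the authors' actual algorithm), is to sort students by style and deal them out round-robin, so that students with the same style land in different groups:

```python
def heterogeneous_groups(students, styles, n_groups):
    """Deal students into n_groups groups, round-robin after sorting
    by learning-style label, so each group mixes styles.
    `students` is a list of names, `styles` a parallel list of labels."""
    # Sorting makes same-style students adjacent; dealing round-robin
    # then scatters them across different groups.
    ordered = [s for _, s in sorted(zip(styles, students))]
    groups = [[] for _ in range(n_groups)]
    for i, student in enumerate(ordered):
        groups[i % n_groups].append(student)
    return groups

# Invented roster: six students, three learning-style labels.
students = ["ana", "ben", "cai", "dia", "eva", "fin"]
styles = ["visual", "visual", "verbal", "verbal", "global", "global"]
```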

4.2 Execution of the Experiment


The collaborative activity that the groups UM and UP should carry out consisted of
solving a process-line problem. In both universities, the activity was developed without
computational support. The time used for applying the proposed process in UM was
3 h 55 min, and for the UP it was 2 h and 40 min.

4.3 Analysis
There are several kinds of results from the experiment. First, the observations made by the researchers: it could be observed that the groups that obtained poor results (in terms of grades) were those that did not perform well in applying the process, did not generate internal discussions to solve doubts, did not appropriate their assigned roles, and did not have the disposition to work in a group. It was also found that purely text-based collaboration is inconvenient for problem-solving tasks; the process should include additional ways of communication among the participants. It was further observed that following the process was exhausting for the participants and that this generated a lack of commitment to the rest of the activity, due to its high cognitive load.
On the other hand, to ensure that the differences in the results found are not only apparent but statistically significant, Student's t-distribution was used [30], which allowed validating the specific hypotheses. Depending on the information to be analyzed, there are three types of test: a) t-test for the means of two paired samples; b) t-test for two samples with equal variances; c) t-test for two samples with unequal variances.
For these t-tests, the values used in the calculations were: reliability level = 95%, significance level = 5%, critical value two-tailed, observations or cases = 9 for t-tests of type a) and 9 (UM) and 3 (UP) for t-tests of types b) and c), and degrees of freedom = 8 for t-tests of type a) and 10 for t-tests of types b) and c).

For the t-tests of types b) and c), it was initially necessary to determine whether the variances of the values were equal or unequal; for this we used the Fisher test [31].
We also consider, for the 3 test types, the following rule for the acceptance or rejection of the null hypothesis:
• If the P-value or F-value <= significance level, the null hypothesis is rejected;
• If the P-value or F-value > significance level, the null hypothesis is accepted.
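The paired case a) can be sketched with the standard statistic. This is a generic illustration under our own assumptions (the sample scores are invented, and this is not the authors' actual analysis script); the constant 2.306 is the well-known two-tailed 5% critical value of Student's t with 8 degrees of freedom, matching the parameters listed above, and comparing |t| against it is equivalent, for that df, to comparing the p-value against the significance level.

```python
import math
from statistics import mean, stdev

# Two-tailed 5% critical value of Student's t with df = 8
# (9 paired observations), as in the type a) tests above.
T_CRIT_DF8 = 2.306

def paired_t(before, after):
    """t statistic for two paired samples:
    t = mean(d) / (sd(d) / sqrt(n)), d being the pairwise differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))

def reject_null(before, after, t_crit=T_CRIT_DF8):
    # Equivalent to "P-value <= significance level" at the same df.
    return abs(paired_t(before, after)) > t_crit

# Invented scores for 9 participants before/after using the process:
before = [10, 12, 11, 13, 12, 11, 10, 12, 13]
after = [12, 15, 13, 16, 14, 14, 12, 15, 15]
```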
Applying the statistical analysis to the values obtained generated the results shown in Table 2.

4.4 Discussion
Statistically, it was verified that the process used improves the participants' individual understanding, improves the group's understanding of the activity, generates a homogeneous understanding of the activity, and does not generate a discrepancy of each participant with respect to the group understanding. In the same way, with the use of the process, the shared understanding activities generated better results and were better fulfilled among the participants. It was also found that the participants had high clarity about, and understanding of, the descriptions of their peers, perhaps because at the beginning everyone may have the same doubts or make the same mistakes. With the final artifact generated, it was validated that the use of the process generates final products with better quality levels. With respect to the questions addressed to the activity coordinator, fewer were generated with the use of the process, since the activities allow resolving internally the greatest number of questions. The process also led to better participant satisfaction with the achievement of the objectives proposed by the activity. Conversely, it cannot be determined that the elements of the process, or the outcomes of the activity, were satisfactory for the participants. Through observation, it was possible to determine that the process generates a high cognitive load before the development of the activity starts, which does not allow the participants to carry out the activity with the necessary interest, since it is a process that contains many steps.

4.5 Threats

Construct Validity: The construction of shared understanding was observed and
measured through the perceptions of the participants, but the constructs underlying
these behaviors remain unknown. To minimize subjectivity, the instruments that
supported information collection underwent validation by expert personnel. Another
threat is the introduction of new conceptual and language elements to the participants
during the activity development; to reduce this threat, an initial activity was assigned
in which participants were contextualized in the activity theme.
Internal Validity: We analyzed the results of applying the guide but not the
communication of the participants. To minimize this threat, the participants operated
the process in the presence of the observer, though without intervention; in addition,
the participants were encouraged to write down their questions and issues. Another
validity threat may be the
210 V. Agredo-Delgado et al.

Table 2. Results for each hypothesis

Variable | Results | Hypothesis accepted
H1.1 Improvement in the descriptions by group | t(9) = 2.31; P = 0.005 | H1.1.2a = There is a statistically significant difference in the average of notes between individual and group descriptions
H1.1 Improvement in the descriptions, UM and UP | F-value = 0.27; t(9.3) = 2.61; P = 0.026 | H1.1.4a = There is a statistically significant difference in the average of notes for group descriptions between UM and UP participants
H1.2 The understanding of other descriptions | 81.6% | H1.2.2a = The perceived percentage of the level of understanding that participants have of the descriptions of other group participants is greater than or equal to 60%
H1.2 The opinion of other descriptions | 73.9% | H1.2.4a = The perceived percentage of the level of opinion that participants have of the descriptions of other group participants is greater than or equal to 60%
H1.3 Improvement in the homogeneous understanding | t(9) = 4.95; P = 0.011 | H1.3.2a = There is a statistically significant difference in the average of results obtained from the homogeneous understanding of the group before and after the use of the proposed process
H1.3 Improvement in the discrepancy | t(9) = 5.20; P = 0.0008 | H1.3.4a = There is a statistically significant difference in the average of results obtained from differences in individual knowledge versus group knowledge, before and after the use of the proposed process
H1.3 Improvement in the homogeneous understanding in UM and UP | F-value = 0.20; t(9.3) = 2.35; P = 0.041 | H1.3.6a = There is a statistically significant difference in the average of results obtained from the homogeneous understanding between the UM and UP groups
Validating the Shared Understanding Construction in CSCW 211

Table 2. (continued)

Variable | Results | Hypothesis accepted
H1.3 Improvement in the discrepancy in UM and UP | F-value = 0.82; t(9.3) = 3.90; P = 0.002 | H1.3.8a = There is a statistically significant difference in the average of results obtained from differences in individual knowledge versus group knowledge, between the UM and UP groups
H1.4 Improvement in the Construction activity | F-value = 0.97; t(9.3) = 2.79; P = 0.019 | H1.4.2a = There is a statistically significant difference in the average of results obtained from the activities of Construction between the UM and UP groups
H1.4 Improvement in the Co-construction activity | F-value = 0.70; t(9.3) = 2.32; P = 0.043 | H1.4.4a = There is a statistically significant difference in the average of results obtained from the activities of Co-construction between the UM and UP groups
H1.4 Improvement in the Constructive conflict activity | F-value = 0.61; t(9.3) = 2.30; P = 0.044 | H1.4.6a = There is a statistically significant difference in the average of results obtained from the activities of Constructive conflict between the UM and UP groups
H2.1 Improvement in the quality of the results | F-value = 0.12; t(9.3) = 2.42; P = 0.036 | H2.1.2a = There is a statistically significant difference in the average of the notes from the results after applying the guide between the UP and UM groups
H2.2 Improvement in the number of questions | F-value = 0.21; t(9.3) = 15.32; P = 0.000000028 | H2.2.2a = There is a statistically significant difference in the number of questions asked to the activity coordinator between the UM and UP groups
H2.3 Improvement in the perception about the achievement of the objectives | F-value = 0.60; t(9.3) = 2.88; P = 0.016 | H2.3.2a = There is a statistically significant difference in the average of results obtained from satisfaction perceived by the participants about the attainment of the objectives between the UM and UP groups
H2.4 Improvement in the perception about the satisfaction with the process elements | F-value = 0.09; t(9.3) = 1.36; P = 0.204 | H2.4.10 = There is no statistically significant difference in the average of results obtained from satisfaction perceived by the participants about process items between the UM and UP groups
H2.4 Improvement in the perception about the satisfaction with the activity outcome | F-value = 0.13; t(9.3) = 0.68; P = 0.514 | H2.4.30 = There is no statistically significant difference in the average of results obtained from satisfaction perceived by the participants about activity outcomes between the UM and UP groups

time invested: the sessions are long, and participants may perceive fatigue in the final
stages, which may influence the results. To mitigate this, participants took a break in
the middle of the experiment, without communication between them.
External Validity: The guide the participants had to follow concerned solving a
problem about process lines, a topic rarely analyzed with university students. We tried
to mitigate this effect by looking for groups with a higher level of experience with the
subject.

5 Conclusions and Future Work

From the experiment, we can conclude that the proposed initial process is feasible for
the construction of shared understanding in a problem-solving activity and is useful for
achieving its objectives. However, it cannot be determined that it improves the
participants' perception of satisfaction with the achievement of the objectives set by
the activity performed, with the process elements, or with the activity outcomes.
In addition, the main contribution to collaboration engineering practice is a process
proposal validated through an experimental research study, which can be used by
designers of collaborative work practices to systematically and repeatedly induce the
development of shared understanding in heterogeneous groups. As shared understanding
has been identified as crucial for collaboration success in heterogeneous groups, the
compound process presented may foster better group processes and better results.
While we used existing measurement items for shared understanding in our survey,
combined with open exploration, a need was revealed for more advanced measurement
instruments that allow all categories of shared understanding to be identified, as well
as for monitoring and assistance mechanisms that allow shared understanding to be
maintained during the development of the activity, since, once achieved, it can also
be lost in the process. In the same way, although the results of this study are stable and

promising, we identify as future work the need for further investigation of the
mechanisms leading to shared understanding, aimed at better understanding this
complex phenomenon, its antecedents, and its effects, thus generating more promising
opportunities for developing techniques that leverage its benefits for effective group
work. In addition, the process should become lighter so that a high cognitive load at
the beginning of the activity is avoided.

References
1. Carstensen, P.H., Schmidt, K.: Computer supported cooperative work: new challenges to
systems design. In: Itoh, K. (ed.) Handbook of Human Factors, pp. 619–636. CiteSeer (1999)
2. Grudin, J.: Why CSCW applications fail: problems in the design and evaluation of
organizational interfaces. In: Proceedings of the 1988 ACM Conference on Computer-
Supported Cooperative Work, pp. 85–93 (1988)
3. Rummel, N., Spada, H.: Learning to collaborate: an instructional approach to promoting
collaborative problem solving in computer-mediated settings. J. Learn. Sci. 14(2), 201–241
(2005)
4. Persico, D., Pozzi, F., Sarti, L.: Design patterns for monitoring and evaluating CSCL
processes. Comput. Hum. Behav. 25(5), 1020–1027 (2009)
5. Scagnoli, N.: Estrategias para motivar el aprendizaje colaborativo en cursos a distancia
(2005)
6. Hughes, J., Randall, D., Shapiro, D.: CSCW: discipline or paradigm? In: Proceedings of the
Second European Conference on Computer-Supported Cooperative Work ECSCW 1991,
pp. 309–323 (1991)
7. Collazos, C.A., Muñoz Arteaga, J., Hernández, Y.: Aprendizaje colaborativo apoyado por
computador, LATIn Project (2014)
8. Agredo Delgado, V., Collazos, C.A., Paderewski, P.: Descripción formal de mecanismos
para evaluar, monitorear y mejorar el proceso de aprendizaje colaborativo en su etapa de
Proceso, Popayán (2016)
9. Agredo Delgado, V., Collazos, C.A., Fardoun, H., Safa, N.: Through monitoring and
evaluation mechanisms of the collaborative learning process. In: Meiselwitz, G., (ed.) Social
Computing and Social Media. Applications and Analytics, pp. 20–31. Springer, Vancouver
(2017)
10. Leeann, K.: A Practical Guide to Collaborative Working. Nicva, Belfast (2012)
11. Barker Scott, B.: Creating a Collaborative Workplace: Amplifying Teamwork in Your
Organization, pp. 1–9, Queen’s University IRC (2017)
12. DeFranco, J.F., Neill, C.J., Clariana, R.B.: A cognitive collaborative model to improve
performance in engineering teams—a study of team outcomes and mental model sharing.
Syst. Eng. 14(3), 267–278 (2011)
13. Oppl, S.: Supporting the collaborative construction of a shared understanding about work
with a guided conceptual modeling technique. Group Decis. Negot. 26(2), 247–283 (2017)
14. Christiane Bittner, E.A., Leimeister, J.M.: Why shared understanding matters–engineering a
collaboration process for shared understanding to improve collaboration effectiveness in
heterogeneous teams. In: 46th Hawaii International Conference on System Sciences
(HICSS), pp. 106–114 (2013)
15. Van den Bossche, P., Gijselaers, W., Segers, M., Woltjer, G., Kirschner, P.: Team learning:
building shared mental models. Instr. Sci. 39(3), 283–301 (2011)

16. de Vreede, G.-J., Briggs, R.O., Massey, A.P.: Collaboration engineering: foundations and
opportunities: editorial to the special issue on the journal of the association of information
systems. J. Assoc. Inf. Syst. 10(3), 7 (2009)
17. Mohammed, S., Ferzandi, L., Hamilton, K.: Metaphor no more: a 15-year review of the team
mental model construct. J. Manag. 36(4), 876–910 (2010)
18. Kolfschoten, G.L., De Vreede, G.-J.: The collaboration engineering approach for designing
collaboration processes. In: International Conference on Collaboration and Technology,
Heidelberg (2007)
19. Ruiz, F., Verdugo, J.: Guía de Uso de SPEM 2 con EPF Composer, Universidad de Castilla-
La Mancha (2008)
20. Smart, P.R.: Understanding and shared understanding in military coalitions, Web & Internet
Science, Southampton (2011)
21. Rosenman, E.D., Dixon, A.J., Webb, J.M., Brolliar, S., Golden, S.J., Jones, K.A., Shah, S.,
Grand, J.A., Kozlowski, S.W., Chao, G.T., Fernandez, R.: A simulation-based approach to
measuring team situational awareness in emergency medicine: a multicenter, observational
study. Acad. Emerg. Med. 25(2), 196–204 (2018)
22. White, R., Gunstone, R.: Probing Understanding. The Falmer Press, London (1992)
23. Sieck, W.R., Rasmussen, L.J., Smart, P.: Cultural network analysis: a cognitive approach to
cultural modeling. In: Network Science for Military Coalition Operations: Information
Exchange and Interaction, pp. 237–255 (2010)
24. Bates, K.E., Bird, G.L., Shea, J.A., Apkon, M., Shaddy, R.E., Metlay, J.P.: A tool to
measure shared clinical understanding following handoffs to help evaluate handoff quality.
J. Hosp. Med. 9(3), 142–147 (2014)
25. Quashigah, E.: Collaborative problem solving activities in natural learning situations: a
process oriented case study of teacher education students, Master’s thesis in Education, Oulu
(2017)
26. Roschelle, J., Teasley, S.D.: The construction of shared knowledge in collaborative problem
solving. In: Computer Supported Collaborative Learning, pp. 69–97, Heidelberg. Springer
(1995)
27. Barron, B.: Achieving coordination in collaborative problem-solving groups. J. Learn. Sci. 9
(4), 403–436 (2000)
28. Häkkinen, P., Järvelä, S., Mäkitalo-Siegl, K., Ahonen, A., Näykki, P., Valtonen, T.:
Preparing teacher-students for twenty-first-century learning practices (PREP 21): a
framework for enhancing collaborative problem-solving and strategic learning skills. Teach.
Teach.: Theory Pract. 23, 25–41 (2017)
29. Graesser, A.C., Foltz, P.W., Rosen, Y., Shaffer, D.W., Forsyth, C., Germany, M.-L.:
Challenges of assessing collaborative problem solving. In: Care, E., Griffin, P., Wilson, M.,
(eds.) Assessment and Teaching of 21st Century Skills, pp. 75–91. Springer (2018)
30. Neave, H.R.: Elementary Statistics Tables. Routledge, London (2002)
31. Freeman, J.V., Julious, S.A.: The analysis of categorical data. Scope 16(1), 18–21 (2007)
Improving Synchrony in Small Group
Asynchronous Online Discussions

Samuli Laato (1, corresponding author) and Mari Murtonen (2)

(1) University of Turku, Turku, Finland, [email protected]
(2) Tampere University, Tampere, Finland, [email protected]

Abstract. Online courses often select asynchronous tools for teamwork, as
these allow temporal freedom for students who might come from different
time zones or have busy schedules. Such solutions work better with larger
groups, where, due to the quantity of participants, it is easier to get replies
faster. In this study, we investigate challenges that arise in asynchronous
discussions with small groups (4–5 participants). Empirical data was collected
from the UNIPS pedagogical employee training online course Becoming a
Teacher and its teamwork period, where Google Docs was used as a discussion
platform by 42 students. We observed that (1) discussion activity peaked
around deadlines, (2) students often came online in vain as their team members
had not replied yet, and (3) when students were online simultaneously, they
were not able to take advantage of this by engaging in synchronous
communication. As solutions, we propose improving the synchrony of the
communication via more structured instructions and increasing the affordances
of the communication tools.

Keywords: Asynchronous discussions · Online courses · Communication tools · Google Docs

1 Introduction

In this study we look at asynchronous discussions in online courses where the
number of participants is small. We observe ten groups of 4–5 students who were
tasked to comment on and discuss each other's essays using Google Docs during a
UNIPS pedagogical online course [19]. The aim of this work is to identify key
issues in such discussions and to propose theory-based solutions that improve the
engagement, participation and learning of students. This paper is structured as
follows: first, relevant work is discussed in the background section. Then, the
research methodology is presented, followed by the results. The paper ends with
a discussion of the findings and ideas for future work.

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 215–224, 2020.
https://doi.org/10.1007/978-3-030-45697-9_21
216 S. Laato and M. Murtonen

2 Background
Historically, synchronous communication required participants to be in the same
place at the same time. When the term was adopted to describe online communication,
the spatial requirement faded away, leaving only the temporal one, as the
internet allows communication over distance. Thus, synchronous online communication
is currently defined as conversations which take place in real time [24], or as
communication in an online setting that requires simultaneous participation [29].
On the flip side of synchronous communication is the asynchronous kind. In western
society, people partake in asynchronous discussions every day: emails, text
messages, voice messages and discussion forums are just some examples of
asynchronous communication. In e-learning and elsewhere, asynchronous discussions
are widely used for their convenience: as participants do not need to be online
at the same time, they can communicate at a time they find convenient [4,14,25].
For many, it has become the preferred choice over synchronous alternatives. For
example, the youth show a trend of preferring messaging over phone calls [3], and
students have been found to rather communicate with faculty asynchronously than
through traditional or virtual office hours [21]. Moreover, before synchronous
meetings can even be held, they are often first agreed upon asynchronously.
Asynchronous discussions are also criticized. They provide less diverse
communication opportunities and lack the psychologically motivating effects of
synchronous discussions, such as social arousal and increased exchange of social
support [14]. Asynchronous discussions have been shown to hinder the outcomes of
cooperation in comparison to synchronous communication [28]. These drawbacks
can mostly be attributed to the root cause that defines asynchronous discussions:
delayed feedback [26]. Immediate feedback has been found to motivate humans
and allow them to take their ideas further [18]. This may be because humans have
limited cognitive capacity: working memory fills with other things as time
progresses, hindering the ability to respond effectively when feedback is delayed [8].
On the other hand, asynchronous messages can be re-read over and over again,
providing the opportunity to meditate on specific parts that require thought.

2.1 Asynchronous Learning in Online Courses


A study by Swan identified three main factors affecting student satisfaction in
online asynchronous discussions: clarity of design, interaction with instructors,
and active discussion among course participants [30]. A more recent study took
a different approach and looked at which of three activities, (1) commenting, (2)
viewing and (3) voting, had the biggest impact on peer learning and performance,
and arrived at the conclusion that viewing had the biggest impact [6]. In light
of these findings, it seems that simply looking at commenting activity, or even
content, does not reveal the whole picture of whether discussions are successful
or not.
Synchronizing Asnychronous Discussions 217

In online learning, an asynchronous discussion group of fewer than 10 students
is considered small [5], and when the discussions are non-mandatory, only a limited
number of students participate in commenting [5], even though more may be
viewing comments [6]. Thus, at least in non-mandatory discussions, increasing the
number of participants leads to an increase in discussion activity [5]. With regard
to interaction with the facilitator, less intervention can lead to more comments made
by the students [1]. Some moderation can, however, be needed in discussions,
especially if participants maintain anonymity, as trolling can emerge and spoil
the discussion [13].
For a team to operate effectively, using only one type of communication
(asynchronous) is typically inadequate. A delicate balance between synchronous and
asynchronous is needed [2,9,23,33], as both have strengths and weaknesses [20].
Scholars including Lynette Watts have also reminded us that there are
technological and time-constraint aspects, among others, which need to be considered
when looking for optimal solutions for student peer communication in online
courses [32]. Some online courses have allowed their students to pick their own
preferred communication tools, but this only works in certain kinds of projects, as
in these cases course facilitators are often unable to follow the group discussions,
which take place on a closed platform out of their reach.

2.2 Issues with the Binary Categorization


Sorting all online communication into asynchronous and synchronous is common
in scholarly work (e.g. [7,14,24,25,27]). Both types of communication have
associated characteristics, which are summarized in the non-exhaustive Table 1.

Table 1. Characteristics of asynchronous and synchronous discussions

Asynchronous discussions | Synchronous discussions
Opportunity to study and re-read [22] | Rapid feedback on actions [26]
Less mundane interaction, more focus [22] | More interaction, more words [22,26]
More meaningful messages [15] | Social support [14,28]

However, it is easy to find counterexamples, or at least examples challenging
these characteristics. The same technologies and the same forms of communication
can be used for both synchronous and asynchronous discussions [29], such as Skype,
WhatsApp, Facebook Messenger and Telegram. Discussions can take place when
people are united in the temporal dimension, but also when they are not. This
can be seen in the messaging culture. When sending letters, it is common etiquette
to begin messages with a greeting and to sign them. However, in instant messaging
the greetings and signing are often omitted, highlighting that it is the same
continuous conversation, not a turn-based exchange of ideas where each message

is counted as its own entity. E-mails are currently undergoing this disruption:
some, perhaps more formal, communication still includes greetings, while
increasingly the greetings are omitted. All this contributes to an increasing blur
between synchronous and asynchronous communication and is a symptom of our
society being "always online".
As the temporal dimension plays a key role in defining whether a form of
communication is synchronous or asynchronous, we observe when participants
engage in discussion during an online course. With this focus we seek to answer
the following research question: What are the key temporal challenges in peer
communication during online courses? Through identifying these issues we are
then able to theorize solutions based on previous work.

3 Methods
For answering the research question, we use data from the UNIPS employee training
pedagogical online course Becoming a Teacher, which took place in autumn 2017.
UNIPS is an open online repository of educational materials which can be
self-studied or completed under the guidance of local universities for certificates
or ECTS credits [17,19]. The course Becoming a Teacher is a micro-credential
course worth one credit (ECTS), and has been shown to change conceptions
of pedagogy, especially for young learners [31]. 42 students who gave permission
to use their discussions for research participated in a two-week teamwork
period where they used Google Docs to comment on each other's essays on how
they see themselves as teachers. Groups of 4–5 students were formed, and all
students were either PhD students or faculty at the university. The teamwork
period contained loose instructions and minimal participation by the facilitator,
and focused on peer interaction. Participants were given three deadlines during
the period: (1) submit your essay and introduce yourself to the others;
(2) write at least three comments on each other's essays and discuss the content
of their essays with them; and (3) reply to all the comments you received and
continue the discussion.
In analyzing the temporal dimension of the discussions, we sought to obtain the
following information:
– How often do participants come online during a two week discussion period?
– Are there students who are unable to discuss and develop their ideas further
because their group members are not online often enough?
– Did the interaction change if two participants were online at the same time?

4 Results
During the two-week asynchronous teamwork period we observed clear spikes
in discussion activity right before deadlines. These spikes can be seen in Fig. 1.
One crucial aspect for the success of asynchronous discussions is that students
are online often enough for discussions to occur; we found this was not the case.
In fact, more than half the students commented the bare minimum, while some
did not do even that. No student managed to comment on more than half of the
days the teamwork period was running. The number of days individual students
came online to comment can be seen below:
– 0–1 days: 3 students
– 2 days: 23 students
– 3 days: 9 students
– 4 days: 7 students
– 5 or more days: 0 students

Fig. 1. Student activity was highest right before or during the deadline dates 5.11
and 9.11.

The median participation rate was two days, with the average number of days a
student came to write comments being 2.45. According to these findings, the
majority of students write their comments and questions on one day in the middle
of the teamwork period and return close to the deadline to reply to the comments
they have received. This indicates most students are unable to produce effective
discussions during the teamwork period, as their teammates are statistically
unlikely to be online often enough.
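The reported average can be recomputed from the distribution above. This is our own sanity-check sketch, not the authors' analysis script; treating the ambiguous "0–1 days" bucket as 1 day per student is an assumption that slightly overestimates the reported 2.45:

```python
# Recomputing the average number of commenting days from the reported
# distribution. The "0-1 days" bucket is treated as 1 day per student
# (an assumption), which gives a slight overestimate of the paper's 2.45.

distribution = {1: 3, 2: 23, 3: 9, 4: 7}  # days online -> number of students

students = sum(distribution.values())
total_days = sum(days * count for days, count in distribution.items())
mean_days = total_days / students

print(students)             # 42 participants, matching the paper
print(round(mean_days, 2))  # 2.48, close to the reported 2.45
```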
Furthermore, we observed situations where student A came online to write
comments and student B replied the next day, as visualized in Fig. 2. Student
B then came online again the following day, but as student A had not yet replied,
this time could not be used for discussion. Cases also occurred where both
students were online at the same time but, due to the nature of the communication
platform, were unable to utilize this simultaneous presence for more direct
synchronous communication.

Fig. 2. The reality of asynchronous online discussions

5 Discussion

5.1 Key Findings


Observing the temporal dimension of asynchronous online discussions revealed
the following issues:

– Discussion activity peaked every time an incremental deadline drew closer.
– Students reserved time to write comments on days when the rest of their
group was yet to reply.
– Even when students were online at the same time, they were not able to harness
this opportunity for more direct, higher-fidelity communication.

To make better use of students' time, the presented data indicates that more
synchronization is needed between students taking part in asynchronous discussions
when the groups are small. An ideal situation to aim for would be one where
students take turns coming online to reply to each other, as visualized in Fig. 3.
But how do we get there?

Fig. 3. A more even distribution of time spent.
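The turn-taking ideal of Fig. 3 can be sketched as a simple round-robin assignment. This is our own illustration, not part of the course design; the group size and period length are taken from the study (4–5 members, a 14-day period):

```python
# Hypothetical round-robin schedule: each day of the teamwork period is
# assigned to one group member, so replies alternate throughout the
# period instead of clustering around deadlines (the Fig. 3 ideal).

def turn_taking_schedule(members, period_days):
    """Map day index -> member whose turn it is to come online and reply."""
    return {day: members[day % len(members)] for day in range(period_days)}

schedule = turn_taking_schedule(["A", "B", "C", "D"], 14)
print(schedule[0], schedule[1], schedule[4])  # A B A: turns repeat every 4 days
```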



5.2 How to Add Synchrony in Asynchronous Communication

Academia has come up with solutions to combat the issues described above,
such as the copyrighted Intelligent Discussion Boards [16] and incremental
deadlines [10]. Increasing the number of participants has also been suggested in the
context of non-mandatory discussions [5]; however, it is unclear what kind of
impact it would have on mandatory communication. Simply forcing students to
come online at specific times defeats the purpose of asynchronous communication,
as one of the reasons projects such as UNIPS choose asynchronous
technologies for their courses is that students are not able to come online at
specific times [19]. The trend of being more and more online [12], and the
influence it can have on asynchronous discussions, is an interesting aspect for
future research.
We notice cases where it is difficult to explicitly define whether certain
communication is synchronous or asynchronous, such as instant messaging, where
people can drift in and out of synchronization constantly. It can be argued that
it is more fruitful to characterize communication based on the delay, or the
possible delay, between exchanges of information, instead of using the binary
categorization. On an online message board, a comment can be replied to immediately,
in two days, or never. To truly synchronize asynchronous discussion, solutions
should be sought where this delay is minimized. This idea can be taken further
by placing different forms of communication on an axis based on how much delay
there is between exchanges of ideas. This axis is displayed in Fig. 4. If the delay
in feedback is used as the sole feature distinguishing asynchronous communication
from synchronous, then we arrive at the conclusion that some activities are
"more synchronous" than others. Thus, we can increase the synchrony
of asynchronous discussions.

Fig. 4. Sorting forms of communication based on the delay in feedback
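The axis of Fig. 4 can be illustrated by ordering media by typical feedback delay. This is our own sketch; the delay values below are invented for illustration and are not taken from the paper:

```python
# Placing forms of communication on a single "synchrony" axis by typical
# feedback delay (values in seconds are rough, invented estimates).

typical_delay_seconds = {
    "face-to-face talk": 1,
    "phone call": 2,
    "instant messaging": 60,
    "discussion forum": 3600,
    "e-mail": 86400,
    "letter": 604800,
}

# The smaller the delay, the "more synchronous" the medium.
axis = sorted(typical_delay_seconds, key=typical_delay_seconds.get)
print(" < ".join(axis))
```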

5.3 Limitations

The empirical data collected in this study came from a specific course in a
geographically limited area and used a specific technology (Google Docs) for
organizing discussions. The instructions and behavior of the course facilitator
influenced the discussion activity. Furthermore, increasing the intrinsic motivation
of participants, for example by giving them a concrete common goal which they had

to achieve and which required cooperation, might have increased the discussion
activity.

With all these limitations in mind, the purpose of the empirical data was to
identify challenges which might arise in purely asynchronous communication. It is
likely that the findings are present in other asynchronous online courses as well.
UNIPS courses have been shown to have a positive impact on students' learning
despite the challenges in the teamwork period [31]. It is thus possible that
participants learn simply by viewing the discussion instead of contributing to it
themselves, as suggested by Chiu and Hew [6].

5.4 Future Work


The findings from this study mostly focus on identifying a problem with small
group asynchronous discussions. The natural follow-up study would be to imple-
ment some of the proposed remedies in similar small group asynchronous discus-
sions and measure the effects it has on student engagement, participation and
learning. In terms of the proposed solutions, one of the interesting aspects is to
shape the used technology to better serve the discussions. In the case of Google
Docs, this could mean adding gamification elements to the mix such as awaring
points for commenting [11] or prompting participants a synchronous commu-
nication option if they happen to be online simultaneously. Furthermore, the
technology could alert students if they have received new comments and remind
them to go reply if they have not done so in a certain time window.
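The reminder idea above can be sketched as a simple check. The function name and the 24-hour window are our assumptions; nothing like this exists in UNIPS or in Google Docs as used in the study:

```python
# Hypothetical reminder check: flag a student who has received a new
# comment but has not replied within an (assumed) 24-hour window.

from datetime import datetime, timedelta

REPLY_WINDOW = timedelta(hours=24)  # assumed reminder threshold

def needs_reminder(last_comment_received, last_reply_written, now):
    """True when a received comment has gone unanswered past the window."""
    if last_reply_written is not None and last_reply_written >= last_comment_received:
        return False  # already replied to the latest comment
    return now - last_comment_received > REPLY_WINDOW

now = datetime(2017, 11, 8, 12, 0)
print(needs_reminder(datetime(2017, 11, 6, 9, 0), None, now))  # True
print(needs_reminder(datetime(2017, 11, 8, 9, 0), None, now))  # False
```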

6 Conclusions
We used empirical data from group discussions during a UNIPS online pedagogical
course to identify three temporal issues in the asynchronous communication
that took place: (1) discussion activity peaked around deadlines, (2) students
reserved time to write comments on days when there was nothing for them
to do, and (3) students were unable to discuss synchronously even when they were
online at the same time. We theorize that these challenges could be mitigated if
participants synchronized their activities better with each other. As a solution,
the actions of the course facilitator, the instructions given to participants and the
chosen communication technologies should be looked into. We also discussed what
follows if activities are observed based on the delay between exchanges of
ideas, and used this to place activities traditionally categorized as asynchronous
or synchronous on a spectrum. Future work will include empirically testing the
effects of the proposed solutions on the quality of the discussions and,
consequently, on students' learning.

References
1. An, H., Shin, S., Lim, K.: The effects of different instructor facilitation approaches
on students’ interactions during asynchronous online discussions. Comput. Educ.
53(3), 749–760 (2009)
Synchronizing Asynchronous Discussions 223

2. Andresen, M.A.: Asynchronous discussion forums: success factors, outcomes,
assessments, and limitations. J. Educ. Technol. Soc. 12(1), 249–257 (2009)
3. Blair, B.L., Fletcher, A.C., Gaskin, E.R.: Cell phone decision making: adolescents’
perceptions of how and why they make the choice to text or call. Youth Soc. 47(3),
395–411 (2015)
4. Branon, R.F., Essex, C.: Synchronous and asynchronous communication tools in
distance education. TechTrends 45(1), 36–36 (2001)
5. Caspi, A., Gorsky, P., Chajut, E.: The influence of group size on nonmandatory
asynchronous instructional discussion groups. Internet High. Educ. 6(3), 227–240
(2003)
6. Chiu, T.K., Hew, T.K.: Factors influencing peer learning and performance in
MOOC asynchronous online discussion forum. Aust. J. Educ. Technol. 34(4)
(2018). https://doi.org/10.14742/ajet.3240
7. Clark, C., Strudler, N., Grove, K.: Comparing asynchronous and synchronous video
vs. text based discussions in an online teacher education course. Online Learn.
19(3), 48–69 (2015)
8. Day, P.N., Holt, P.O., Russell, G.T.: The cognitive effects of delayed visual feed-
back: working memory disruption while driving in virtual environments. In: Inter-
national Conference on Cognitive Technology, pp. 75–82. Springer (2001)
9. Den Otter, A., Emmitt, S.: Exploring effectiveness of team communication: balanc-
ing synchronous and asynchronous communication in design teams. Eng. Constr.
Archit. Manage. 14(5), 408–419 (2007)
10. Dennen, V.P.: From message posting to learning dialogues: factors affecting learner
participation in asynchronous discussion. Dist. Educ. 26(1), 127–148 (2005)
11. Ding, L., Kim, C., Orey, M.: Studies of student engagement in gamified online
discussions. Comput. Educ. 115, 126–142 (2017)
12. Graham, M., Dutton, W.H.: Society and the Internet: How Networks of Information
and Communication are Changing Our Lives. Oxford University Press, Oxford
(2019)
13. Hardaker, C.: Trolling in asynchronous computer-mediated communication: from
user discussions to academic definitions (2010)
14. Hrastinski, S.: Asynchronous and synchronous e-learning. Educ. Quart. 31(4), 51–
55 (2008)
15. Johnson, G.M.: Synchronous and asynchronous text-based CMC in educational
contexts: a review of recent research. TechTrends 50(4), 46–53 (2006)
16. Kay Wijekumar, K., Spielvogel, J.: Intelligent discussion boards: promoting deep
conversations in asynchronous discussion boards through synchronous support.
Campus-Wide Inf. Syst. 23(3), 221–232 (2006)
17. Laato, S., Lipponen, E., Salmento, H., Vilppu, H., Murtonen, M.: Minimizing
the number of dropouts in university pedagogy online courses. In: Proceedings of
the 11th International Conference on Computer Supported Education - Volume
1: CSEDU, pp. 587–596. INSTICC, SciTePress (2019). https://doi.org/10.5220/
0007686005870596
18. Laato, S., Pope, N.: A lightweight co-construction activity for teaching 21st cen-
tury skills at primary schools. In: Proceedings of the 52nd Hawaii International
Conference on System Sciences (2019)
19. Laato, S., Salmento, H., Murtonen, M.: Development of an online learning
platform for university pedagogical studies - case study. In: Proceedings of the
10th International Conference on Computer Supported Education - Volume 2:
CSEDU, pp. 481–488. INSTICC, SciTePress (2018). https://doi.org/10.5220/
0006700804810488
224 S. Laato and M. Murtonen

20. Latchman, H., Salzmann, C., Thottapilly, S., Bouzekri, H.: Hybrid asynchronous
and synchronous learning networks in distance education. In: International Con-
ference on Engineering Education (1998)
21. Li, L., Finley, J., Pitts, J., Guo, R.: Which is a better choice for student-faculty
interaction: synchronous or asynchronous communication? J. Technol. Res. 2, 1
(2011)
22. Mabrito, M.: A study of synchronous versus asynchronous collaboration in an
online business writing class. Am. J. Dist. Educ. 20(2), 93–107 (2006)
23. Madden, L., Jones, G., Childers, G.: Teacher education: modes of communication
within asynchronous and synchronous communication platforms. J. Classr. Inter-
act. 52(2), 16–30 (2017)
24. Murphy, E.: Recognising and promoting collaboration in an online asynchronous
discussion. Br. J. Educ. Technol. 35(4), 421–431 (2004)
25. Murphy, E., Rodríguez-Manzanares, M.A., Barbour, M.: Asynchronous and syn-
chronous online teaching: perspectives of Canadian high school distance education
teachers. Br. J. Educ. Technol. 42(4), 583–591 (2011)
26. Offir, B., Lev, Y., Bezalel, R.: Surface and deep learning processes in distance
education: synchronous versus asynchronous systems. Comput. Educ. 51(3), 1172–
1183 (2008)
27. Oztok, M., Zingaro, D., Brett, C., Hewitt, J.: Exploring asynchronous and syn-
chronous tool use in online courses. Comput. Educ. 60(1), 87–94 (2013)
28. Peterson, A.T., Beymer, P.N., Putnam, R.T.: Synchronous and asynchronous dis-
cussions: effects on cooperation, belonging, and affect. Online Learn. 22(4), 7–25
(2018)
29. Rosenberg, J., Akcaoglu, M., Willet, K.B.S., Greenhalgh, S., Koehler, M.: A tale of
two twitters: synchronous and asynchronous use of the same hashtag. In: Society
for Information Technology and Teacher Education International Conference, pp.
283–286. Association for the Advancement of Computing in Education (AACE)
(2017)
30. Swan, K.: Virtual interaction: design factors affecting student satisfaction and per-
ceived learning in asynchronous online courses. Dist. Educ. 22(2), 306–331 (2001)
31. Vilppu, H., Södervik, I., Postareff, L., et al.: The effect of short online pedagogi-
cal training on university teachers' interpretations of teaching-learning situations.
Instr. Sci. 47, 679–709 (2019). https://doi.org/10.1007/s11251-019-09496-z
32. Watts, L.: Synchronous and asynchronous communication in distance learning: a
review of the literature. Q. Rev. Dist. Educ. 17(1), 23 (2016)
33. Yamagata-Lynch, L.C.: Blending online asynchronous and synchronous learning.
Int. Rev. Res. Open Distrib. Learn. 15(2), 189–212 (2014)
Academic Dishonesty Prevention in E-learning
University System

Daria Bylieva, Victoria Lobatyuk, Sergei Tolpygin,
and Anna Rubtsova

Peter the Great St. Petersburg Polytechnic University (SPbPU),
Saint-Petersburg 195251, Russia
[email protected]

Abstract. This article shows ways of preventing academic dishonesty in
e-learning. We examine this phenomenon as a complex social notion in the
context of the sociology of communication, the Internet, and educational data
mining. The main categories of academic dishonesty are considered in the paper:
cheating, plagiarism and collusion. Cheating is shown in detail because it is
associated with the tests taken during an e-learning course. The authors carried
out a substantial analysis of academic dishonesty in e-learning and its prevention
using more than 50 examples covering the experience of various countries and
higher education systems. The most interesting case describes measures against
the PolyTestHelper extension for Google Chrome. The main conclusion of the study
is the priority of preventive measures. It is obvious that the problem of
cheating cannot be solved purely technically, since a technical war can last
forever. At the present stage of widespread use of information and communication
technologies, a situation of prohibition or significant restrictions in the
e-learning environment may seem contrived to students. False positives of anti-cheat
programs can seriously undermine the trust between students and teachers. The
ideal way to prevent academic dishonesty is to assign tasks that include creative
academic elements. Nevertheless, this approach is not universal, because mass
courses with a regular enrollment of several thousand students require a fully
automatic verification of academic results.

Keywords: Digital academic dishonesty · Academic integrity · Higher
education · E-assessment

1 Introduction

Development of information and communication technologies changes a person's
life in many areas [1–4]. Education is one of the areas in which the changes are
especially substantial [5–9]. E-learning is not just a new learning technology but a new
education paradigm that changes the roles of lecturers and students, which is why
new skills and competences are required from students. One of the essential factors
of e-learning is the individualization of education by means of a virtual education
environment. The student interacts with the learning interface, which is located
outside the university, in the familiar Internet space with reduced social control.

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 225–234, 2020.
https://doi.org/10.1007/978-3-030-45697-9_22

The problem of cheating, common in the online environment, remains unresolved.
Some studies confirm that academic dishonesty in e-learning is higher than in face-to-
face training. Watson and Sottile examined students of various specialties and noted
that in all cases they were significantly more likely to obtain answers from other students
during an online test or quiz [10]. However, others, on the contrary, indicate a lower
level of cheating compared to face-to-face learning [11] or no difference in the
indicators [12]. We may assume that the level of academic dishonesty is influenced
more by other factors than by the digital or traditional form of study.
It should be noted that, unless special measures are taken to prevent it, this simple
form of academic dishonesty presents a particular temptation for students in online
classes. For instance, King, Guyette, and Piotrowski found that 73.6% of
surveyed students considered it easier to cheat online [13].
The decline of studying in the library and the intensive use of online resources have made
the "copy-paste" operation the most common activity when students do assignments.
They do it without realizing that plagiarism is a violation of academic regulations and
copyright. Blau and Eshet-Alkalai's study indicates that school students perceive
digital plagiarism and digital facilitation as legitimate behaviors [14]. Moreover,
digital dishonesty entails less punishment: an analysis of Disciplinary
Committee protocols over a four-year period in Israel shows that it is perceived as less
harmful and therefore receives lighter penalties [15]. Researchers note that academic
dishonesty is being 'normalised' as students rationalise some levels of academic dis-
honesty [16, 17].

2 Types and Technologies of Academic Dishonesty in E-learning

In its most general form, academic dishonesty is traditionally divided into three cate-
gories: cheating, plagiarism, and collusion [18, 19]. Although collusion may resemble
quite acceptable collaboration, such as an exchange of ideas, it implies joint actions
connected with an active, intentional and obvious act of cheating [19]. Pavela identifies
four main types of academic dishonesty:
– cheating – using learning materials, information, or other aids whose use was
explicitly banned;
– plagiarism – using content prepared by others and presenting it as one's own,
without giving a reference to the source;
– fabrication – inventing or citing non-existent information;
– facilitating academic dishonesty – intentionally helping someone else perpetrate
academic dishonesty [20].
Researchers indicate that plagiarism and helping others conduct dishonest
activities are perceived as more legitimate in the digital setting [16].
We will study cheating during tests and examinations within online courses
in more detail. The most striking example of misconduct is impersonation, when a
control task is passed by another person instead of the student. In some situations,
students do assignments together even though they are individual tasks, sometimes
involving an "expert". In the case of individual tests, using ready-made answers is
cheating. Students may get these answers in different ways: directly, from those who
have already passed the test (in the current period, or in a previous one if the test has
not changed much), or indirectly, by technical means, when a database is collected,
for example, through shared Google tables, social networks, etc. All the data is then
collected in a form convenient for searching for the right answer and placed on the
network by an initiator. However, there may be even simpler ways of passing a test
that do not require even such easy intellectual activity as finding and choosing the
right question and answer. At St. Petersburg Polytechnic University, students
developed PolyTestHelper, a special extension for Google Chrome that operates on a
specific list of sites containing the tests used to control students' knowledge. The
extension "highlights" the correct answers in such tests: the text of the question is
sent to the add-on's server, and the answer most frequently chosen by students is
received from the server. Thus, the add-on autonomously collects a database of
questions and answers with weights - the number of times each answer to a specific
question was selected. Correct answers are highlighted on the test page (<<<) (Fig. 1)
or the answer text is entered automatically. In this case, the student need not bother
with finding an answer in a database or on the network, or even with reading
the question.
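The crowd-sourced answer-weighting mechanism described above can be sketched as follows. This is a hypothetical illustration of the threat model, not PolyTestHelper's actual code; the class and method names are invented:

```python
# Sketch of the answer-weighting mechanism described above: the server
# counts how often each answer to a question was selected and reports
# the most frequent one. Illustrative only, not the extension's code.
from collections import defaultdict

class AnswerBase:
    def __init__(self):
        # question text -> answer text -> number of times selected
        self.weights = defaultdict(lambda: defaultdict(int))

    def record(self, question, answer):
        """Register one student's selected answer for a question."""
        self.weights[question][answer] += 1

    def most_frequent(self, question):
        """Return the answer with the highest weight, or None if unknown."""
        answers = self.weights.get(question)
        if not answers:
            return None
        return max(answers, key=answers.get)

db = AnswerBase()
for ans in ["B", "A", "B", "B", "C"]:
    db.record("Q1", ans)
print(db.most_frequent("Q1"))
```

The key point for prevention is that such a database converges on the majority answer only when many students feed it, which is why randomizing and regularly renewing question banks weakens it.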

Fig. 1. Illustration of highlighting the correct answer with the PolyTestHelper extension
(original and translation)

Plagiarism in the university environment can be divided into two main types:
presenting someone else's results as one's own, or "ghost writing" (papers purchased,
downloaded from the network, or taken from other students), and "classical plagiarism"
(using sources without references). The common university practice of checking the
originality of students' papers with special services does not solve the problem,
since ways of altering a text so that it is perceived as original are constantly
renewed. For example, by inserting invisible (white-coloured) additional letters
into words, it was possible to "trick" the anti-plagiarism service until 2016. Russian
letters in a text were successfully replaced with Latin letters with the same spelling.
When the anti-plagiarism system was programmed to track this replacement, students
began to use similar Greek or Arabic letters. None of these methods works now. Today

more sophisticated techniques of working with the text are used. To get a text similar in
meaning to an existing one but not recognisable by the programs, students can translate
the text into another language and then translate it back into the original. A more
complicated method involves creative processing of the text using synonyms.
Automating this process with synonymizers is not yet effective enough,
as the readability of the text decreases. In addition, there is the possibility of inserting
neutral adverbs, adjectives, interjections and prepositions into the text, using automatic
hyphenation, etc., as well as taking into account the shingle rule used for verification.
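The letter-substitution trick described above can be countered with a simple script-mixing check. The sketch below is an illustrative assumption, not the detection logic of any particular anti-plagiarism service; it flags words that combine Cyrillic letters with visually identical Latin ones:

```python
# Minimal sketch (not any service's actual detector): flag words that mix
# Cyrillic and visually identical Latin letters, the classic trick for
# evading anti-plagiarism matching described above.
import unicodedata

# Latin letters whose glyphs coincide with Cyrillic ones. Assumption: a
# small illustrative subset, not an exhaustive confusables table.
HOMOGLYPHS = set("aceopxyABCEHKMOPTX")

def suspicious_words(text):
    """Return words that mix Cyrillic with look-alike Latin letters."""
    flagged = []
    for word in text.split():
        # Unicode character names start with the script name, e.g.
        # "CYRILLIC SMALL LETTER ES" or "LATIN SMALL LETTER O".
        scripts = {unicodedata.name(ch, "").split()[0]
                   for ch in word if ch.isalpha()}
        if {"LATIN", "CYRILLIC"} <= scripts and any(ch in HOMOGLYPHS for ch in word):
            flagged.append(word)
    return flagged

mixed = "сл\u006fво"  # Cyrillic word with a Latin "o" (U+006F) inserted
print(suspicious_words("проверка " + mixed + " текста"))
```

A production system would use the full Unicode confusables data rather than this hand-picked subset, but the principle — a single word should not mix scripts — is the same.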

3 Methods of Academic Dishonesty Prevention

The problem of academic dishonesty prevention remains unresolved. Du Plessis states
that institutions should use quality assurance mechanisms to protect their academic
integrity [21]. Other researchers believe that the problem of academic dishonesty
can only be solved if people want to change, and that this change speaks as much to
notions of scholarship as it does to ethical scholarship [22].
It is obvious that the problem of cheating cannot be solved purely technically. The
technical war between those who come up with ways to cheat and those who prevent it can
last a very long time. In the absence of simple remedies, prevention of academic
dishonesty appears to be the pragmatic strategy for the university.
Leaving aside the ethical side of the issue (the moral standards of the university, the
policy of the teacher, the degree of familiarity, and students' acceptance of these
standards), prevention of academic dishonesty consists of two aspects:
– creating course assignments in such a way as to minimize the possibility of
academic misconduct;
– imposing technical restrictions that block or impede academic dishonesty.
The forms and "weight coefficients" of the various activities that constitute academic
progress during an online course can significantly affect the level of academic
dishonesty. Some types of control require much more ingenuity for illegal behavior (for
example, online discussions). The ideal way to overcome academic dishonesty for all
types of control is to design highly individualized tasks without prepared answers or
texts, so that it is not enough to have some information: the student has to transform
that information creatively [23]. In this case, producing the correct answer is more
difficult than simply reusing information. In general, given today's massive inclusion
of information and communication technologies in human life, a situation of
prohibition or significant restriction in the e-learning environment may seem to
students far-fetched and irrelevant to the real situation. Some teachers who took part
in a study in Turkey and Bulgaria state that they do not use control tasks for
homework and prefer oral, in-person exams [24].
Other researchers, in contrast, accentuate the need for a complete transition to
e-assessment as the form corresponding to the realistic technological context [25].
Chuchalin notes that one of the critical factors of the digital era is the ability to
quickly and efficiently process and analyze information for decision-making under
uncertainty [26]. It seems that ultimately, high-level university courses will offer

assignments for which students' ability to use electronic resources in the exam
will not be an obstacle to be painfully overcome but a brilliant opportunity for
implementing an innovative educational strategy. The exam in the digital environment
should become more "dynamic, interactive, immersive, intelligent, authentic and
ubiquitous" [27]. Today, there are examples of Open-Book, Open-Web exams, where
the student is offered a contemporary real-world problem, submitted as a mini-
case, that requires applying the skills, techniques and knowledge of the field
concerned [28].
However, such a solution is rarely used for monitoring mass online courses.
In the vast majority of cases such tasks exclude the possibility of automatic electronic
verification and require a lot of time for interaction with students, time that tutors
and facilitators of online courses usually do not have.
At the same time, traditional countermeasures such as randomly selecting questions from
a database and shuffling the answers in multiple-choice tests are easily defeated by
students using the methods described in the previous section.
Mass courses with a semester enrollment of several thousand students are espe-
cially vulnerable to "uncovering" of the question base. Creating such a course that is
absolutely invulnerable to students searching for the answers together seems today to
be a technically difficult task.
New forms of cheating require strategic decisions from educational institutions. In
order to neutralize the PolyTestHelper Google Chrome extension, the Centre of Open
Education of St. Petersburg Polytechnic University took the following steps:
– A web application was developed for the distance learning portals that blocks page
modifications such as writing in the correct answer. The launch of this application
made it possible to identify another class of "add-ons" useful to students - online
translators (Google Translate, Yandex Translator), which can also be run while
taking a test in a foreign language to translate the tasks and simplify the test.
– Metrics of abnormal student behavior during the course were developed:
copying and pasting from the clipboard, switching between tabs while taking the
test, and launching add-ons for automatically passing the test or online translators,
with this information collected on a local analytics server similar to Yandex.Metrica
or Google Analytics.
– Information is collected about how much time the student spent on the test and on a
specific element of the course (with the browser tab active, not just data from the log),
in order to block the test if the previous material was not viewed for a certain time.
– An analytics system was developed on the basis of the log of the student's actions
in the course: analysis of the learning trajectory of a particular student (the topics
he or she opened depending on time, etc.).
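The abnormal-behavior metrics listed above could be aggregated per test attempt roughly as follows. The event names, log format and returned fields are illustrative assumptions, not the actual SPbPU analytics schema:

```python
# Hypothetical sketch of the abnormal-behavior metrics described above:
# aggregate logged browser events for one test attempt into counters an
# analytics backend could flag for review. Event names are assumptions.
from collections import Counter

SUSPICIOUS_EVENTS = {"paste_from_clipboard", "tab_switch", "addon_detected"}

def attempt_metrics(events):
    """events: list of (event_name, timestamp_seconds) for one attempt."""
    counts = Counter(name for name, _ in events if name in SUSPICIOUS_EVENTS)
    times = [t for _, t in events]
    duration = max(times) - min(times) if times else 0
    return {
        "duration_s": duration,                       # time spent on the test
        "paste_count": counts["paste_from_clipboard"],
        "tab_switches": counts["tab_switch"],
        "addon_detected": counts["addon_detected"] > 0,
    }

log = [("test_started", 0), ("tab_switch", 40), ("paste_from_clipboard", 55),
       ("tab_switch", 70), ("answer_submitted", 90)]
print(attempt_metrics(log))
```

In practice such counters would feed into thresholds or a review queue rather than trigger automatic penalties, consistent with the paper's warning about false positives.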
Invasive technologies can also be used to prevent academic dishonesty, such as
blocking the use of other software, or blocking the IP address if the classroom has a
common external IP address. In safe browser mode, the test is configured so that it can
only be taken in a specific browser with a specific key indicated in the browser settings
(https://docs.moodle.org/37/en/Safe_exam_browser). This solves the problem of one
student starting the attempt in class (and then interrupting it) while a second student
on campus, under the credentials of the first, continues the attempt outside the class
and sends the result to the server (Fig. 2).

[Figure 2 is a diagram pairing the main types of academic dishonesty with prevention
methods: plagiarism (use of sources without references; "ghost writing") is countered
by anti-plagiarism systems and creative individual tasks involving the use of
information; cheating (impersonation; use of unauthorized materials or ready-made
answers) is countered by verification/identification, technical methods making
academic dishonesty difficult, invasive technologies (blocking the use of other
software and page changes), and behavior analysis (abnormal behavior metrics;
analysis of student behavior during the course by logs and computer activity on the
screen).]
Fig. 2. Scheme of the main types of academic dishonesty and methods of prevention

Berry, Thornton and Baker suggest deterring digital cheating by using sites like
turn-it-in.com and submitting the results together with the assignment; software like
LAN School, which shows all computer activity and can determine whether digital
cheating is occurring, is a major deterrent to online cheating in the classroom [29].
Peter the Great St. Petersburg Polytechnic University (SPbPU) uses the Danware
Netop School program, which makes it possible to monitor the desktops of students
in one or several classes, connect to a specific screen, and drive the mouse instead
of the student.
However, the ability to fully control all the activity on the computer
screen does not guarantee that the student takes the test independently. Leaving aside
the possibility of another person substituting for the test-taker or of taking the test
together, each student usually has more than one device connected to the Internet.
When students become highly aware of the "tracking", they take the test on a
computer screen and at the same time get information from the database on a
smartphone or tablet screen.
Technical complication of the exam process eliminates the possibility of
impersonation and significantly reduces the possibility of using extra information
sources: for example, visual identification via webcam, or secure remote proctoring
software using uni-modal or bimodal biometric verification, ranging from a common
webcam and microphone up to special devices such as digital fingerprint readers or
high-definition cameras for iris recognition. In addition, to control students' actions,
web cameras can be used to view the workplace and the entire room through 360°
before the test starts, and to monitor the position of the hands and the direction of
the eyes during the whole test.
Proctoring brings the online exam close to traditional classroom testing (regardless
of whether it is electronic or happening in the classroom), with all the usual methods
of cheating, both digital (communication tools) and traditional (cheat sheets, tips,
etc.). SPbPU uses offline proctoring on the national portal Openedu.ru, with manual
activation of student sessions when the test is taken. The teacher approves the launch
of the test for the students who are in the classroom and rejects attempts by students
who are at home. This makes it possible to do without long lists of rules and
restrictions when a test is taken by a large number of groups over a long session time.
Companies issuing professional certificates have long faced the need for strict
control over the passing of exams. At Pearson VUE tests there is the following set of
preventive measures: (1) the student is identified by two documents; (2) the student
empties his or her pockets before the exam; (3) a camera is placed above the student
and monitors the hands (which should be above the table) and knees (where there
should be no cheat sheets); (4) a test-center employee monitors the cameras, and both
the student's behavior and the operator's activity are recorded for VUE; (5) the
student takes the test on a PC on which there is only the testing software and no way
to install other software. At the same time, it is obvious that measures acceptable for
one-time testing at a certification center can be destructive for relations in a
university environment, which imply mutual respect and trust, and they contribute to
a significant increase in stress levels.
The TeSLA project (an Adaptive Trust-based e-assessment System for Learning),
designed specifically for the academic environment, uses a system for student
authentication and authorship checking integrated within the institutional Virtual
Learning Environment [24, 30]. A plug-in embedded in the learning management
system (for example, Moodle) and directly integrated into the most used assessment
activities, such as assignment, forum and quiz, helps the teacher choose the
authentication instruments appropriate for a particular case - Face Recognition, Voice
Recognition and Keystroke Dynamics (for typing rhythm) - and check authorship
through Forensic Analysis (for writing style) and Plagiarism Detection. However,
while even Face and Voice Recognition are not completely reliable today, plug-in
innovations like Keystroke Dynamics and Forensic Analysis raise the question of
how much one can rely on technical means for detecting cheating. False positives of
anti-cheating programs can seriously undermine the trust that is important for
relations between students and teachers. Today, formal reliance on anti-plagiarism
indicators sometimes leads to disappointing consequences, when smart cheaters get
the highest score and independently completed work is rejected.
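Keystroke Dynamics, mentioned above, compares a student's typing rhythm against an enrolled profile. The sketch below shows the common underlying features (dwell and flight times); it is an illustration of the general technique, not the TeSLA plug-in's implementation, and the event format is an assumption:

```python
# Illustrative sketch of keystroke-dynamics features: per-key dwell times
# (how long a key is held down) and flight times (gap between releasing
# one key and pressing the next). Typing-rhythm verification systems
# commonly compare such statistics against an enrolled user profile.
def keystroke_features(events):
    """events: list of (key, press_time_ms, release_time_ms), in typed order."""
    dwell = [release - press for _, press, release in events]
    flight = [events[i + 1][1] - events[i][2] for i in range(len(events) - 1)]
    mean = lambda xs: sum(xs) / len(xs) if xs else 0.0
    return {"mean_dwell_ms": mean(dwell), "mean_flight_ms": mean(flight)}

# Hypothetical capture of a student typing "pass"
sample = [("p", 0, 90), ("a", 150, 230), ("s", 300, 405), ("s", 470, 560)]
print(keystroke_features(sample))
```

A real verifier would compare many such features (per-digraph timings, variances) to an enrolled profile with a distance threshold, which is exactly where the false-positive risk discussed above arises.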

4 Conclusion and Discussion

Nowadays e-learning is well developed, and e-educational technologies will
obviously become even more widespread. This should be taken into account now in
order to combine the objective need of educational organizations to increase the
effectiveness of new educational technologies and methods, including blended
learning and online courses, with minimizing negative consequences such as
academic dishonesty. It is worth noting that e-learning challenges students to develop
self-organization, maturity, ethical principles and other personal qualities. When
studying in the digital environment without the physical presence of the teacher, it is
very difficult to resist the temptation to use some form of academic dishonesty. How
can one deal with this phenomenon, and is it possible to minimize it? Is it worth
arranging total surveillance of students during e-learning? What methods are
appropriate and ethical in monitoring academic progress?
At first glance, the simplest solution to this problem is not technological control but
creativity in learning. Creative tasks should not have typical answers: it is not enough
to have some knowledge, the student must produce an original educational product.
However, in many e-learning courses with thousands of students, varied creative
assignments practically cannot pass through the personal control of a teacher, and in
this case using algorithms is really necessary. When applying such mass courses in
universities, the control functions should be planned, which is why it is important to
design their stages and content. Is it necessary for a student to feel strict punitive
measures and be afraid of them? Compare learning with road traffic: each participant
knows the traffic rules and is aware of the cameras monitoring them, but this
background process does not interfere with the main one. This is exactly what should
happen in the prevention of academic dishonesty: the control process should not
prevail over the educational process. The main technical methods of prevention
include standardized verification/identification, invasive technologies, blocking of
extraneous programs and applications, anti-plagiarism systems, background
monitoring of behavior, logs and on-screen computer activity, and metrics of
abnormal behavior. It is also necessary to constantly monitor innovations in cheating
methods and applications. It is important to introduce ethical standards and to
promote the norm of completing tasks independently, which benefits the student
himself.

Curriculum for Digital Culture at ITMO
University

Elena Mikhailova(&), Anton Boitsev, Olga Egorova,
Natalia Grafeeva, Aleksei Romanov, and Dmitriy Volchek

ITMO University, St. Petersburg, Russia
[email protected]

Abstract. In many areas of human activity, both personal and professional,
new digital tasks emerge every day. The new educational module “Digital
Culture” at ITMO University aims at building the digital competencies of
university students, both bachelor's and master's: to teach students the existing
approaches and methods of solving tasks, and to show them data processing and
analysis techniques to apply in the future in their professional fields. Apart from
conducting data analysis, graduates are expected to know how to interpret the
obtained results correctly and how to navigate the digital space with ease. The
disciplines of the first two years of study have already been introduced into the
educational process of ITMO University in both bachelor's and master's degree
programmes. This paper presents the curriculum content, the learning method
and the results of the first years of the educational module. The disciplines of
the module have received positive feedback from both students and external
experts at intercollegiate seminars on digital education.

Keywords: Digital culture · Blended learning · E-learning · Data science ·
Statistical learning · Machine learning · Artificial intelligence

1 Introduction

1.1 Digital Economy


Until recently, companies managed the traditional types of assets: physical and
intellectual property, and money. However, the growing dominance of the service
sector over production has meant that information technologies began to play a key
role in the economy, simplifying a person's access to any service. The emergence of
the Internet and ever easier Internet access instigated a digital revolution that has
unleashed changes both in the life of the individual and in the economy as a whole.
The concept of the digital economy appeared at the end of the 20th century in the
USA. In 1995 Nicholas Negroponte, director of the MIT Media Lab, formulated the
concept of the digital economy [6], in which the production, processing, storage,
transfer and use of constantly growing volumes of digital data become the dominating
factors. Information volumes increase every day; data processing therefore helps
to create new social services and to implement innovations in the production and
management spheres. Big data has become a growth driver and a new resource in the
economy. The popular phrase “data is the new oil”, coined by Clive Humby, the
British mathematician and co-founder of dunnhumby, the analytics firm behind
Tesco's Clubcard, means that data is a raw material for economic growth. However,
just like oil, raw data needs to be processed before further use.

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 235–244, 2020.
https://doi.org/10.1007/978-3-030-45697-9_23
Information technologies have penetrated all spheres of modern human activity:
production, science, politics, commerce, everyday life, communications and culture.
Our future, and already our present, is the Internet of things, blockchain technology
and distributed networking, production automation and the robot economy, smart
houses and computer vision. Nowadays information technologies mean more than just
using a computer to perform tasks that were traditionally done manually. Organizations
and individuals aim to execute their tasks better, faster and often differently
than in the past.
The digitalization of society is causing changes in the labour market as well. By the
year 2020, two million jobs will be added to the global market, but at the same time
about seven million jobs will disappear. Jobs will open in the intellectual and
high-tech fields and will be cut in the real economy and administrative sectors.
By 2020 Big Data technologies will increase employment, e.g. in the field of
mathematics and computer technology by 4.6% per year, in management by 1.3%, and
in sales by 1.25%, but will reduce the number of office staff workplaces by 6.06% [1].
The Russian government has adopted the programme “Digital Economy in the
Russian Federation”, aimed at forming a full-fledged digital environment in Russia
[9]. Building a new economic structure based on the digital economy places new
demands on the vocational education system. Future specialists require the skills to
apply, and moreover to develop, modern and secure digital technologies and
platform solutions in the most important sectors of the economy and in the social
sphere.

1.2 Digital Culture


The process of studying basic vocational educational programmes in higher school
can be considered as the mastering of culture in a broad sense [8].
Culture includes the following components:
• the objective results of human activity (the results of physical activity, such as
various devices, systems and technologies; and the results of cognition, namely
books, knowledge bases, norms of law, etc.). These results can be considered the
first, objective component of culture;
• the subjective results and abilities of students, expressed in knowledge, skills,
abilities, competences, etc., which form the second, subjective component of culture.
The subjective component of culture is formed during the learning process on the
basis of the objective one, primarily on the basis of social knowledge that is recognized
as relevant and significant for the social and economic development of society.
Traditionally, in higher education the forms of social consciousness (knowledge) include
such components of culture as language, science, morality, philosophy, law, art, etc. [8].

On the one hand, digital culture determines the content of a student's education or
training: knowledge, software, frameworks and processing techniques. On the other
hand, it determines the requirements for the results of that education, namely the
acquired skills, experience, mindset and competencies that are necessary for the social
and professional activities of graduates in the information society and digital economy.
Digital culture is a phenomenon that largely determines lifestyle, motivation,
ways and forms of communication, and human behavior. A professional with the
necessary skills in the field of digital culture knows how to use the tools offered by
modern information technologies, even without having specialized in IT. Moreover,
digital culture implies that a person observes the so-called digital ethics.
The concept of digital culture is broader than that of digital literacy [7]: unlike
digital literacy, it includes a worldview as a component that regulates the most
important types of life activity and human behavior on the basis of beliefs and values.

1.3 The Structure of Disciplines’ Cluster “Digital Culture”


In order to correspond to modern trends in education and to meet the
requirements of the digital economy, a cluster of disciplines named “Digital Culture”
has been developed at ITMO University. It aims at building not only the universal
competences in the field of digital culture that every person needs, but also
professional competences, both general and industry-specific, with regard to the
students' chosen subject areas.
These subject areas can be very different. Not only software developers but also
their managers need some knowledge of information technologies, to set objectives
and task goals correctly and to provide developers with a correct technical
specification. Statistical analysis and big data processing skills are important in any job
related to the economy. Communication professionals are required to work with
graphic editors, content management systems, HTML markup and other digital tools.
Philologists need to be able to analyze texts and speech; bioinformatics engineers need
to process DNA sequences. Taking into account students' different backgrounds and
initial levels of IT experience, the disciplines' cluster aims not at teaching everyone
to code, but at providing students with the skills and knowledge that will help them in
the future to set tasks in their professional activity and to interpret the results correctly.
The skills and competencies are built by means of both the content of the disciplines
and the study technology. Since the number of students at ITMO University
exceeds 3000 per year, a blended learning method has been chosen. In the
context of digital society development, blended learning is considered an
educational approach that combines offline and online learning, with the online
component taking up most of the time [10]. Blended learning is based not only on the
self-study of online courses, but also on the contact work of a student and the
teacher, as well as on the collaboration of students [5]. This approach makes it
possible to use the advantages of both offline and online learning and to correct the
shortcomings of each [3, 4, 11]. The concept of blended learning implies the ability
to change the ratio of online and offline interactions to make the learning process
most effective and to create more comfortable conditions for the interacting parties [2].

Bachelor’s students study the disciplines of the module for 6 terms, while for
master’s students the module is shorter and takes 2 terms. The structure of the disci-
plines cluster divided by terms for bachelor’s degree programme is shown in Fig. 1.

Fig. 1. The structure of the disciplines' cluster “Digital Culture” for the bachelor's degree
programme. Obligatory courses are marked with a solid line, elective courses (i.e. students must
select one course each term) with a dashed line. The term numbers are shown in the right
column.

Bachelor’s begin their studying with the discipline “Introduction to Digital culture”
consisting of three sections. In the first “fundamental” section, the following basic
concepts are revealed: computer and operating system architecture, coding technolo-
gies, network technologies, information security, Internet and web technologies.
The second section is devoted to the personal information and the interaction of a
human with digital technologies. The questions of digital ethics, Internet communi-
cations culture, personal security and blockchain technologies are considered: how to
communicate effectively with other users and organizations, how to present informa-
tion about yourself correctly, what data is public and what is private, how to ensure
information security, what legislation exists in the field of data management in Russia
and other countries.

The lectures in the third, overview section are devoted to modern achievements
in the field of information technology, such as virtual, augmented and
mixed reality, quantum technology, digital humanities, social networks and
bibliographic retrieval.
The next three terms of the bachelor's programme are devoted to the disciplines
forming the core of the cluster. They develop universal competences of data processing.
We need to know how to deal with the great amounts of information we face every day;
it is easy to get lost in the data flow without the skills to structure and visualize data. The
goal of the first core discipline, “Data storage and processing”, is to show practical
techniques and methods for storing, processing and analyzing large amounts of data. The
practical assignments of this discipline can be completed using MS Excel, coding (not
necessarily), and relational and NoSQL database management systems. The discipline
starts with a description of data sources and data types, types and harmonization of
measurement scales, data cleaning and normalization methods, and time series
smoothing, then proceeds to data visualization types. Later, structured and unstructured
data storage and processing by means of relational and NoSQL database management
systems are discussed, including table processing and query building and optimization.
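The time series smoothing mentioned above can be illustrated with a minimal sketch; the function name and the data are hypothetical, and the discipline itself allows students to use MS Excel rather than code:

```python
# Illustrative sketch (not ITMO course material): a simple moving-average
# smoother of the kind covered under "time series smoothing".
def moving_average(series, window=3):
    """Smooth a numeric series with a simple moving average of the given window."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    smoothed = []
    # slide the window over the series and average each slice
    for i in range(len(series) - window + 1):
        smoothed.append(sum(series[i:i + window]) / window)
    return smoothed

noisy = [10, 12, 9, 14, 11, 13, 10]
print(moving_average(noisy, window=3))
```

A wider window smooths more aggressively at the cost of a shorter output series; the same trade-off applies in spreadsheet implementations.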
The second discipline is called “Applied statistics”. It explains the basic concepts of
statistics and applied statistical techniques in simple words. The goal of this discipline is
to teach students how to apply statistical methods to their vocational tasks and
challenges, and how to interpret the obtained results correctly. Among the topics
discussed are point and interval data estimates, sample and distribution characteristics,
hypothesis testing and goodness-of-fit criteria.
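As an illustration of the interval estimates the discipline covers, here is a hedged sketch (not taken from the course materials) using only the Python standard library; the normal approximation and the sample values are assumptions:

```python
# Sketch: a 95% confidence interval for a sample mean via the normal
# approximation (reasonable for large-ish samples).
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Return (low, high) bounds of a normal-approximation CI for the mean."""
    m = mean(sample)
    # standard error of the mean
    se = stdev(sample) / len(sample) ** 0.5
    # two-sided critical value from the standard normal distribution
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    return m - z * se, m + z * se

low, high = mean_confidence_interval([4.9, 5.1, 5.0, 4.8, 5.2, 5.0, 4.9, 5.1])
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

For small samples a Student's t critical value would replace the normal one, which is exactly the kind of interpretation question the discipline trains students to notice.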
The third discipline is closely related to statistics and examines various types of
machine learning and complex data analysis methods. The comparative advantages and
drawbacks of different methods, the underlying mathematics, and various applied
task examples are given. Students learn the main approaches needed for big data
processing, such as modern regression and classification methods, data structure search,
outcome analysis and the basics of Python programming. Two learning paths (basic and
advanced) have been developed for both the “Applied statistics” and “Machine learning”
disciplines, so that students can choose a path depending on their background. The
difference between the two learning paths lies in the software tools used to carry out the
practical assignments of the course (either Python, or MS Excel and MS Azure).
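The regression methods mentioned above can be sketched with the simplest case, ordinary least squares with one feature, in plain Python; this is an illustrative example only (the data points are invented), not the course's own assignment:

```python
# Sketch: closed-form ordinary least squares for simple linear regression,
# fitting y = a*x + b by minimizing the sum of squared errors.
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # OLS estimates: slope = covariance(x, y) / variance(x)
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b

slope, intercept = fit_line([1, 2, 3, 4], [2.1, 3.9, 6.0, 8.1])
print(slope, intercept)
```

In the Python learning path the same fit would typically be delegated to a library; in the MS Excel path, to the built-in trendline tools. The closed form shown here is the mathematics both hide.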
For the last year of the bachelor's degree programme, several electives have been
prepared, focused on applying the studied methods to professional tasks in
different areas. Among others, the following topics are considered: queuing theory,
image processing, technical computing systems, the Internet of things, etc.
As the initial background of master's degree students and their experience in the IT
field differ considerably, and they have only two terms to study, two learning paths
have been developed for them as well (see Fig. 2). The discipline of each term
consists of two courses in both learning paths. The first term starts with the course
“Initial data storage and processing”, which is mandatory for all students regardless of
the chosen learning path. The course discusses initial data processing and data
visualization, as well as large-volume data storage and processing by means of relational
DBMS and NoSQL systems. Afterwards students can take a recommendation test to
assess their knowledge of statistics, which helps them choose their further learning path.

All students must take the course “Introduction to Machine Learning”; depending
on their knowledge of statistics, they can either choose the course “Advanced Machine
Learning” afterwards, or take the course “Elements of statistical data analysis”
prior to “Introduction to Machine Learning”, in that case skipping “Advanced Machine
Learning”.

Fig. 2. The structure of the disciplines' cluster “Digital Culture” for the master's degree
programme. Obligatory courses are marked with a solid line, elective courses (i.e. students must
select two courses each term) with a dashed line. The term numbers are shown in the right
column.

The course “Elements of statistical data analysis” describes the main statistical
methods, such as point and interval estimates, confidence intervals, and statistical
hypothesis testing. The course opens with the necessary basic concepts of
probability theory.
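The hypothesis-testing topic can be illustrated with a one-sample z-test in standard-library Python; this is an assumed example for exposition, not the course's own code, and it takes the population standard deviation as known:

```python
# Sketch: two-sided one-sample z-test for H0: population mean == mu0,
# assuming a known population standard deviation sigma.
from statistics import NormalDist

def z_test_p_value(sample_mean, mu0, sigma, n):
    """Return the two-sided p-value of the z-test."""
    # standardize the observed difference by the standard error
    z = (sample_mean - mu0) / (sigma / n ** 0.5)
    # two-sided tail probability under the standard normal
    return 2 * (1 - NormalDist().cdf(abs(z)))

p = z_test_p_value(sample_mean=5.2, mu0=5.0, sigma=0.5, n=36)
print(f"p-value: {p:.4f}")
```

Interpreting such a p-value correctly, rather than merely computing it, is precisely the skill the course emphasizes.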
The course “Introduction to Machine Learning” is devoted to the types of machine
learning and the applied tasks solved by machine learning methods. The
main attention is paid to regression types, classification techniques, data clustering
tasks, and the comparative analysis of different approaches. Students who take
the basic learning path take this course in the second term.
In the second term master's students take the discipline “Applied artificial
intelligence”, which also consists of two courses. In the advanced learning
path, the discipline includes the course “Advanced Machine Learning”,
which discusses factor analysis methods, variable reduction problems,
multiclass regression and reinforcement learning methods.

Three elective courses of the discipline “Applied artificial intelligence” give
students an overview of artificial intelligence applications in different scopes.
The course “Artificial intelligence in science and business” shows the
application of IT achievements in information security, production automation,
speech synthesis and recognition, as well as knowledge graphs.
The course “Text processing” is devoted to natural language processing
tasks. It discusses information retrieval, language modelling, thesauruses and
ontologies, as well as machine translation.
The “Image processing” course considers artificial vision and the basics of image
processing, the application of neural networks in artificial vision tasks, face and
gesture recognition, and object detection.

1.4 Implementation of the Courses


As mentioned above, the disciplines were implemented using the blended learning
method (see Fig. 3). Most lectures and practical assignments are implemented
online, so students can access them at any time that suits them. The
practical assignments are parametrized and checked automatically on the platform.

Fig. 3. The implementation of the blended learning method in the disciplines’ cluster “Digital
Culture”.

The traditional learning part includes introductory lectures, consultations
(workshops) and masterclasses, and the final exam at the university. Besides, students
can ask the teacher a question by e-mail or in an online discussion forum.
In order to understand which lecture format is most relevant and interesting for
students, a survey was carried out among ITMO University undergraduates. Its
results showed that bachelor's degree students prefer to watch lectures in a cartoon
style, while master's degree students prefer animated presentations. The survey
confirmed that nobody likes “talking heads”, so the lectures were filmed in different
forms for bachelors and for masters.
Each lecture consists of several short video parts not exceeding 10 min. Links to
additional materials are included below the video parts: links to textbooks,
interesting papers, other online courses on the topic, complementary information
on the subject, etc. Students' perception preferences have been taken into
account; therefore all lecture materials are also provided in text format or as
MS PowerPoint presentations.
Ungraded quizzes following each video part help students check their level of
understanding. At the end of each lecture all students must complete scored practical
assignments using the methods and techniques discussed in the lecture. Digital
technology made it possible to implement parameterized exercises with automated
checking. Some exercises can be completed in different ways depending on a student's
background and IT experience, either using basic graphic packages and MS Office, or
using Python scripting. Thus, not only is the knowledge of specific material verified,
but students' skills and abilities of research work with modern software are formed.
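How such parameterized, automatically checked exercises can be organized is sketched below; the function names, the per-student seeding scheme and the sample task are assumptions for illustration, not the actual design of ITMO's platform:

```python
# Hypothetical sketch of a parameterized, auto-graded exercise: each student
# gets a deterministic variant derived from their ID, and the checker
# regenerates the expected answer instead of storing it.
import random

def make_exercise(student_id):
    """Generate a per-student variant of a mean-computation exercise."""
    rng = random.Random(student_id)          # deterministic per student
    data = [rng.randint(10, 99) for _ in range(8)]
    prompt = f"Compute the mean of {data} to two decimal places."
    answer = round(sum(data) / len(data), 2)
    return prompt, answer

def check(student_id, submitted):
    """Automated check: regenerate the variant and compare answers."""
    _, answer = make_exercise(student_id)
    return abs(submitted - answer) < 0.005

prompt, answer = make_exercise(student_id=42)
print(prompt)
print(check(42, answer))
```

Deriving the variant from the student ID means neighbouring students see different numbers, which discourages copying while keeping grading fully automatic.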
Four to six lectures are combined into a course, and after each course students must
complete a more extensive final task, also online. The final test tasks help students
acquire a holistic view of the material studied and consolidate their skills.
Based on the results of all assignments and final tasks, students receive a number of
points that contribute to their final grade. In addition to the online assignments, for
some disciplines an in-person (offline) test in a computer class at the university is
scheduled, in order to check the level of students' knowledge and to verify that
students completed all online assignments themselves, without cheating.
As ITMO University has some educational programmes in English and also many
international students who do not speak Russian, for master's degree students all
disciplines were implemented both in Russian and in English.

1.5 Results
The cluster of disciplines “Digital Culture” was launched in September 2018. Almost
4000 students (both bachelors and masters) enrol at ITMO University every
year. All of them were enrolled in the disciplines' cluster “Digital Culture”.
At the end of the first term in 2018, the students who had studied the disciplines of
the cluster were surveyed. About 16% of masters and 24% of bachelors took part in
this survey. Most students believe that the courses will help them in their future
professional activity, and rate the quality of the course materials as high or good.
Most of those students who contacted technical support were satisfied with the
outcome, while 40% of masters and 62% of bachelors had no need to contact the
support at all. 64% of the surveyed masters and 72% of bachelors did not contact
teachers for clarification, while the other students mostly used e-mail to communicate
with the teacher; the next most popular form of contact was the online forum, where
a student can get answers both from the teacher and from fellow students.
Most undergraduates believe that the course is organized logically and consistently,
and that the information given is clear and well supported by examples. However, for
students specialized in less “technical” fields, whose focus area is management,
international relations or biotechnology, the material seemed rather difficult to
master: they found both the theoretical material and the practical tasks difficult. That
is why the basic learning path was introduced in the current (2019–2020) year of study.
The statistics of the previous year showed that most students coped with the disciplines
excellently (i.e. got the grade “5”).
New topics and information are added to the courses every year, and new tasks are
elaborated. The Higher School of Digital Culture works closely with
the heads of educational programmes of ITMO University in order to understand the
needs of students from different specialties and to elaborate relevant and important
courses for the third year of the bachelor's degree programme.
We also collaborate with other higher schools and companies in Russia and
abroad, as all the methods described in the courses are aimed at practical
implementation, and the cluster of disciplines can be provided to other listeners,
divided into different courses depending on their initial interest. In particular, the
course “Introduction to Digital Culture” has already been provided to students of Ural
Federal University, as well as to secondary school teachers of Russian schools in the
CIS countries.
We see that the new educational approach is confirmed by the students' attitude and
their results. We are planning to increase the number of elective courses to show
students how digital technology and data processing methods are used in different
scopes of science and business. The learning paths are planned to become more
detailed, allowing students to choose the path they need according to their experience
and future requirements.

References
1. Andreyeva, G.N., Badalyanc, S.V., Bogatyreva, T.G., et al.: The Development of the Digital
Economy in Russia as a Key Factor in Economic Growth and Population Life Quality
Improving. Professional Science, Novgorod (2018)
2. Barnard, L., Lan, W.Y., To, Y.M., Paton, V.O., Lai, S.: Measuring self-regulation in online
and blended learning environments. Internet High. Educ. 12(1), 1–6 (2009)
3. Bersin & Associates: Blended Learning: What Works? An Industry Study of the Strategy,
Implementation, and Impact of Blended Learning. Bersin & Associates, Oakland (2003)
4. Bonk, C.J., Graham, C.R.: Handbook of Blended Learning: Global Perspectives, Local
Designs. Pfeiffer Publishing, San Francisco (2006)
5. Dziuban, C., Graham, C.R., Moskal, P., Norberg, A., Sicilia, N.: Blended learning: the new
normal and emerging technologies. Int. J. Educ. Technol. High. Educ. 15(3) (2018). https://
doi.org/10.1186/s41239-017-0087-5
6. Negroponte, N.: Wired 3.02 (1995). http://web.media.mit.edu/~nicholas/Wired/WIRED3-02.html
7. Mikhailova, E.G.: The cluster of disciplines “Digital Culture” in the educational
programmes of undergraduate and graduate programmes at ITMO University. In: Modern
Education: Content, Technologies, Quality, vol. 1, pp. 98–100. St. Petersburg State
Electrotechnical University ETU (2018)
8. Novikov, A.M.: The Basics of Pedagogics. Egves, Moscow (2010)
9. Programme “Digital Economy of the Russian Federation”, approved by the Order of the
Russian Federation Government of 28.07.2017 №1632-PC (2017)
10. Reay, J.: Blended learning—a fusion for the future. Knowl. Manag. Rev. 4(3), 6 (2001)
11. Staker, H., Horn, M.B.: Classifying K-12 Blended Learning. Innosight Institute, Christensen
Institute, Mountain View (2012). www.christenseninstitute.org
ICT Impact in Orientation and University
Tutoring According to Students Opinion

Antonio Pantoja Vallejo, Beatriz Berrios Aguayo(&),
and María Jesús Yolanda Colmenero Ruiz

University of Jaen, Jaen, Spain
{apantoja,bberrios,mjruiz}@ujaen.es

Abstract. Information and communication technologies are increasingly
present in university orientation and tutoring. The objective of this study is to
determine the needs, as perceived by students, regarding teachers' use of ICT.
A mixed-methods study was carried out with 2779 students in the quantitative
part and 51 students in the qualitative part, from different European universities.
A scale created for the project behind this research was used, together with
several discussion groups. The results of both analyses show that university
students demand more ICT skills from teachers, as well as more direct virtual
contact through different networks. In conclusion, students require an extra
effort from the teaching staff regarding the use of ICT in guidance and tutoring.

Keywords: ICT · Orientation · University

1 Introduction

This article is framed within the Timonel R&D Excellence Project (Ref. EDU2016-75892-P), which focuses on building a recommendation system based on the orientation and mentoring needs of undergraduates and graduates, covering their academic, personal, professional and ICT orientation. Prior to its creation, three objectives were established: to analyse the needs of students and teachers in relation to the orientation and tutorial function in European universities; to determine the factors and elements that, according to the teaching staff, define the quality of the tutorial action plan; and to detect good practices in the tutorial action plans established in European universities since the first decade of the 2000s (Álvarez González 2012).
Tutoring is currently a strategic factor for the quality of the EHEA educational model, since its actions can improve student access and adaptation processes, optimize training, prevent the abandonment of studies and support professional development (Álvarez 2012; Pantoja and Campoy 2005). Good tutoring practice at university implies a new learning approach, in which tutorial practice becomes an element of teaching quality and an essential requirement to respond to the demands of university students (Durán and Estay-Niculcar 2016; Pantoja 2005).

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 245–252, 2020.
https://doi.org/10.1007/978-3-030-45697-9_24
246 A. Pantoja Vallejo et al.

Tutoring is a truthful, confidential accompaniment process on personal, academic and professional matters, and it must take place in an interactive, collaborative, democratic and non-directive context based on genuine dialogue. Given the new demands of the current university, this is where ICT comes in, as an essential part of the tutoring process, to achieve quality and successful learning (Fernández-Salinero, González and Berlando 2017).
The 21st-century university considers ICT and university tutoring as quality factors and includes both elements in its internal quality assurance systems. Given the flexibility required of Higher Education, it is important to resume university tutoring as a complement to the teaching function in its different modalities (personal, group and virtual) and with an anticipatory and comprehensive character.
The impact of ICT on education and tutoring depends on the context and purpose of its use and on how effectively teachers and students apply it in the university context. These factors ultimately determine the greater or lesser impact of ICT on tutorial practice and its capacity to transform teaching and improve learning (Coll and Moreno 2008).
Virtual tutorials have gained increasing value. Zambrano and Zambrano (2019) found that the methodological strategies students always use are individual work and debates or discussion forums, while those used only sometimes are face-to-face tutoring, individual tutorials combined with student presentations, and lectures combined with individual tutoring. Martínez, Pérez and Martínez (2016), using an ad hoc questionnaire administered to a representative, stratified sample of 976 students of the Faculty of Education, detected that virtual tutoring is the most used despite being the least valued, either through ignorance or because the direct, personal relationship is considered the most valuable for academic development.
Until now, no recommendation system has been created that responds to the guidance and tutoring needs of Higher Education students. The system presented by the TIMONEL project is innovative, proving to be an essential tool in the area of educational guidance. Educational recommendation systems aim to provide accurate information to students according to their preferences, user profile and learning objectives (Bustos López et al. 2016). Our recommendation system therefore tries to respond to the orientation needs of undergraduates by analysing their personal profiles and making their recommendations useful for other users, thus obtaining a feedback loop.
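The profile-matching idea behind such a recommendation system can be sketched as a nearest-neighbour lookup over need profiles. The names, profile dimensions and scores below are purely illustrative, not data or logic from the TIMONEL system.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two need-profile vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical student profiles over (academic, personal, professional, ICT) needs
profiles = {"ana": [5, 1, 2, 4], "ben": [4, 1, 3, 5], "eva": [1, 5, 4, 1]}
target = [5, 1, 2, 5]  # a new student's declared needs

# Reuse the recommendations that helped the most similar existing profile
best = max(profiles, key=lambda name: cosine(profiles[name], target))
print(best)
```

Feeding each user's accepted recommendations back into the profile store is what closes the feedback loop mentioned above.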
Taking into account the above, the scarce literature on recommendation systems in educational guidance, and the need to analyse the needs detected by undergraduates, the objective of the present investigation is to determine the needs perceived by students regarding teachers' use of ICT in their educational guidance function, as well as to analyse the factors and elements perceived by teachers that enable quality guidance. Thus, the main research hypothesis is the existence of unmet guidance needs related to ICT.
ICT Impact in Orientation and University Tutoring 247

2 Method
2.1 Design
This study adopts a mixed perspective based on triangulation of the collected data. This type of method combines the qualitative and the quantitative approach, increasing the research potential (Creswell and Zhang 2009). The qualitative part analysed the results obtained in discussion groups. The quantitative phase follows a descriptive-exploratory design through two ad hoc validated questionnaires for undergraduates.

2.2 Participants
A total of 2779 students completed the questionnaire. The students belonged to the universities of Jaen and Granada in Spain (UJA, UGR), the Polytechnic of Coimbra in Portugal (PIC) and Queen Mary University of London (QML). These universities were selected because they participate in the R&D project, and only the faculties and degrees common to all four were considered, so the conclusions can be extended given the prior context analysis.
For the quantitative study, the sample was drawn by proportional stratified random sampling (except at PIC and QML, where it was intentional owing to the variability of the degrees and, at QML, the difficulty of accessing the subjects), with a calculated error of 5% and according to the variables included in Table 1. In the cases of UJA and UGR, only the common degrees are taken into consideration.
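A sample size with a 5% margin of error and a proportional allocation across strata can be approximated as follows. The enrolment figures are hypothetical and the exact formula the authors used is not stated, so this is only a sketch of the standard finite-population calculation.

```python
import math

def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Minimum sample size for a proportion, with finite-population correction."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2           # infinite-population size
    return math.ceil(n0 / (1 + (n0 - 1) / population))  # finite-population correction

def proportional_allocation(strata, total_n):
    """Split the total sample across strata in proportion to their size."""
    pop = sum(strata.values())
    return {name: round(total_n * size / pop) for name, size in strata.items()}

# Hypothetical enrolment figures, not the study's populations
strata = {"UJA": 15000, "UGR": 45000}
n = sample_size(sum(strata.values()))
print(n, proportional_allocation(strata, n))
```

With 95% confidence (z = 1.96) and the most conservative proportion (p = 0.5), any population above a few tens of thousands needs roughly 380 respondents.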
The participants in the discussion groups were selected intentionally according to the research needs (Table 2).

Table 1. Quantitative sample selection

Grade         Sex     UJA   UGR   PIC   QML
2nd year      Male    141   151   108   5
              Female  202   225   239   21
4th year      Male    132   140   34    1
              Female  206   234   120   5
Postgraduate  Male    142   169   7     1
              Female  192   228   75    1
Total                 1015  1147  583   34
248 A. Pantoja Vallejo et al.

Table 2. Qualitative sample selection

Technique         Group (grades)      UJA  UGR  PIC  QML
Discussion group  1st/2nd year        4    14   –    6
                  4th year/Graduates  7    11   3    6

2.3 Instruments
The scale La práctica orientadora y tutorial en el alumnado y egresados universitarios (POTAE-17) was designed for the quantitative part of the project. POTAE-17 has four dimensions: academic, personal and professional orientation, and use of ICT. Only the dimension referring to the use of ICT in university tutoring is considered here.
The scale has a Likert format with 5 response options (from totally disagree to totally agree) and 61 items, plus the item «What overall score do you give to the use made of ICT in student orientation? (from 0 to 10)». The psychometric characteristics of the tests, based on content validity, robustness and reliability, guarantee trustworthiness, replicability and internal consistency. Cronbach's alpha for POTAE-17 reached .87, with KMO = .853 and a significant Bartlett's sphericity test (χ² = 6701.698; p = .000). In addition, four factors were extracted using the Kaiser criterion, matching the theoretical model proposed in the Confirmatory Factor Analysis.
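The reliability figure can be reproduced with the standard formula for Cronbach's alpha, α = k/(k−1) · (1 − Σ var_item / var_total). The toy Likert matrix below is illustrative, not the POTAE-17 data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for rows of respondents x columns of items (sample variances)."""
    k = len(scores[0])
    item_var = sum(variance(col) for col in zip(*scores))  # sum of per-item variances
    total_var = variance([sum(row) for row in scores])     # variance of total scores
    return (k / (k - 1)) * (1 - item_var / total_var)

# Toy 1-5 Likert responses (4 respondents x 3 items), not the real dataset
data = [[4, 5, 4], [3, 3, 3], [5, 5, 4], [2, 3, 2]]
print(round(cronbach_alpha(data), 3))
```

Values above .80, such as the .87 reported for POTAE-17, are conventionally read as good internal consistency.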
For the qualitative analysis, discussion groups were held with the selected students from different years. In these groups, questions were raised about the needs students encountered in terms of academic, personal and professional orientation and the use of ICT.

2.4 Procedure
Firstly, the sample size was calculated for each university. Once the number of students to whom the scale would be administered was established, the professors were asked for access to their classrooms; the scale took about 10 min to answer. Subsequently, students from 2nd and 4th year, master's, doctorate and graduate levels were randomly chosen for the discussion groups. Finally, quantitative and qualitative data were analysed independently in order to reach common conclusions.

2.5 Data Analysis


Quantitative data were analysed with the SPSS v.21 software, using a descriptive analysis of frequencies and percentages. For the qualitative data, the audiovisual material collected was transcribed and then analysed with the "Content Analysis" technique, supported by the NVivo v.11 software.
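The frequency-and-percentage analysis reported from SPSS can be mirrored in a few lines; the answer vector below is hypothetical, not the study's responses.

```python
from collections import Counter

# Hypothetical answers for one Likert item (1 = totally disagree ... 5 = totally agree)
answers = [5, 4, 5, 3, 5, 2, 4, 5, 1, 5]

freq = Counter(answers)
n = len(answers)
for value in sorted(freq):
    print(f"option {value}: frequency {freq[value]}, {100 * freq[value] / n:.1f}%")
```

This is exactly the frequency/percentage pair reported for each item and response option in Tables 3 and 4.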
3 Results
3.1 Quantitative Study
First, a descriptive study was carried out relating students' academic year to the frequency with which the option "Totally agree" was selected.
The item most frequently endorsed is "On the university platform there is information related to my subjects", especially among 2nd-year students; among 4th-year students it is "I use email in tutoring". In this sense, García (2010) states that, for students, technology-based tools such as email and the virtual campus, with the different tools it includes, are gaining ground over more traditional media such as the telephone, as evidenced by their much higher frequency of use. On the other hand, the item for which "Totally agree" was selected least frequently is "I use videoconferencing (Skype or similar) in my tutorials" among 4th-year students, followed by "In general, teachers have a social network (Facebook or similar) with their supervised students" among 2nd-year students (Table 3).

Table 3. Frequency and percentages in relation to the academic grade of the response "Totally agree" (value 5)

Item                                                    Freq: 2°/4°/Postgrad   %: 2°/4°/Postgrad
Classes promote the mastery of ICT                      119/106/138            36.6/32.0/28.3
I know the job search online                            226/188/236            34.8/28.9/36.3
Teachers, in general, have a professional website       214/130/124            45.7/27.8/26.5
The teacher's website is up to date                     166/98/64              50.6/29.9/25.2
On the university platform there is information
related to my subjects                                  392/287/225            43.4/31.7/24.9
I use email in tutoring                                 292/294/292            33.3/33.5/33.3
I use videoconferencing (Skype or similar) in my
tutorials                                               19/18/31               27.9/26.5/45.6
In general, teachers have a social network (Facebook
or similar) with their supervised students              24/27/29               30.0/33.8/36.3
The class group has a WhatsApp group in which
teachers participate                                    23/20/33               30.3/26.3/43.4
(continued)
Table 3. (continued)

Item                                                    Freq: 2°/4°/Postgrad   %: 2°/4°/Postgrad
I have a specific forum in the university platform      172/125/132            40.1/29.1/30.8
There is a repository of digital resources at our
disposal                                                156/112/134            38.8/27.9/33.3
I have a list of links to Web pages that help me as
guidance in the subjects                                133/88/110             40.2/26.6/32.2
I know resources or digital networks about my studies   138/85/130             39.1/24.1/36.8
I am informed of the possibilities of teleworking       49/34/42               39.2/27.2/33.6

Taking the universities as the variable of interest, the item "On the university platform there is information related to my subjects" is the one most frequently rated "Totally agree" at UJA, UGR and QML, while "I know the job search online" is the most valued at PIC. The items least valued by students across the universities are "I use videoconferencing (Skype or similar) in my tutorials" (UJA, PIC and QML) and "In general, teachers have a social network (Facebook or similar) with their supervised students" (UGR) (Table 4).

Table 4. Frequency and percentages in relation to universities of the response "Totally agree" (value 5)

Item                                                    Freq: UJA/UGR/PIC/QML   %: UJA/UGR/PIC/QML
Classes promote the mastery of ICT                      179/118/62/4            49.3/32.5/17.1/1.1
I know the job search online                            271/118/134/20          41.7/34.6/20.6/3.1
Teachers, in general, have a professional website       191/165/107/5           38.8/39.8/22.9/1.1
The teacher's website is up to date                     140/113/72/3            42.7/34.5/22.0/0.8
On the university platform there is information
related to my subjects                                  435/294/101/25          48.1/32.5/15.3/5.2
I use email in tutoring                                 402/360/97/19           45.8/41.0/11.0/2.2
(continued)
Table 4. (continued)

Item                                                    Freq: UJA/UGR/PIC/QML   %: UJA/UGR/PIC/QML
I use videoconferencing (Skype or similar) in my
tutorials                                               20/32/15/1              29.4/47.1/22.1/1.5
In general, teachers have a social network (Facebook
or similar) with their supervised students              34/29/17/0              42.5/36.3/21.3/0.0
The class group has a WhatsApp group in which
teachers participate                                    37/31/25/1              48.7/40.8/19.5/0.8
I have a specific forum in the university platform      204/157/58/10           47.6/36.6/13.5/2.3
There is a repository of digital resources at our
disposal                                                184/142/64/12           45.8/35.3/15.9/3.0
I have a list of links to Web pages that help me as
guidance in the subjects                                146/120/54/11           44.1/36.3/16.3/3.3
I know resources or digital networks about my studies   163/124/57/9            46.2/35.1/16.1/2.5
I am informed of the possibilities of teleworking       54/44/25/2              43.2/35.2/20.0/1.6

3.2 Qualitative Study


Among the main ideas expressed by the students in the discussion groups are the following:
• Students are familiar with teleworking, online work management, social networks, etc.
• Professors do not sufficiently promote job searching through the Internet, social networks, etc.
• Professors provide guidance information in electronic format on their professional websites, the teaching platform, social networks, etc.
• For guidance, they use different ICT tools or resources (teaching platform, forums, social networks, etc.).
• In class, they promote ICT.
• Students have a WhatsApp group in which teachers participate.
4 Conclusions

The quantitative analysis shows that students at the different universities want greater contact with their tutor through videoconferencing, so that they can receive guidance at any place and time. In addition, students demand a link with their tutor through social networks, which they consider tools that need to be incorporated into the academic context.
On the other hand, the qualitative analysis shows that students feel that virtual platforms do adequately inform them about the content of the different areas of knowledge and about the guidance services that universities offer. Among communication channels, email remains the tool that both tutors and students use most frequently.
In conclusion, students wish to be oriented through more up-to-date communication channels, and they expect the teaching staff to develop further competences in order to meet their needs in terms of technological development.
As a new approach, it is proposed to establish guidelines so that university tutors can offer better guidance services to students through the use of ICT.

References
Álvarez, P.: Los planes de tutoría de carrera: una estrategia para la orientación al estudiante en el
marco del EEES. Educar 48(2), 247–266 (2012)
Bustos López, M., Hernández Montes, A.J., Vásquez Ramírez, R., Alor Hernández, G., Zatarain
Cabada, R., Barrón Estrada, M.L.: EmoRemSys: Sistema de recomendación de recursos
educativos basado en detección de emociones. RISTI. Revista Ibérica de Sistemas y
Tecnologías de Información 17, 80–95 (2016). https://doi.org/10.17013/risti.17.80-95
Coll, C., Moreno, C.: Psicología de la Educación Virtual. Morata, Madrid (2008)
Creswell, J.W., Zhang, W.: The application of mixed methods designs to trauma research.
J. Trauma. Stress.: Off. Publ. Int. Soc. Trauma. Stress. Stud. 22(6), 612–621 (2009)
Durán, R., Estay-Niculcar, C.A.: Las buenas prácticas docentes en la educación virtual
universitaria. REDU. Revista de Docencia Universitaria 14(2), 159–186 (2016).
https://doi.org/10.4995/redu.2016.5905
Fernández-Salinero, C., González, M.R., Berlando, M.R.: Mentoría pedagógica para profesorado
universitario novel: estado de la cuestión y análisis de buenas prácticas. Estudios sobre
educación 33, 49–75 (2017)
García, B.: La tutoría en la universidad de Santiago de Compostela: percepción y valoración de
alumnado y profesorado. Tesis doctoral. Universidad de Santiago de Compostela. (2010)
Martínez Clares, P., Pérez Cusó, J., Martínez Juárez, M.: Las TICS y el entorno virtual para la
tutoría universitaria. Educación XX1 19(1), 287–310 (2016).
https://doi.org/10.5944/educxx1.13942
Pantoja, A.: La acción tutorial en la universidad: propuestas para el cambio. Cultura y Educación
17(1), 67–82 (2005)
Zambrano, D.L., Zambrano, M.S.: Las Tecnologías de la Información y las Comunicaciones
(TICs) en la educación superior: consideraciones teóricas. Revista Electrónica Formación y
Calidad Educativa (REFCalE) 213–228 (2019)
Blockchain Security and Privacy in Education:
A Systematic Mapping Study

Attari Nabil1(✉), Khalid Nafil1, and Fouad Mounir2

1 Software Project Management Research Team, ENSIAS,
Mohammed V University, Rabat, Morocco
[email protected], [email protected]
2 ENFI, Sale, Morocco
[email protected]

Abstract. Blockchain is a technology that promotes security, transparency and authenticity. It is defined as a distributed ledger technology with a consensus algorithm. Its first applications were in the financial sector, with Bitcoin, Ethereum and other cryptocurrencies, but the concept has now spread to other sectors such as Healthcare, Governance and Education.
In the education sector, all stakeholders need a reliable technology that improves the education system and takes it to the next level. Blockchain has all the key characteristics to do just that (distributed ledger, no third parties, chronological timestamps, cryptographic sealing, consensus based), but to be used at larger scale it has to give assurances of data privacy and data security.
This Systematic Mapping Study (SMS) surveys existing papers on blockchain security and privacy and their application in Education, in order to provide a classification and to explore the artifacts and domains.

Keywords: Education · Blockchain · Privacy · Security · Systematic Mapping Study

1 Introduction

In the last decade, blockchain became famous, beginning in the financial sector with Bitcoin, Ethereum and other cryptocurrencies, before spreading to other sectors such as Education. It is a technology that secures the settlement of transactions using cryptography, creating blocks that are broadcast to the blockchain network.
The technology promotes security and privacy, making it difficult to alter content, but the risk of data leakage is not negligible [5, 8]. A systematic review of blockchain cyber security published in 2019 concluded that 45% of the studies concerned IoT and only 7% data privacy [19]. We hope that this study will help to join security and privacy with the application of blockchain in Education.
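The tamper-evidence attributed to blockchain above, where each block cryptographically seals its content, timestamp and predecessor, can be illustrated with a minimal sketch. This is a toy hash chain, not a real consensus network, and the record contents are invented.

```python
import hashlib
import json
import time

def make_block(data, prev_hash):
    """Seal the block's data, timestamp and predecessor hash into its own hash."""
    block = {"data": data, "prev": prev_hash, "ts": time.time()}
    body = json.dumps({k: block[k] for k in ("data", "prev", "ts")}, sort_keys=True)
    block["hash"] = hashlib.sha256(body.encode()).hexdigest()
    return block

def valid(chain):
    """Recompute every hash and back-link; any alteration breaks the chain."""
    for i, b in enumerate(chain):
        body = json.dumps({k: b[k] for k in ("data", "prev", "ts")}, sort_keys=True)
        if b["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        if i and b["prev"] != chain[i - 1]["hash"]:
            return False
    return True

chain = [make_block("genesis", "0")]
chain.append(make_block("record: diploma no. 123", chain[-1]["hash"]))
print(valid(chain))                            # the intact chain validates
chain[1]["data"] = "record: diploma no. 999"   # tampering with a stored record...
print(valid(chain))                            # ...is detected on re-validation
```

A real network adds distribution and consensus on top of this chaining, which is what makes undetected alteration hard rather than merely detectable.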
Many studies have shown that blockchain is applicable to Education: dAppER, an automated decentralized application for examinations [15]; SmartCert, which guarantees data security and confidentiality using cryptographically sealed records [13];
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 253–262, 2020.
https://doi.org/10.1007/978-3-030-45697-9_25
254 A. Nabil et al.

and CredenceLedger, a permissioned blockchain platform proposed for verifying academic credentials [2], which relies on built-in security features. Few of these works, however, were concerned with data protection.
This Systematic Study is concerned with data protection, which is achieved by securing data and managing privacy and which blockchain technology can implement, for instance to improve the protection of intellectual property [3].
Data security and privacy are the processes of protecting data and securing it from intrusion and unauthorized access. One proposed solution manages educational records with tamper-proof security and a decentralized structure [10]; a blueprint and a framework have been published for developing smart contracts that automate the execution of agreements [4, 8]; and crypto-governance has been proposed to secure document storage and avoid fraud in an Indonesian private university [18].
Data privacy needs to be guaranteed in academic systems to support large-scale adoption of the technology; because the data will be dispatched to various nodes, both privacy and security must be ensured [17].
This systematic mapping study brings attention to security issues in the Education sector, some related to the blockchain itself and others to resource and data management.
The paper is structured as follows. The first section introduces the research on blockchain security and privacy and its application in Education. Section 2 describes the methodology. Section 3 presents the results of this SMS and answers the research questions. Section 4 is the discussion, before concluding in Sect. 5.

2 Method

In this SMS we followed the process of [12], in which we identified five steps.

1. Definition of research questions: the goal is to identify the quantity and type of research and results, reflected in the research questions.
2. Conduct search for primary studies: primary studies are found by searching scientific databases or by browsing manually through relevant conference proceedings or journal publications.
3. Screening of papers for inclusion and exclusion: criteria are used to exclude studies that are irrelevant to answering the research questions.
4. Keywording of abstracts: abstracts are read to identify keywords and concepts, which are then combined across papers to better understand the contribution of the research.
5. Data extraction and mapping of studies: an Excel table documents the data extraction, helping us to analyse the results and to identify gaps and possibilities for improvement.
Blockchain Security and Privacy in Education 255

2.1 Definition of Research Questions


Based on previously published systematic reviews [1, 9] and the process described in [12], we asked the following questions to provide an overview of research articles where Education joins blockchain security. Our research questions (RQ) are shown in Table 1.

Table 1. Research questions


ID Research question
RQ1 How many research papers are produced?
RQ2 How are publications distributed across countries?
RQ3 What is the quality of relevant articles according to conference and/or journal rank?
RQ4 What are the obtained domains and artifacts?

2.2 Conduct Search for Primary Studies


During the search, we first looked in Google Scholar for other systematic reviews about the application of blockchain in Education, then searched for primary studies by applying a search string to four scientific databases: IEEE Xplore, ACM Digital Library, SpringerLink and ScienceDirect. The search covered the fields of Computer Science and Software Engineering.
The search string shown in Table 2 was applied to titles, keywords and abstracts.

Table 2. Search string

Segment     Search string
Education   EDUCATION AND
Blockchain  BLOCKCHAIN AND
Security    SECURITY OR PRIVACY
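Applied programmatically, the Table 2 query amounts to a boolean filter over titles, keywords and abstracts. The paper titles below are invented for illustration.

```python
def matches(text):
    """Boolean filter equivalent to: EDUCATION AND BLOCKCHAIN AND (SECURITY OR PRIVACY)."""
    t = text.lower()
    return "education" in t and "blockchain" in t and ("security" in t or "privacy" in t)

# Invented titles for illustration
papers = [
    "Blockchain for student data privacy in education",
    "Blockchain consensus algorithms: a survey",
    "Security of e-learning platforms in higher education",
]
print([p for p in papers if matches(p)])
```

Only the first title satisfies all three segments; the others miss either the education or the blockchain term and would be filtered out before screening.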

2.3 Screening of Papers for Inclusion and Exclusion


In this step we began by screening all titles before reading the abstracts, looking for studies that could contribute to our context. We excluded studies that did not focus on data security or privacy in blockchain for the education sector, as described in Table 3.
Table 3. Inclusion and exclusion criteria

Inclusion criteria:
• The paper cites blockchain in education
• The paper cites the security or privacy aspect in blockchain
• English scientific journal and conference papers

Exclusion criteria:
• The paper cites blockchain without contributing in education

3 Results: Data Extraction, Analysis and Synthesis

In this section we use the stored data about the relevant articles to answer our research questions.
RQ1 - How Many Research Papers are Produced?
We found 452 papers in total: 66% from ScienceDirect, 15% from IEEE Xplore, 14% from SpringerLink and only 5% from ACM Digital Lib, as shown in Fig. 1 and Table 4. All the publications were found between 2014 and 2020.

Fig. 1. Percentage of articles found by the search string (ScienceDirect 66%, IEEE Xplore 15%, SpringerLink 14%, ACM Digital Lib 5%)

After the exclusion process with the criteria in Table 3, only papers from 2018 and 2019 remained relevant to our research: 56% of them were in IEEE Xplore, 19% in ScienceDirect, 13% in SpringerLink and 12% in ACM Digital Lib. We can see that the concern for security and privacy in education began in 2018; the first publication on blockchain in education addressing data privacy appeared in January 2018 and

Table 4. Found and relevant articles by library

Library          # of papers found   # of relevant papers   Studies
ACM Digital Lib  24                  2                      [7, 10]
SpringerLink     63                  2                      [3, 14]
IEEE Xplore      66                  9                      [2, 4, 6, 8, 11, 13, 15, 17, 18]
ScienceDirect    299                 3                      [5, 16, 19]
Total            452                 16

Fig. 2. Evolution of publications per year

Fig. 3. Libraries in which the selected articles have been published (IEEE Xplore 56%, ScienceDirect 19%, SpringerLink 13%, ACM Digital Lib 12%)

proposed “a framework utilizing Hyperledger fabric and Hyperledger composer to grant authorization right for data access relying on smart contracts” [8] (Figs. 2 and 3).
RQ2 - How Are Publications Distributed Across Countries?
The aim of this answer is to identify the countries whose laboratories are interested in blockchain in education, as shown in Fig. 4. The USA and India contribute the most relevant papers, but since this is a new subject the picture may change next year as more research discusses blockchain security in Education.
Fig. 4. Countries of origin of publications (India 28%, USA 18%; Brazil, Germany, Indonesia, Jordan, Oman and the United Kingdom 9% each)

RQ3 - What Is the Quality of Relevant Articles According to Conference and/or Journal Rank?
We evaluated each article according to a score assigned to the conference or journal where it was published: the Scimago Journal & Country Rank for scientific journals and the CORE Conference Ranking for conferences.
To standardize the ranking between journals and conferences, we defined a new score based on the following nomenclature (Table 5 shows the results obtained):

Table 5. Quality assessment

Score  Studies
5      [5, 14, 19]
4      [4, 6, 18]
3      [10]
2      –
1      –
0      [2, 3, 7, 8, 11, 13, 15–17]

• For journals: (+5) if the journal ranking is Q1, (+4) if Q2, (+3) if Q3, (+2) if Q4, and (+1) for others.
• For conferences: (+5) if the conference is CORE A, (+4) if CORE B, (+3) if CORE C, (+2) if CORE D, (+1) if the conference is not ERA ranked but is ranked according to Qualis, and (+0) for others.
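The nomenclature above can be expressed as a small lookup. The venue types and ranks passed in below are illustrative inputs, not data from the study.

```python
# Illustrative mapping of venue ranks to the unified 0-5 score described above
JOURNAL_SCORE = {"Q1": 5, "Q2": 4, "Q3": 3, "Q4": 2}             # Scimago quartiles
CONFERENCE_SCORE = {"A": 5, "B": 4, "C": 3, "D": 2, "Qualis": 1}  # CORE / Qualis ranks

def score(venue_type, rank):
    """Return the unified score; unknown ranks fall back to the 'others' value."""
    if venue_type == "journal":
        return JOURNAL_SCORE.get(rank, 1)   # (+1) for other journals
    return CONFERENCE_SCORE.get(rank, 0)    # (+0) for other conferences

print(score("journal", "Q1"), score("conference", "B"), score("conference", None))
```

Making the fallback explicit is what lets journal and conference papers share the single 0-5 column of Table 5.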
RQ4 - What Are the Obtained Domains and Artifacts?
This question summarizes the research previously done by domain (Table 6, Fig. 5) and by artifact (Table 7, Fig. 6). We were able to identify gaps and opportunities and, as in [12], we report frequencies as shown in Fig. 7.

Table 6. Representative domains

Domain                Studies
Degree verification   [2, 7, 14, 17]
Academic certificate  [2, 6, 11, 13, 14, 17]
Data privacy          [2–5, 8, 10, 11, 16, 18]
Data security         [2, 3, 6, 7, 10, 11, 13, 15–17, 19]
Authorization         [8]
Authentication        [13]
Smart contract        [4, 8, 16, 19]
Cryptography          [5, 11, 15, 18]

Fig. 5. Application domains



Table 7. Artifacts

Artifact          Studies
Analysis          [3–5, 11, 16, 19]
Case study        [2, 6, 13, 14]
Framework         [8, 10, 17]
Proof of concept  [15, 18]
System design     [7]

Fig. 6. Proposed artifacts (Analysis 37%, Case study 25%, Framework 19%, Proof of concept 13%, System design 6%)

Fig. 7. Coverage of the domains of blockchain security and privacy in education by the proposed artifacts

4 Discussion

This SMS was conducted to investigate blockchain security and privacy and their application in the Education field. We selected 16 relevant papers framed around three major themes: blockchain in Education, security, and privacy. To the authors' knowledge, this is the first systematic study on this topic. Most of the studies we found date from 2018 and 2019, even though the search string was designed to retrieve all papers from 2014 onwards; most earlier papers did not address the questions of blockchain security and privacy in Education. Concerning the artifacts, 37% of the papers were analyses and 25% case studies, but only 13% were proofs of concept, which shows that we are at the beginning of the application of data security and data privacy to blockchain in Education.

5 Conclusion and Future Scope

Blockchain technology is a great asset to Education: it will help to digitalize schools and universities, to reduce costs and to secure student data. Many studies address the application of blockchain in Education but, as this Systematic Study has shown, only a few are concerned with data protection. The data and information stored in a blockchain will be targeted by many malicious actors and will be at risk. This Systematic Study was a first step towards an overview of the state of the art in blockchain security and privacy and their application in education; our future work will focus on blockchain vulnerabilities in data protection and data rights management and on the possibilities of applying them to blockchain in Education.

References
1. Alammary, A., Alhazmi, S., Almasri, M., Gillani, S.: Blockchain-based applications in
education: a systematic review. Appl. Sci. 9(12), 2400 (2019)
2. Arenas, R., Fernandez, P.: CredenceLedger: A Permissioned Blockchain for Verifiable
Academic Credentials (2018)
3. Chen, G., Xu, B., Lu, M., Chen, N.S.: Exploring blockchain technology and its potential
applications for education. Smart Learn. Environ. 5(1), 1 (2018)
4. Farah, J.C., Vozniuk, A., Rodríguez-Triana, M.J., Gillet, D.: A Blueprint for a Blockchain-
Based Architecture to Power a Distributed Network of Tamper-Evident Learning Trace
Repositories (2018)
5. Feng, Q., He, D., Zeadally, S., Khan, M.K., Kumar, N.: A survey on privacy protection in
blockchain system. J. Netw. Comput. Appl. 126, 45–58 (2019)
6. Franzoni, A.L., Cárdenas, C., Almazan, A.: Using Blockchain to Store Teachers’
Certification in Basic Education in Mexico (2019)
7. Ghaffar, A., Hussain, M.: BCEAP - A Blockchain Embedded Academic Paradigm to
Augment Legacy Education through Application (2019)
8. Gilda, S., Mehrotra, M.: Blockchain for Student Data Privacy and Consent (2018)
9. Yumna, H., Khan, M.M., Ikram, M., Ilyas, S.: Use of blockchain in education: a systematic
literature review. In: Intelligent Information and Database Systems, January 2019
262 A. Nabil et al.

10. Han, M., Li, Z., He, J., Wu, D., Xie, Y., Baba, A.: A Novel Blockchain-based Education
Records Verification Solution (2018)
11. Al Harthy, K., Al Shuhaimi, F., Al Ismaily, K.K.J.: The Upcoming Blockchain Adoption in
Higher-Education: Requirements and Process (2019)
12. Petersen, K., Feldt, R., Mujtaba, S., Mattsson, M.: Systematic mapping studies in software
engineering. In: Proceedings of the 12th International Conference on Evaluation and
Assessment in Software Engineering, June 2008
13. Kanan, T., Obaidat, A.T., Al-Lahham, M.: SmartCert BlockChain Imperative for Educa-
tional Certificates (2019)
14. Lizcano, D., Lara, J.A., White, B., Aljawarneh, S.: Blockchain-Based Approach to Create a
Model of Trust in Open and Ubiquitous Higher Education (2019)
15. Mitchell, I., Hara, S., Sheriff, M.: dAppER: Decentralised Application for Examination
Review (2019)
16. Mohanta, B.K., Jena, D., Panda, S.S., Sobhanayak, S.: Blockchain Technology: A Survey on
Applications and Security Privacy Challenges (2019)
17. Srivastava, A., Bhattacharya, P., Singh, A., Mathur, A., Prakash, O., Pradhan, R.: A
Distributed Credit Transfer Educational Framework based on Blockchain (2018)
18. Taufiq, R., Trisetyarso, A., Kosala, R., Ranti, B., Supangkat, S., Abdurachman, E.:
Robust Crypto-Governance Graduate Document Storage and Fraud Avoidance Certificate in
Indonesian Private University, August 2019
19. Taylor, P.J., Dargahi, T., Dehghantanha, A., Parizi, R.M., Choo, K.K.R.: A Systematic
Literature Review of Blockchain Cyber Security, February 2019
The Development of Pre-service Teacher’s
Reflection Skills Through Video-Based
Classroom Observation

Ana R. Luís(&)

University of Coimbra/CELGA-ILTEC, Coimbra, Portugal


[email protected]

Abstract. In initial teacher education, effective teaching practice is often only available in the 2nd year of the MA program, after pre-service teachers have completed their (theoretically-oriented) first year. Typically, no prior preparation for the direct exposure to teaching practice is available. A pedagogical experiment was therefore conducted, aimed at filling this gap by engaging students in video-enhanced classroom observation during the first year of their initial teacher education program. This paper lays out the pedagogical experiment and reports on the students' perception of the benefits of classroom observation for the development of their teaching and reflection skills. Overall, the study reveals that preparation for the internship is significantly enhanced by the use of video, which allows students to adjust their observation to their own pace and rhythm and encourages joint analysis and discussion within a collaborative learning context.

Keywords: Classroom observation · Digital video · Language teaching · Teacher education · Teacher reflection · Foreign languages

1 Background

In this section we survey background literature on initial teacher education (Sect. 1.1)
and discuss the pedagogical motivation underlying the use of video-based classroom
observation (Sect. 1.2).

1.1 Initial Teacher Education


Within the context of initial teacher education, there is often a need to ensure that training programs help bridge the gap between theoretical content and teaching practice [1, 2]. As such, teacher preparation should focus not only on the development of language skills but also on the ability to apply pedagogical knowledge to teaching practice [3]. Generally, however, MA programs tend not to offer such a balance, making it more difficult for students to transfer didactic content to real teaching contexts: during the first year, students concentrate on areas such as pedagogy and didactics, while the second year is dedicated to developing pedagogical practice during their internship [4]. In order to offer MA students earlier contact with teaching practice, experiments involving micro-teaching have been developed to help narrow the gap between theory and practice [5]. Through micro-teaching, students are given the opportunity to experiment with different teaching methods, explore pedagogical strategies and receive constructive comments within a controlled teaching environment. This article reports on a pedagogical strategy involving video-based classroom observation, which is also aimed at preparing future language teachers for effective classroom performance and which may, in effect, be combined with micro-teaching practices (cf. Sect. 4).
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 263–270, 2020.
https://doi.org/10.1007/978-3-030-45697-9_26
264 A. R. Luís

1.2 Developing Teacher Reflection Skills


The importance of reflection in teaching dates back to the early 20th century, when teachers were encouraged to reflect on their own experience in order to formulate new strategies that would enhance both teaching and learning [6]. The main assumption was that focusing on specific situations encourages teachers to identify problems and search for solutions, making them more self-aware and critical about their own teaching practice. Currently, reflective thinking is perceived as a path which promotes a collaborative environment and encourages student teachers to jointly participate in a “process of self-examination and self-evaluation” aimed at improving teaching practice [7, 8]. By contrast, teachers unwilling to engage in reflective teaching will plan and teach based on unexamined “common sense” assumptions. Reflective thinking crucially involves analyzing classroom activity in all its richness, including planning and teaching. One effective way of training and developing teacher reflection is through classroom observation, analysis and discussion [9].

1.3 Classroom Observation and Digital Video


Within the context of initial teacher education, classroom observation provides a first
contact with real instances of teaching and learning [10]. For pre-service teachers with
very little teaching experience, there are crucial benefits in observing teacher practice:
students develop an awareness of the pedagogical complexity of teaching and become
more familiar with the practical implications of lesson planning and time management
[4, 11]. Essential analytical and reflexive skills are also trained at the various stages of
the classroom observation process with the help of an observation form which is
organized into relevant pedagogical domains and sub-domains to direct pre-service
teachers’ attention to critical areas of teaching [2, 12].
The use of video in classroom observation has recently drawn significant attention within the teacher education context, as it has been widely shown to increase teachers' attention to classroom interactions [13, 14]. Among the benefits associated with video-based classroom observation is the fact that video captures details which happen simultaneously and may therefore go unnoticed during direct (in-class) observation. In addition, video does not require an immediate response from teachers and promotes repeated viewing from different perspectives. By using video-recorded classes, it is also possible to view the same sequence more than once or to choose which moments to review [15]. There is therefore evidence that asynchronous classroom observation increases both the quantity and the quality of the classroom observation experience [16, 17].
2 Description of the Experiment

In this section we lay out the pedagogical experiment by focusing on the goals, the
target audience and the methodology adopted to evaluate the students’ perception.

2.1 Goals and Audience


The purpose of using video-based classroom observation during the first year of the
initial teacher education MA program was to develop the reflexive capacity of pre-
service teachers, in particular, to train their ability to observe, analyse and discuss the
different components of a language class, such as the transition between the different
stages, the contents, the materials, the methodologies, teacher feedback, teacher-student interaction and time management, among others.
The target audience consisted of students enrolled in the Master’s degree in English
Language Teaching and the Master’s degree in Teaching English and a Foreign
Language at the University of Coimbra, during the 2nd semesters of 2016–2017 and
2017–2018.

2.2 Methodology: Procedure and Instruments


The pedagogical experiment was carefully organized into three stages, namely the pre-
observation stage (cf. (a)), the observation stage (cf. (b)) and the post-observation stage
(cf. (c)) [11]:
a) During the preparatory phase students became familiar with the observation form,
by reviewing concepts, identifying the critical areas and training data collection.
b) The observation phase was aimed at watching video-recorded classes and com-
pleting the observation form. During this stage, the video-recording was occa-
sionally interrupted to review any critical moments which might have been missed
by the pre-service teachers.
c) During the post-observation phase the data collected during the observation phase
was shared and discussed among pre-service teachers. This phase stimulated the
ability to analyse classroom practice and raise awareness of the multidimensional
nature of language teaching.
Two kinds of instruments were used for data collection:
• A classroom observation sheet [12] which was divided into nine pedagogical
domains with each domain being subdivided into further subdomains (cf. Table 1).
This sheet comprised 38 subdomains which covered different aspects of teacher
performance (cf. Table 2).
• Three video-taped English language classes available on the https://celta.wikispaces.com platform, which were accessed several times while the teaching experience took place (cf. Table 3).
Table 1. Domains of classroom observation sheet

Table 2. Sub-domains of classroom observation sheet (partial illustration)

Table 3. Video-recorded English language classes

2.3 Evaluation
The degree of achievement of the experiment was assessed through an open-ended
questionnaire survey which was completed by the students at the end of the year (cf.
Table 4). The research questions underlying the questionnaire are shown below:
• To what extent does video-based classroom observation enhance pre-service
teachers’ awareness of the multidimensional nature of teacher practice within an
English language context?
• What are pre-service teachers' perceptions of the benefits of video-based classroom observation on their ability to improve their future teaching practice?
The open-ended questionnaire (cf. Table 4) was organized as follows: Section 2, containing questions 2.1 to 2.10, was aimed at finding out what pre-service teachers had learned from observing and discussing video-recorded English language classes. Section 3 focused specifically on the students' perception of the benefits of engaging in classroom observation. Section 4 attempted to determine which pedagogical domains had the greatest impact on pre-service teachers.

Table 4. Questionnaire

3 Results and Discussion

In what follows, we report on the results of our qualitative research study, surveying the answers provided by pre-service teachers to our open-ended questionnaire. As mentioned before, the goal of asking students “What did you learn about …?” was to evaluate their awareness of the various teaching domains and their ability to apply previously studied theoretical-didactic content to concrete teaching and learning contexts. Table 5 contains two samples of the students' answers, which reveal that the students were able to link previous methodological course content to detailed and specific aspects of the observed teacher performance.
In line with studies about the role of video in teacher professional development, these results confirm that video-based classroom observation encourages inexperienced teachers to engage confidently in the observation experience. What seems evident is that the development of pedagogical thinking clearly benefits from the flexibility provided by video, which gives students the opportunity to adjust the observation, the analysis and the reflection to their own pace and rhythm [10].
Table 5. Sample of students’ answers to the question “What did you learn about…?”

In response to the question “What is your overall opinion about observing video-taped English classes?”, the replies revealed that pre-service teachers had a very positive perception of engaging in video-based classroom observation. As the sample provided in Table 6 shows, their receptive attitude was determined by their growing awareness of the multiple layers of teacher performance, such as: i) the pedagogical strategies developed by teachers to enhance learning; ii) the activities carried out by teachers for the development of specific language skills; iii) teacher feedback; iv) instructions; v) the management of time and space; vi) body language; and vii) teacher-student interaction, among others.
It is worth noting that pre-service teachers were able to share their analytical and
reflexive skills during their joint observation and discussion. As has been noted in
previous research, video-enhanced teacher reflection does effectively stimulate col-
laborative learning and can contribute to the development of an emerging teacher
identity [10, 16].
Table 6. Sample of students’ answers to the question “What is your overall opinion about
observing video-recorded English classes?”
The last question of the questionnaire (“Is there anything you would like to apply to your own practice …?”) was aimed at identifying the impact the experiment had on students' own practice as future language teachers. A small sample of the students' answers is given in Table 7. The responses further reveal a growing awareness of fundamental classroom-specific issues such as careful lesson planning, the enhancement of oral skills through group and peer work, teacher-student interaction, a balance between more controlled and more autonomous classroom activities, the carefully-planned use of the white/blackboard, error correction and teacher body language, among others.
These findings are also consistent with previous studies, which show that students are willing to adjust their practice to the teaching strategies observed in the video-recorded classes and that classroom observation therefore effectively improves students' own lesson planning and classroom performance [13]. Ultimately, video-based classroom observation promotes sustained teacher reflection and enhances teacher noticing [9, 14].

Table 7. Sample of students’ answers to the question “Is there anything you would like to apply
to your own practice …?”

4 Conclusion

Underlying this experiment was the observation that initial teacher education programs need to be supplemented with opportunities that allow students to become familiar with the dynamics of the classroom and with teaching practice. Classroom observation raises students' awareness of classroom dynamics and of the interaction between the various pedagogical domains. In addition, the use of video allows students to adjust the observation experience to their own needs and feel more encouraged to train their analytical and reflection skills. The sample answers have revealed a very positive perception of video-enhanced classroom observation.
Based on what we have learned, a future line of work would include the video-recording of the students' own classes within a micro-teaching context [5]. Self-observation through video has been shown to significantly raise teachers' self-awareness, giving them the opportunity to reflect in more detail on their own weaknesses and strengths [16]. It also broadens the scope of classroom observation, not only by developing self-assessment skills but also because classroom reflection would be directed at the students' own teaching practice [13]. These and other research questions may also be productively extended to other subject knowledge areas, especially within the context of MA programs in which theory and practice are still poorly articulated [1, 2].
References
1. Korthagen, F.: How teacher education can make a difference. J. Educ. Teach. 36(4), 407–
423 (2010)
2. Korthagen, F., Kessels, J., Koster, B., Lagerwerf, B., Wubbels, T.: Linking Practice and
Theory: The Pedagogy of Realistic Teacher Education. Routledge, London (2001)
3. Wallace, M.: Training Foreign Language Teachers. Cambridge University Press, Cambridge
(1991)
4. Flores, M., Vieira, F., Ferreira, F.: Formação inicial de professores em Portugal: problemas,
desafios e o lugar da prática nos mestrados em ensino pós-Bolonha. In: Borges, M.C.,
Aquino, O.F. (eds.) A formação inicial de professores: olhares e perspectivas nacionais e
internacionais, pp. 61–96. EDUFU, Uberlândia (2014)
5. Allen, D., Eve, A.: Microteaching. Theory Into Pract. 7(5), 181–185 (1968)
6. Dewey, J.: How We Think: A Restatement of the Relation of Reflective Thinking to the
Educative Process. Heath, Boston (1933)
7. Ottesen, E.: Reflection in teacher education. Reflective Pract. Int. Multi. Perspect. 8(1), 31–
46 (2007)
8. Larrivee, B.: Transforming Teaching Practice: Becoming the critically reflective teacher.
Reflective Pract. Int. Multi. Perspect. 1(3), 293–307 (2010)
9. Scida, E., Firdyiwek, Y.: Video reflection in foreign language teacher development. In:
Allen, H.W., Maxim, H.H. (eds.) Issues in Language Program Direction: Educating the
Future Foreign Language Professoriate for the 21st Century, pp. 231–237. Heinle, Boston
(2013)
10. Marsh, B., Mitchell, N.: The role of video in teacher professional development. Teach. Dev.
10, 403–417 (2014)
11. Richards, J., Farrell, T.: Classroom observation in teaching practice. In: Richards, J., Farrell,
T. (eds.) Practice teaching: A Reflective Approach, pp. 90–105. Cambridge University Press,
New York (2011)
12. Thornbury, S., Watkins, P.: The CELTA Course. Trainer’s Manual. Cambridge University
Press, Cambridge (2007)
13. Grant, T., Kline, K.: The impact of video-based lesson analysis on teachers’ thinking and
practice. Teacher Dev. Int. J. Teachers’ Prof. Dev. 14(1), 69–83 (2010)
14. Sherin, M.: New perspectives on the role of video in teacher education. In: Brophy, J. (ed.)
Using video in teacher education, pp. 1–27. Elsevier, London (2004)
15. Sherin, M., Russ, R.: Teacher noticing via video. In: Calandra, B., Rich, P. (eds.) Digital
Video for Teacher Education: Research and Practice. Routledge, New York (2014)
16. Maclean, R., White, S.: Video reflection and the formation of teacher identity in a team of
pre-service and experienced teachers. Reflective Pract. Int. Multi. Perspect. 8(1), 47–60
(2007)
17. Welsch, R., Devlin, P.: Developing preservice teachers’ reflection: examining the use of
video. Action Teacher Educ. 28(4), 53–61 (2012)
Formative Assessment and Digital Tools
in a School Context

Sandra Paiva1,2, Luís Paulo Reis1,3(&) , and Lia Raquel2


1 LIACC - Artificial Intelligence and Computer Science Lab, Porto, Portugal
[email protected]
2 IE/UMinho – Institute of Education, University of Minho, Braga, Portugal
[email protected]
3 FEUP – Faculty of Engineering of the University of Porto, Porto, Portugal
[email protected]

Abstract. Digital tools, particularly so-called apps, are a current topic whose potential in the school context is still little explored. Formative assessment is likewise little explored, despite its particular importance in the context of inclusive education, understood in a broad sense in which all students have specific needs. The literature review at the intersection of the two topics - formative assessment and the use of apps - presents good indicators of how technology can complement formative assessment, help it take deeper root, and support a performance more aligned with inclusive education, taking into account the student's profile upon leaving compulsory schooling. In this context, a descriptive and analytical study was carried out, using a survey on the use of digital tools and formative assessment. The results obtained allow us to conclude that the school environment will have to evolve, above all, at the level of material resources. This evolution is less pressing at the level of human resources and of attitudes that promote the use of formative assessment techniques (FATs) and apps. There is thus an opportunity to improve existing applications so that they better support formative assessment by mitigating its main limitations. As applications evolve, the prospects for wider use of apps and FATs broaden.

Keywords: Education · Formative assessment · Digital tools

1 Introduction

In the school context, the use of digital tools, particularly those commonly called apps, is a current topic whose potential is still little explored. Paradoxically, formative assessment is a longstanding topic, yet it is also little used, even though it may be of prime importance in the context of inclusive education, understood as a broad view in which all students have specific needs. Pacheco [1] states that formative assessment is absent from assessment practices, although the normative framework since 2005 has considered it the main mode of assessment. Consequently, pedagogical differentiation in school curricular practices, from elementary to secondary school, is also absent, with curriculum uniformity prevailing [1]. The assessment of subjects and organizations has been the central axis
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 271–283, 2020.
https://doi.org/10.1007/978-3-030-45697-9_27
272 S. Paiva et al.

in the various attempts to model education in the same way as markets, as there is no
competitiveness without assessment. The processes of differentiation are performed
through external evaluations (exams, intermediate tests, PISA, etc.) and rankings, thus
replacing the logic of internality and formative evaluation (Machado, 2017, p. 337).
The two realities analyzed in this study are, on the one hand, the diversity of education-focused apps and, on the other, the panoply of Formative Assessment Techniques (FATs); both are little known and little used, and this study seeks to enhance the synergies at their intersection. The study also focuses on the reasons behind the limited uptake of Formative Assessment Techniques and on the support that schools provide for the use of Information and Communication Technologies (ICT). This focus extends to teachers' attitudes towards some of the changes that the digital environment brings to schools.
The importance of deepening knowledge about the difficulties of establishing formative assessment stems from the considerable scientific evidence of its potential to support students and teachers in overcoming weaknesses and disabilities, with very positive impacts on motivation, engagement, achievement and autonomy.
Spector et al. [2] highlight that Ecclestone [3], supported by Johnson et al. [4], Narciss [5], Spector [6] and Woolf [7], argued that formative assessment, or assessment for learning, is now considered an integral component of good teaching, student motivation, engagement and higher levels of achievement [2].
Faber and Visscher [8], referring to Konstantopoulos, Miller and Van der Ploeg [9] and to Kluger and DeNisi [10], note that feedback seems to be more effective when assessments are more frequent. Feedback focused on the learning task can be effective, whereas feedback directed at the self as a person is not [10]. They also conclude that, in general, elaborated feedback is more effective for learning than simple feedback, in line with the conclusions of Bangert-Drowns et al. [11] and Shute [12]. Feedback is even more effective when it is combined with the setting of performance goals (Locke and Latham [13]).
Bhagat and Spector [14], in the light of an extended retrospective analysis by Narciss [5] and Spector and Yuen [15] of research conducted over the past 50 years or more on learning, concluded that there are three main findings: (a) time on task predicts learning outcomes, (b) formative feedback tends to improve learning, and (c) prior knowledge influences the learning experience. The second finding is of particular relevance to the present study [14].
Spector et al. [2], referring to Ellis [16], identify as one of the main limitations of formative assessment the difficulty of collecting data on learning interactions and outcomes, as well as of analyzing formative feedback and assessment. With the use of 21st-century technologies these limitations are removed, given the easy access to, and analysis of, performance and evaluation data. The development of key 21st-century skills - critical thinking and problem solving - can also be supported by new technologies. Formative assessments, particularly technology-supported assessment, are also valued in challenging settings (large and multi-class classrooms, problem-based learning) [2].
In addition, timely and informative feedback is recognized as an element capable of improving and accelerating learning (Bransford, Brown and Cocking [17]; Clariana [18]; Epstein et al. [19]; Hannafin [20]; Kluger and DeNisi [10]; Narciss [5]).
Bhagat and Spector [14] indicate that research on formative assessment has mostly addressed the learning of simple tasks, with results focused on simple concepts and procedures. The explosion of new technologies makes this support increasingly effective. What needs further understanding is the best way to support the learning of complex and poorly structured tasks and the best use of new technologies [14].
At the intersection of the two topics - formative assessment and the use of apps - the literature review presents good indicators of how technology can complement formative assessment, help it take deeper root and support a performance more aligned with inclusive education, in view of the student's profile upon leaving compulsory education.
The research question and its sub-questions are: Does the school environment have the material resources, human resources and attitudes that promote the use of formative assessment and of apps for this purpose? (1) What is the position of teachers regarding the characteristics and skills of digital-age students? (2) How do the material and human resources made available at school affect the use of ICT? (3) Which teaching activities are privileged in the use of ICT? (4) How widespread is the use of FATs (Formative Assessment Techniques)? (5) To what extent are apps and mobile devices present in the school environment? (6) To what extent will it be possible to create a New App Rating Model for Formative Assessment?

2 Methodology

The research follows a Mixed Methodology (MM), meaning that two methods coexist: the qualitative and the quantitative. In the qualitative component, content analysis techniques are applied to the few scientific productions within the thematic scope of formative assessment and new technologies. From the identification of patterns and trends in the results of previous studies, the quantitative component is defined. This component applies quantitative techniques to capture research evidence that corroborates or contrasts with the results of the first component.
Day-to-day observation is the starting point of this component: an observation focused on the contradictions and incongruities of everyday life emerging from the reality surrounding formative assessment. It was aimed more specifically at the difficulties of establishing formative assessment in a reality that tends to be dominated by summative assessment, even though that same reality is legally regulated towards an essentially formative assessment [21]. This incongruity has persisted for a long time, at least since 1992, when formative assessment was given a clearer legal definition (Normative Order 98/A/92) [22].
The collection of documentary data provided information on this social phenomenon and allowed the study to continue by directing the quantitative component.
According to Spector et al. [2], one of the main limitations of formative assessment is the difficulty of collecting data on learning interactions and outcomes, as well as of analyzing formative feedback and assessment. For Tsai, Tsai and Lin [23], individualized online learning is crucial for formative assessment, because the feedback provided by online formative assessment is immediate and because the computer allows students to self-assess and improve immediately [23].
Day-to-day observation also revealed another incongruity: on the one hand, the most popular apps do not cover problem solving and other complex learning; on the other hand, the Student Profile Technical Report - 21st Century Competencies undervalues plain knowledge and values metacognitive knowledge and meta-competence. For Ferraz and Belhot [24], meta-knowledge is knowledge used for problem solving and/or for choosing the best method, theory or structure; that is, strategic knowledge.
To understand this phenomenon, data were collected; the most relevant findings are highlighted below. From Spector et al. [2]: given the history of emphasis on formative assessment and the potential of new technologies to extend formative assessment to complex problem-solving areas, the potential for formative assessment to have a greater impact on the development of higher-order learning skills is high [2]. From Bhagat and Spector [14]: research on formative assessment has mostly addressed the learning of simple tasks, with results focused on simple concepts and procedures; the explosion of new technologies makes this support increasingly effective, and what needs further understanding is how best to support the learning of complex and poorly structured tasks and how best to use new technologies [14].
The quantitative component was based on data collection through the use of a
questionnaire on the use of digital tools and formative assessment to a random sample
of active teachers from different levels of education (1st cycle to higher education).
It should be noted that no more specific criteria were defined according to the project topic or research questions; that is, experience in applying formative assessment or in using mobile devices and apps was not defined as a selection requirement (Creswell [26]).
In the questionnaire we sought, above all, information related to facts, opinions and attitudes, organized into several sections: the profile of teachers using ICT; formative and summative assessment; and apps used in education. The survey was based on two relevant projects: the Acer-European Schoolnet Pilot Project (2013) [27] and the project Teachers' Perceptions of the Digital Transformation of the Classroom through the Use of Tablets: A Study in Spain (2016) [28].

3 Questionnaire
3.1 Profile of ICT Teachers
The survey was randomly applied to forty-six teachers from different cycles and levels of education and training. Participation was dominated by teachers of the 1st cycle
Formative Assessment and Digital Tools in a School Context 275

of basic education - 55.8%, followed by a participation of 23.3% of teachers of the 2nd


cycle of basic education. With less significant participation, the other educational groups were represented as follows: 3rd cycle of basic education - 9.3%; secondary education - 7%; higher education - 2.3%. There was also the participation of one teacher who teaches from the 2nd cycle to secondary education, with an expression of 2.3%.
The professional experience of the participating teachers ranges from 12 to 42 years of service. The most represented intervals, with their percentage values, are: between 12 and 19 years - 39%; and between 20 and 29 years - 39%. These are followed by the interval between 30 and 39 years, with an expression of 15%. Finally, 7% of the sample are teachers with between 40 and 42 years of service.
The subjects and subject areas indicated by the teachers coincide; for this reason, questions 3 and 4 are analyzed together. Since the 1st cycle of basic education operates on a single-teacher basis, the subjects of this level of education were not included. The teachers who participated in this survey, from the 2nd cycle of basic education to higher education, teach mainly the following subjects: Mathematics (25%); Natural Sciences (17%); Portuguese (16%); English
(14%). Fewer respondents teach the subjects of: History and Geography of Portugal
(5%) and Biology and Geology (5%). Finally, a set of subjects is taught by a smaller number of respondents: STEM (3%); Robotics (3%); ICT and Computer Applications (3%); Citizenship and Development (3%); Technological Education (3%); Physical Education (3%).
The vast majority of teachers - 90.9% - perceive that it is necessary to guide students towards a deeper and more meaningful use of technology: 43.5% totally agree with this statement and 47.8% agree. A further 6.8% of teachers are indifferent, 2.3% disagree and 0% totally disagree. Figure 1 shows these conclusions.

Fig. 1. Teachers' degree of agreement on the need to guide students towards a deeper and more meaningful use of technology.

Teachers' positive perceptions of digital-age students' cognitive, perceptive, sensory, and motor transformations are less consistent than the previous perceptions, as shown in the graph in Fig. 2. Although a significant percentage (68.2%) agree, there are nonetheless 0% of teachers who fully agree. This optimistic view leaves 11.4% of teachers indifferent. It is seen as a negative change by 15.9% of teachers and as an ambivalent change, i.e. both negative and positive, by 2.3%. Additionally, 2.3% of teachers doubt this change.

Fig. 2. Teachers' degree of agreement on students' cognitive, perceptive, sensory, and motor transformations.

The level of support provided by the school for the use of ICT varies depending on whether interactive whiteboards or mobile devices are concerned, in terms of both provision and maintenance. These conclusions are expressed in Fig. 3.
Regarding the use of the interactive whiteboard, the percentage of teachers who
agree or disagree with the support provided is close, 45% and 46%, respectively. Only
9% of teachers are indifferent to this support. Regarding the use of mobile devices, the
distribution of opinions is less balanced, as 61% of teachers disagree with the existence
of the support provided and 22% agree and only 17% show indifference on the subject.
Regarding the support provided through training workshops, only 33% of respondents disagree with its existence, as opposed to 50% who agree; 17% of teachers remain indifferent to this issue. The opinion of respondents on the availability of debates is different: 52% disagree, 31% agree and 17% are indifferent.

Fig. 3. Evidence of school support for ICT use



Regarding the ICT training received (on the use of the Internet and general applications, the pedagogical use of ICT, or related devices and equipment), we found that 39% of respondents indicate that they receive it very often or often. This percentage somewhat contrasts with the 50% of respondents who, in the previous question, partially or fully agreed that their school offers training workshops. 24% of teachers admit that they rarely or never receive this training and 37% receive it occasionally. These data are shown in the graph of Fig. 4.

Fig. 4. Evidence of school support for ICT training

The answers regarding training received through debate communities were close to those given to question 7: 57% indicated that they never or rarely participated in debate communities and 28% participated occasionally.

Fig. 5. Frequency of ICT use for different purposes

As for the use of ICT for different purposes, performing administrative tasks leads: 100% of respondents reported this use as Very often or Often, followed by use for following up classes and assessment (87%), for planning and teaching (85%) and, finally, for communicating with parents (66%). The remaining values, corresponding to occasional or rare use and total non-use, have a low expression. This use of ICT is shown in the graph of Fig. 5.

Translating this use into years, for the different purposes, we find that the three purposes most indicated in question nine also correspond to the longest time intervals indicated by the respondents. The use of ICT for planning and teaching is most widespread at 6 to 10 years - 22%, 11 to 15 years - 24% and 16 to 20 years - 46%. For performing administrative tasks, ICT is most commonly used in the following time frames: 6 to 10 years - 24%, 11 to 15 years - 26% and 16 to 20 years - 41%. For following up classes and for assessment, the same intervals have the following percentage expression: 28%, 35% and 30%, respectively. Finally, for the purpose for which ICT is least used - communicating with parents - the most marked time intervals are only the following: 6 to 10 years (33%) and 11 to 15 years (28%). It is interesting to add that the 0-year interval of the scale was indicated only for this last purpose, with a weight of 11%.

3.2 Formative and Summative Assessment


We proceed to analyze, from the data of the graph in Fig. 6, the teachers' level of agreement with the following quotation: formative evaluation can sometimes have less emphasis and support, given the excessive emphasis placed on summative evaluations, such as grades, standardized test results, comparative rankings, and annual performance ratings (Spector et al. [2]).
We found that the vast majority of respondents take a favorable position towards the quotation: 69.6% agree or totally agree. In contrast, 30.4% of teachers disagree or strongly disagree. Note that no respondent was indifferent to this quotation.

Fig. 6. Teachers' level of agreement with the statement that formative assessment can sometimes be less emphasized and supported, given the excessive emphasis on summative assessment.

The survey of twelve formative assessment techniques (FATs) points to limited knowledge. For eight of the techniques, 78% of the teachers report only some, reduced or no knowledge. For the remaining four techniques, about 60% of teachers report better knowledge - high or good. Of these, the first two techniques - Constructive Mini-Tests and Filling in Text Gaps - are better known than the last two - Learning Portfolio and Student Logbook.
The number of reasons presented as determinant for the lesser emphasis given to formative assessment was 60, as shown in the graph in Fig. 7. Of this universe, 20% of respondents attribute the cause to the overvaluation of external evaluation, whether by legislation, supervision, teachers or families, in the form of examinations, rankings and scoring tests. We highlight one respondent who assigns responsibility to the media for the excessive importance given to school rankings.
The cause is also attributed to the overvaluation of summative assessment, with an expression of 18%. It is relevant to transcribe one of the answers: "the existence of summative assessment".
The undervaluation of formative assessment and the indication of its limitations accounts for 32% in this survey. Some of the limitations mentioned are: the laborious process; the difficulty of registering and drawing up evaluation grids; the inaccuracy of the assessment; the difficult control of variables; the absence of classification. Limitations were also mentioned by comparison with the faster and more accurate summative assessment. This underappreciation of formative assessment is also regarded as a tradition. Interestingly, formative assessment was referred to as "less tangible".
Another reason, mentioned several times and indicated by 13% of respondents, is the short time available for the implementation of formative assessment.
With lower percentage expression, the following reasons were presented: insufficient training - 5%; legislative determination - 5%; extensive curricula - 5%; high number of students - 2%; and the nature of practical subjects, which values other means of evaluation. Finally, 3% of respondents did not identify any reason.
We conclude that the most mentioned reason, among all, is the undervaluation of formative assessment and its limitations.
The frequency of use of FATs was also measured. Focusing on the techniques best known to teachers, we find that, in general, frequency of use (Very Frequently and Frequently) is lower than knowledge (High Knowledge and Good Knowledge), a decrease in the order of 10% to 19%. Thus, the knowledge and frequency of use of each technique have the following values: Constructive mini-tests - 57%, 51%; Filling in text gaps - 85%, 68%; Sorted queues - 43%, 32%; Graphic organizers - 37%, 27%; Learning portfolio - 55%, 45%; Student logbook - 50%, 31%, as shown in Fig. 7.
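The gap between knowledge and use can be computed directly from these pairs. This is an illustrative sketch only; the paired values are those listed in the text and the dictionary name is ours:

```python
# (High + Good knowledge %, Very Frequent + Frequent use %) for each FAT,
# as reported in the survey results above
fats = {
    "Constructive mini-tests": (57, 51),
    "Filling in text gaps": (85, 68),
    "Sorted queues": (43, 32),
    "Graphic organizers": (37, 27),
    "Learning portfolio": (55, 45),
    "Student logbook": (50, 31),
}

# Difference in percentage points between reported knowledge and reported use
for name, (knowledge, use) in fats.items():
    print(f"{name}: knowledge {knowledge}%, use {use}%, gap {knowledge - use} points")
```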

Fig. 7. Frequency of use of FATs

Finally, our attention was directed to the use of apps in education, primarily the use of ICT in teaching activities, i.e. the use of interactive whiteboards, mobile devices, publisher software, apps and others.
From the graph data in Fig. 8, we find that a significant percentage of respondents never use ICT in teaching activities: interactive whiteboard - 35%; mobile devices - 41%; publisher software - 20%; apps - 33%; others - 28%. Possibly these values are related to the conclusions drawn from the answers to question 7. It therefore seems logical that the 35% who never use interactive whiteboards is close to the 46% of teachers who agree that they do not feel supported by the school in the provision and maintenance of interactive whiteboards. It also seems consistent that mobile devices have the highest percentage of non-use (41%) and also the highest percentage of disagreement (61%) regarding the level of support that the school provides.
We conclude that there is a balance between the number of teachers who use the interactive whiteboard, mobile devices, apps and others Very Frequently or Frequently, and those who use them occasionally or infrequently. The values are, respectively: 35% and 30% - interactive whiteboard; 31% and 28% - mobile devices; 30% and 37% - apps; 35% and 37% - others. The values for publisher software differ from this balance, as 67% of respondents indicate that they use it very often or frequently and only 13% use it occasionally or rarely.

Fig. 8. Teaching activities developed using ICT

We conclude that publisher software is by far the most commonly used ICT in class, compared with the interactive whiteboard, mobile devices, apps and others.
The translation of the use of mobile devices into class time corresponds to the following values: 0% of the time - 41.3%; 20% of the time - 41.3%; 40% of the time - 10.9%; 60% of the time - 2.2%; 80% of the time - 2.2%; 100% of the time - 0%; indefinite time depending on planning - 2.2%. The evidence is that their use is low, as the vast majority of teachers - 82.6% - never use mobile devices or use them only 20% of the time.

There seems to be a direct relationship between the 41.3% corresponding to 0% usage and the 41% of teachers who never use mobile devices (a conclusion from the answers to question 15). However, there seems to be some contradiction between the percentage of teachers who use mobile devices 20% to 40% of the time (52.2%) and the percentage of teachers who use mobile devices Very Frequently or Frequently (31%).

3.3 Apps Used in Education


Regarding the level of knowledge of the different apps, there is clear evidence that the vast majority of apps recommended for educational use are unknown to teachers, which somewhat contradicts the significant number of teachers who, in question 15, report using apps Very Frequently or Frequently (30%).
From the 25 apps presented, we identified 19 whose percentage of No knowledge is high, between 63% and 72%. Below these are 4 apps with a percentage of No knowledge between 50% and 57%. Two entries stand out - Google Drive, and Skype/Google Hangouts/Viber - because their highest percentages fall at the higher levels of the knowledge scale, High knowledge and Good knowledge. Google Drive has the following percentages: 41% and 28%. Skype, Google Hangouts and Viber have: 26% and 15%.
We made a comparative analysis between the level of knowledge of the apps and their frequency of use in class, focusing only on the apps mentioned above. We found that, of the 25 apps, 19 presented percentages between 86% and 87% for the frequency of use Never; these correspond to the same apps that presented, in question 17, percentages between 63% and 72% for No knowledge. Below these are 4 apps with percentages between 71% and 86% for the frequency of use Never; these correspond to the apps with percentages between 50% and 57% for No knowledge. Google Drive and Skype/Google Hangouts/Viber, the apps highlighted previously, have significantly lower usage percentages compared to knowledge. Thus, Google Drive has 15% (Very often) and 15% (Frequently), against the previous 41% (High knowledge) and 28% (Good knowledge). Skype, Google Hangouts and Viber report 6% (Very often) and 13% (Often), as opposed to their previously reported knowledge levels (26% and 15%).

4 Conclusions and Future Work

This study shows that digital tools, with an emphasis on applications, are a current
topic whose potential, in the school context, is still little explored. In addition, for-
mative assessment is also little explored, but it is of particular importance in the context
of inclusive education. The literature review, made by cross-referencing these two topics, shows good indicators of how technology can complement formative assessment. The descriptive and analytical study made it possible to conclude that the school
environment should evolve, above all, at the level of material resources. This evolution is less pressing in terms of human resources and of attitudes that promote the use of assessment and training applications. Thus, there is an opportunity to improve existing applications, in order to allow greater assistance for formative assessment, enabling its use in a broader and deeper way. Future work will be related to the extension of this study to other educational cycles and to an increase in the number of applications analyzed.
Finally, we present the answers to two research sub-questions.
How ingrained is the use of FATs? Most of the twelve selected FATs are unknown to teachers or known only slightly, on average 78%. The reasons for this reduced use are both intrinsic to formative assessment and extrinsic. As already mentioned, Spector et al. [2] list several intrinsic reasons, seen as its main limitations: the difficulty of collecting learning interaction data and results, and also of analysing formative feedback and assessment. The main extrinsic reason may be the excessive emphasis given to summative assessment [2]. We conclude that many of the indicated limitations of formative assessment may be alleviated in individualized online learning, given the functionalities referred to in [8]: feedback is immediate, and students self-assess and improve immediately.
To what extent is the presence of apps and mobile devices felt in the school environment? We found that the vast majority of teachers (82.6%) never use mobile devices or use them only 20% of the time. We also verified that, of the 25 apps, 19 presented percentages between 86% and 87% for the frequency of use "Never" and between 63% and 72% for "No knowledge". These conclusions are in line with those of Bhagat and Spector [14]: there is not enough research on formative assessment beyond the learning of simple tasks, with results aimed at simple concepts and procedures. The explosion of new technologies makes this support increasingly effective.

Acknowledgements. This work was supported by Project FCT/UID/CEC/0027/2020 – LIACC: Artificial Intelligence and Computer Science of the University of Porto.

References
1. Pacheco, J.: A avaliação das aprendizagens: para além dos resultados in Revista Portuguesa
de Pedagogia, p. 261 (2006)
2. Spector, J., et al.: Technology-enhanced formative assessment for 21st century learning.
Educ. Technol. Soc. 19(3), 58–71 (2016)
3. Ecclestone, K.: Transforming Formative Assessment in Lifelong Learning. McGraw-Hill
Education, Berkshire (2010)
4. Johnson, L., Adams Becker, S., Cummins, M., Estrada, V., Freeman, A., Hall, C.: NMC Horizon Report: Higher Education Edition. New Media Consortium, Austin (2016)
5. Narciss, S.: Feedback strategies for interactive learning tasks. In: Spector, J.M., Merrill, M.
D., van Merriënboer, J.J.G., Driscoll, M.P. (eds.) Handbook of Research on Educational
Communications and Technology, 3rd edn., pp. 125–144 (2008)
6. Spector, J.M.: Foundations of Educational Technology: Integrative Approaches and
Interdisciplinary Perspectives, 2nd edn. Routledge, New York (2015)
7. Woolf, B.P.: A Roadmap for Education Technology. The National Science Foundation, Washington, DC (2010). http://cra.org/ccc/wp-content/uploads/sites/2/2015/08/GROE-Roadmap-for-Education-Technology-Final-Report.pdf

8. Faber, J., Visscher, A.: The effects of a digital formative assessment tool on spelling achievement: results of a randomized experiment (2018). https://www.sciencedirect.com/science/article/pii/S0360131518300617
9. Konstantopoulos, S., Miller, S.R., van der Ploeg, A.: The impact of Indiana’s system of
interim assessments on mathematics and reading achievement. Educ. Eval. Policy Anal. 35
(4), 481–499 (2013). https://doi.org/10.3102/0162373713498930
10. Kluger, A.N., DeNisi, A.: The effects of feedback interventions on performance: a historical
review, a meta-analysis, and a preliminary feedback intervention theory. Psychol. Bull. 119
(2), 254–284 (1996). https://doi.org/10.1037/0033-2909.119.2.254
11. Bangert-Drowns, R.L., Kulik, C.-L.C., Kulik, J.A., Morgan, M.: The instructional effect of
feedback in test-like events. Rev. Educ. Res. 61(2), 213–238 (1991)
12. Shute, V.J.: Focus on formative feedback. Rev. Educ. Res. 78(1), 153–189 (2008). https://doi.org/10.3102/0034654307313795
13. Locke, E.A., Latham, G.P.: Building a practically useful theory of goal setting and task
motivation: a 35-year odyssey. Am. Psychol. 57(9), 705–717 (2002). https://doi.org/10.1037/0003-066X.57.9.705
14. Bhagat, K., Spector, J.: Formative assessment in complex problem-solving domains: the
emerging role of assessment technologies. Educ. Technol. Soc. 20(4), 312–317 (2017)
15. Spector, J.M., Yuen, H.K.: Educational Technology Program and Project Evaluation.
Routledge, New York (2016)
16. Ellis, C.: Broadening the scope and increasing the usefulness of learning analytics: the case
for assessment analytics. Br. J. Educ. Technol. 44(4), 662–664 (2013)
17. Bransford, J.D., Brown, A.L., Cocking, R.R.: How People Learn: Brain, Mind Experience,
and School (expanded edition). National Academies Press, Washington, DC (2000)
18. Clariana, R.B.: A comparison-until-correct feedback and knowledge-of-correct response
feedback under two conditions of contextualization. J. Comput.-Based Instr. 17(4), 125–129
(1990)
19. Epstein, M.L., et al.: Immediate feedback assessment technique promotes learning and
corrects inaccurate first responses. Psychol. Rec. 52, 187–201 (2002)
20. Hannafin, M.J.: The effects of systemized feedback on learning in natural classroom settings.
J. Educ. Res. 7(3), 22–29 (1982)
21. PT Decree-Law no. 17/2016 of 4 April gives an eminently formative dimension to the
evaluation
22. Normative Order 98/A/92
23. Tsai, F.-H., Tsai, C.-C., Lin, K.-Y.: The evaluation of different gaming modes and feedback
types on game-based formative assessment in an online learning environment, Elsevier,
p. 260 (2014)
24. Ferraz, A.P.C.M., Belhot, R.V.: Taxonomia de Bloom: revisão teórica e apresentação das
adequações do instrumento para definição de objetivos instrucionais, Scielo, p. 428 (2010)
25. Faria, E., Rodrigues, I., Perdigão, R., Ferreira, S.: Perfil do aluno - competências para o
século XXI, relatório técnico, Conselho Nacional de Educação, p. 7 (2017)
26. Creswell, J.: Projeto de pesquisa: métodos qualitativo, quantitativo e misto. Porto Alegre:
Artmed, p. 189 (2010)
27. Balanskat, A.: Introdução de Tablets nas Escolas: Avaliação do Projeto-Piloto de Tablets Acer-European Schoolnet, pp. 1–8 (2013)
28. Suárez-Guerrero, C., Lloret-Catalá, C., Mengual-Andrés, S.: Teachers' perceptions of the digital transformation of the classroom through the use of tablets: a study in Spain. Comunicar XXIV(49), 81–89 (2016)
Information Technologies in
Radiocommunications
Compact Slotted Planar Inverted-F Antenna:
Design Principle and Preliminary Results

Sandra Costanzo(&) and Adil Masoud Qureshi

Università della Calabria, Rende, CS, Italy


[email protected]

Abstract. A new slot configuration for a compact Planar Inverted-F Antenna is


presented. The proposed structure enables a size reduction up to 25%, as
compared to the standard PIFA design. A parametric analysis of the design
parameters is presented to characterize the behavior of slotted PIFA, and a
T-shaped ground plane technique is adopted to enhance the operation bandwidth
in the ISM-band (2.4–2.5 GHz), thus being a useful candidate for wearable
applications.

Keywords: PIFA · Miniaturization · ISM-band · Wearable

1 Introduction

The Planar Inverted-F Antenna (PIFA) is one of the most popular configurations in consumer electronics [1, 2]. It is widely used due to its compact size and desirable radiation features. The PIFA provides a relatively high gain for electrically small antennas, and it is able to conform to SAR regulations [3, 4]. However, with the never-ending miniaturization of mobile devices, there is a need for ever smaller antennas [5].
Since the PIFA is usually integrated on printed circuit boards, the most straightforward miniaturization method is to adopt a substrate material with high permittivity. However, this leads to higher losses, resulting in lower gain and reduced radiation efficiency [6]. Loading the PIFA with a capacitive or resistive impedance can also reduce its size; unfortunately, these techniques suffer from similar drawbacks [7]. Modern methods to reduce the PIFA footprint include the use of metamaterial ground planes or superstrates [8–10], but these specialized 3D structures can imply very high manufacturing costs, as well as limiting the type of substrate materials which can be used.
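As a rough illustration of why a high-permittivity substrate shrinks the antenna, the quarter-wave resonant length of a shorted patch scales approximately as 1/√εr. This is a first-order textbook estimate only; the function name and the sample εr values are ours, and a real PIFA stack would require full-wave simulation:

```python
from math import sqrt

C = 3e8  # speed of light in vacuum, m/s

def quarter_wave_length_mm(f_hz, eps_r):
    """Approximate quarter-wave resonant length, in millimetres, of a
    shorted patch in a medium of relative permittivity eps_r."""
    return C / (4 * f_hz * sqrt(eps_r)) * 1e3

f = 2.45e9  # mid ISM band
for eps_r in (1.0, 2.2, 4.4, 10.2):
    print(f"eps_r = {eps_r}: L ~ {quarter_wave_length_mm(f, eps_r):.1f} mm")
```

Doubling εr from 2.2 to 4.4, for instance, shortens the resonant length by a factor of √2, which is precisely the size-for-loss trade-off described above.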
In this paper, the authors propose a new slot configuration able to reduce the resonant frequency of the PIFA while maintaining its physical size, which is equivalent to miniaturization [11]. The adoption of slots to modify the resonant wavelength of a PIFA is not completely new [12]; however, the majority of existing designs can be classified as meandered antennas [7], since they rely on increasing the length of the current path between the short-circuit and open-circuit ends of the radiating element. Meandered designs often result in degraded radiation patterns and lower efficiencies, due to the zig-zag current flow [13, 14]. The method presented in this contribution does not depend on meandering. A parametric analysis of the design is presented to explain its behavior. A side-by-side comparison of the standard

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 287–292, 2020.
https://doi.org/10.1007/978-3-030-45697-9_28
288 S. Costanzo and A. M. Qureshi

PIFA with an enhanced-bandwidth, ISM-band variant of the proposed design is also presented. The proposed slotted antenna provides almost identical radiation features with an electrically smaller size.

2 Slotted Design and Parametric Analysis

A conventional square PIFA [15, 16], shown in Fig. 1(a), is adopted as the starting point for the present slotted design. Two identical rectangular slots are introduced in the radiating element (Fig. 1(b)) to create a 'notch' (Fig. 1(c)) along the diagonal of the square patch. As a result of these modifications, the slotted PIFA includes three new design parameters, namely the width 'W' and the depth 'D' of the slots, as well as the minimum width of the notch 'B'. Each of these parameters is varied to examine its effect on the resonant frequency and the impedance bandwidth. The basic parameters of the PIFA, such as the size of the radiating element, the ground plane and the shorting plate, are left unchanged in the parametric analysis. Since the input impedance of a PIFA is highly sensitive to the feed position, it has to be optimized for each variation. However, in order to limit the effect of the feed position on the resonant frequency, the feed location is constrained to be within ±1 mm of the original location.

[Figure: sketches (a)-(c) of the PIFA evolution, annotated with the 17.5 mm square radiating element, 6 mm shorting plate, probe feed, ground plane, slot width W, slot depth D, and notch width B.]

Fig. 1. Evolution of the slotted PIFA and its design parameters.

Figure 2 shows the simulated return loss for slot widths 'W' varying from 2.5 mm up to 6.5 mm. The resonant frequency ranges from 2.35 GHz (at W = 2.5 mm) down to 2.315 GHz (at W = 3.5 mm), with all other values falling inside this range. Thus, the overall variation in the resonant frequency of the antenna is less than 1.5%, despite a nearly threefold change in the width of the slots. Furthermore, as the resonant frequency does not increase or decrease monotonically with the width, the corresponding relationship is not straightforward.
Compact Slotted Planar Inverted-F Antenna 289

Fig. 2. Simulated return loss of the slotted PIFA design for different slot widths

Figure 3 shows the behavior of the slotted PIFA as the slot depth 'D' is varied from 4 mm up to 9 mm. Again, the resonant frequency remains almost unchanged; the small variations that exist do not seem to follow a discernible pattern. The highest resonant frequency (2.35 GHz) is recorded for the smallest depth (D = 4 mm), while the lowest resonant frequency (2.315 GHz) is observed at a depth of 7 mm. It is clear from Figs. 2 and 3 that the size and position of the slots are not directly related to the miniaturization.

Fig. 3. Simulated return loss of the slotted PIFA design for different slot depths.

Figure 4 shows the return loss of the slotted antenna for different sizes of the notch.
The minimum width ‘B’ is strongly coupled with the resonant frequency of the
antenna. As the notch is constricted, the resonant frequency of the antenna is reduced,
thus giving a miniaturization effect. In particular, at a value B = 0.7 mm, the slotted
antenna is 25% smaller than the square PIFA design (Fig. 4).

Fig. 4. Simulated return loss of the slotted PIFA design for different notch widths, compared
with the simple square PIFA.

Based on the above parametric analysis, it is evident that the minimum width 'B' of the notch determines the resonant frequency of the slotted PIFA. The size and position of the slots are irrelevant, as long as the width of the notch is preserved. It may also be observed from Fig. 4 that miniaturization is achieved at the cost of bandwidth: as the resonant frequency is reduced, the usable bandwidth also becomes smaller. However, the loss of bandwidth can be easily compensated by specific enhancement methods, as demonstrated in the following section.

3 ISM-Band Slotted PIFA

After the preliminary parametric analysis, a slotted PIFA design optimized for operation in the Industrial, Scientific and Medical (ISM) band (2.4–2.5 GHz) is simulated. The ISM band is commonly used by consumer electronics employing WiFi and Bluetooth standards for communication. The overall size of the antenna is identical to the square PIFA described earlier (Fig. 1). The square PIFA has a resonant frequency of 2.82 GHz, whereas the slotted PIFA resonates at 2.41 GHz. However, the impedance bandwidth of the slotted PIFA is much smaller (less than 10%), while the square PIFA has a bandwidth of 20% (Fig. 5). To improve the bandwidth of the proposed slotted PIFA, a T-shaped ground plane modification is introduced [17], with the resulting design exhibiting an impedance bandwidth of over 19.5% (Fig. 5).
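The bandwidth figures quoted here are fractional (percentage) bandwidths; for reference, covering the ISM band itself requires only about 4%. The sketch below takes the band edges as inputs (the 10%/20% values in the text come from the simulated return-loss curves, not from this formula):

```python
def fractional_bandwidth(f_low_hz, f_high_hz):
    """Fractional bandwidth: (f_high - f_low) / f_center."""
    f_center = (f_low_hz + f_high_hz) / 2
    return (f_high_hz - f_low_hz) / f_center

# The 2.4-2.5 GHz ISM band expressed as a fractional bandwidth
print(f"ISM band: {fractional_bandwidth(2.4e9, 2.5e9):.1%}")
```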
A comparison of the radiation patterns of the two PIFA designs is shown in Fig. 6.
The slotted PIFA, despite being 15% smaller (electrical size), has an almost identical
radiation pattern. The antenna provides linearly polarized radiation with a peak gain
of around 3.2 dBi.
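As an aside, the fractional impedance bandwidths quoted above follow the usual definition: the band width between the frequencies where |S11| crosses the −10 dB line, divided by the center frequency. A minimal sketch, with hypothetical band edges chosen only for illustration (they are not reported in the paper):

```python
def fractional_bandwidth_percent(f_low, f_high):
    """Fractional impedance bandwidth (%) from the band-edge frequencies,
    typically where |S11| crosses the -10 dB line."""
    f_center = 0.5 * (f_low + f_high)
    return 100.0 * (f_high - f_low) / f_center

# Hypothetical -10 dB band edges (GHz) around the 2.41 GHz resonance:
print(round(fractional_bandwidth_percent(2.17, 2.64), 1))  # → 19.5
```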
Compact Slotted Planar Inverted-F Antenna 291

Fig. 5. Return loss comparison of the slotted PIFA, slotted PIFA with T-shaped ground plane
(inset) and the Square PIFA.

Fig. 6. Co-polar (solid) and Cross-polar (dashed) radiation patterns of the slotted compact PIFA
and the classical square PIFA.

4 Conclusion

A new slot configuration for microstrip PIFA miniaturization has been demonstrated.
The technique has been shown to reduce the resonant frequency of the PIFA, which is
equivalent to a reduction in the antenna size. The gain and the impedance bandwidth of
the compact slotted PIFA, designed for the ISM band, are comparable to those of the
standard PIFA design. The proposed architecture is particularly useful for portable and
wearable electronics.

References
1. Fujimoto, K. (ed.): Mobile Antenna Systems Handbook, 3rd edn. Artech House, Boston
(2008)
2. Young, P.R., Aanandan, C.K., Mathew, T., Krishna, D.D.: Wearable antennas and systems.
Int. J. Antennas Propag. 2012, 1–2 (2012)
3. Rais, N.H.M., Soh, P.J., Malek, F., Ahmad, S., Hashim, N.B.M., Hall, P.S.: A review of
wearable antenna. In: 2009 Loughborough Antennas & Propagation Conference, Lough-
borough, pp. 225–228. IEEE (2009)
4. Rogier, H.: Textile antenna systems: design, fabrication, and characterization. In: Tao, X.
(ed.) Handbook of Smart Textiles, pp. 1–21. Springer, Singapore (2015)
5. Nepa, P., Rogier, H.: Wearable antennas for off-body radio links at VHF and UHF Bands:
challenges, the state of the art, and future trends below 1 GHz. IEEE Antennas Propag. Mag.
57(5), 30–52 (2015)
6. Lo, T.K., Hwang, Y.: Bandwidth enhancement of PIFA loaded with very high
permittivity material using FDTD. In: IEEE Antennas and Propagation Society International
Symposium, 1998 Digest. Antennas: Gateways to the Global Network. Held in conjunction
with: USNC/URSI National Radio Science Meeting, vol. 2, pp. 798–801 (1998). (Cat.
No.98CH36
7. Waterhouse, R.B. (ed.): Printed Antennas for Wireless Communications. Wiley, Chichester
(2007)
8. Hall, P.S., Hao, Y. (eds.): Antennas and Propagation for Body-Centric Wireless
Communications. Artech House Antennas and Propagation Library. Artech House, Boston
(2006)
9. Gao, G., Hu, B., Wang, S., Yang, C.: Wearable planar inverted-F antenna with stable
characteristic and low specific absorption rate. Microwave Opt. Technol. Lett. 60(4), 876–
882 (2018)
10. Gao, G., Yang, C., Hu, B., Zhang, R., Wang, S.: A wearable PIFA with an all-textile
metasurface for 5 GHz WBAN applications. IEEE Antennas Wireless Propag. Lett. 18(2),
288–292 (2019)
11. Costanzo, S., Venneri, F.: Miniaturized fractal reflectarray element using fixed-size patch.
IEEE Antennas Wireless Propag. Lett. 13, 1437–1440 (2014)
12. Wong, K.L.: Planar Antennas for Wireless Communications. Wiley, Hoboken (2003)
13. Rothwell, E.J., Ouedraogo, R.O.: Antenna miniaturization: definitions, concepts, and a
review with emphasis on metamaterials. J. Electromagn. Waves Appl. 28, 2089–2123
(2014). https://doi.org/10.1080/09205071.2014.972470
14. Fallahpour, M., Zoughi, R.: Antenna miniaturization techniques: a review of topology- and
material-based methods. IEEE Antennas Propag. Mag. 60, 38–50 (2018). https://doi.org/10.1109/MAP.2017.2774138
15. Taga, T., Tsunekawa, K.: Performance analysis of a built-in planar inverted-F antenna for
800 MHz band portable radio units. IEEE J. Sel. Areas Commun. 5, 921–929 (1987). https://doi.org/10.1109/JSAC.1987.1146593
16. PIFA - Planar Inverted-F Antennas. http://www.antenna-theory.com/antennas/patches/pifa.php
17. Wang, F., Du, Z., Wang, Q., Gong, K.: Enhanced-bandwidth PIFA with T-shaped ground
plane. Electron. Lett. 40, 1504–1505 (2004). https://doi.org/10.1049/el:20046055
Technologies for Biomedical
Applications
Statistical Analysis to Control Foot
Temperature for Diabetic People

José Torreblanca González1, Alfonso Martı́nez Nova2, A. H. Encinas1,
Jesús Martı́n-Vaquero1(B), and A. Queiruga-Dios1

1 University of Salamanca, E37008 Salamanca, Spain
{torre,ascen,jesmarva,queiruga}@usal.es
2 University of Extremadura, Avda. Virgen del Puerto 2, 10600 Plasencia, Spain
[email protected]

Abstract. Diabetic foot is one of the main complications of diabetes
worldwide. Its symptoms and problems get worse over time and
may include numbness, tingling, loss of sensation and pain in the limbs.
Our goal is to develop a smart sock able to monitor measures such as
temperature or humidity and tell the patient whether ulcers may appear.
As one of the first steps, we want to: (i) study the best sensors to take the
temperature in the feet, (ii) give well-founded reasons about the number of
regions of interest (ROIs) necessary to perform a good screening of the
diabetic foot, and optimize the study by eliminating areas that offer
redundant results. In this work, first, we analyze the different sensors
in the scientific literature. As a consequence, we find that it could be
complicated to place many sensors in each sock. Therefore, in the
second part, we derive a statistical analysis of the best 4-5
points where sensors should be placed.

Keywords: Diabetic patients · Foot temperature · Statistical analysis

1 Introduction
Diabetes, often referred to by doctors as diabetes mellitus (DM), describes a
group of metabolic diseases in which the person has high blood glucose (blood
sugar), either because insulin production is inadequate, or because the body’s
cells do not respond properly to insulin, or both. Patients with high blood sugar
will typically experience polyuria (frequent urination), they will become increas-
ingly thirsty (polydipsia) and hungry (polyphagia). If left untreated, diabetes can
cause many complications. Acute complications can include diabetic ketoacido-
sis, hyperosmolar hyperglycemic state, or death. Serious long-term complications
include cardiovascular disease, stroke, chronic kidney disease, foot ulcers, and
damage to the eyes.
Several studies report different figures for the number of people with diabetes:
between 382 million people (in 2013) and 422 million people around the world
c The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 295–306, 2020.
https://doi.org/10.1007/978-3-030-45697-9_29
296 J. Torreblanca González et al.

(in 2014, according to the World Health Organization). And, in 2017, there
were 425 million people with diabetes [3]. This represents 8.3–8.5% of the adult
population [13] (in 1980 it was around 4.7%), with equal rates in both women
and men [15]. As of 2014, trends suggested the rate would continue to rise [2].
Diabetes at least doubles a person’s risk of early death. From 2012 to 2015,
approximately 1.5 to 5.0 million deaths each year resulted from diabetes. The
global economic cost of diabetes in 2014 was estimated to be US$612 billion [1],
but in 2017 it was estimated at US$727 billion [3].
Therefore, this is a very important problem worldwide. In this paper,
we focus on the so-called diabetic foot, one of its complications. A diabetic foot
is a foot that exhibits any pathology resulting directly from diabetes mellitus
or any long-term (or “chronic”) complication of diabetes mellitus. Characteristic
diabetic foot pathologies include infection, diabetic foot ulcer
and neuropathic osteoarthropathy. Its symptoms vary depending on the affected
nerves. Some patients have no signs or evidence. But these symptoms and prob-
lems get worse over time and may include: ulcers, numbness; tingling; loss of
sensation and pain in the limbs; loss of muscle in the feet or hands; and changes
in heart rate.
The evaluation of the temperature of the plantar surface of the foot is a useful
tool to determine the possible risk of developing pathologies associated with dia-
betic foot. Thus, certain asymmetries have been determined, such as an increase
in temperature of 2.2 ◦ C in an area with respect to its contralateral, indicating
an underlying subclinical inflammation without apparent signs [11]. This could
open a procedure to know the risk of ulceration in the area. The determination
of temperatures in the foot is usually done by thermal imaging cameras, with
the researchers choosing different ROIs, the number and location of which are
very variable. The choice of areas of interest can be of great importance, since it
relates the increase or decrease in temperature to the risk of that area suffering
an injury, such as a plantar ulcer. However, the literature offers numerous
studies with disparity in the number, location and reasons for choosing ROIs. Recent
studies analyze from 4 [5] to 12 [11] zones, with 5-6 being the most common.
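The 2.2 ◦C contralateral criterion cited above lends itself to a simple screening rule: compare each ROI with the corresponding ROI on the other foot and flag any pair whose difference reaches the threshold. A minimal sketch — the ROI names and temperatures below are illustrative, not data from any of the cited studies:

```python
# Threshold from the contralateral-asymmetry criterion reported in [11].
ASYMMETRY_THRESHOLD_C = 2.2

def flag_asymmetries(left, right, threshold=ASYMMETRY_THRESHOLD_C):
    """Return ROI names whose left/right temperature difference reaches
    the threshold, flagging possible subclinical inflammation."""
    return [roi for roi in left if abs(left[roi] - right[roi]) >= threshold]

# Hypothetical readings (deg C) for three of the plantar ROIs:
left  = {"heel": 29.1, "first_met_head": 31.8, "first_finger": 26.0}
right = {"heel": 29.4, "first_met_head": 29.2, "first_finger": 25.7}
print(flag_asymmetries(left, right))  # → ['first_met_head']
```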
In this way, Astasio-Picado et al. [5] propose to monitor the plantar surface
of the foot in four zones, heel, first and fifth metatarsal head and first finger.
Chatchawan et al. [7] and Bagavathiappan et al. [6] propose 6 areas of interest.
These studies coincide in analyzing the heel and the plantar surface of the first
finger, although Astasio-Picado et al. [5] analyze the first metatarsal head, while
Chatchawan et al. [7] and Bagavathiappan et al. [6] extend this area to the
medial forefoot (1st and 2nd metatarsal bones). Other studies [12] focus their
attention on 5 areas of the forefoot: 1st, 3rd and 5th metatarsal bones and first
and fourth fingers. In the same way, Gatt et al. carried out a study in 8 areas of
the forefoot: medial, lateral, central and the 5 fingers [9]. However, other studies
do not specify the exact number of regions nor their location [8,14].
Although there seems to be consensus on some areas chosen, such as the heel,
first metatarsal head and first finger, the criterion of choice of the areas is not
Foot Temperature in Diabetic People 297

clear, since in these studies it was not specified. Thus, it seems that the choice
of these areas could be related to the zones of frequent appearance of ulcers [5],
but in others, the criteria were not specified.
That is why the objective of this study is to give well-founded reasons about
the number of ROIs necessary to perform a good screening of the diabetic foot,
and to be able to optimize the analysis, adding or eliminating areas that offer
redundant results.
The outline of the paper is as follows: In Sect. 2 we provide an overview of
the most common sensors employed to measure temperature, and we analyze
their main features for placing them in a sock. In Sect. 3, we briefly describe the
survey carried out to study the most important variables related to the temperature
in both feet. In this paper, dendrograms are used to understand where temperature
sensors should be placed (Sect. 3.1). Finally, some conclusions and
goals are given in Sect. 4.

2 Sensors to Obtain Temperature Measurement

The measurement of temperature is currently well established in industrial
processes and for some other parts of the human body, but it is perhaps not so
well resolved for measuring the sole of the foot. Currently, the most practical
sensors for temperature measurement on the sole of the foot are:

– Thermocouples: the most commonly used electrical temperature sensors in
industry. A thermocouple is the junction of two wires of different materials
at a point; when temperature is applied to the junction of the metals, a very
small voltage is generated, of the order of micro- or millivolts. This small
voltage is what makes them difficult to measure.
The thermocouples can be classified according to several criteria, such as
the material from which they are built, their tolerance or deviation, etc. The
most standardized classification is given in Table 1.

Table 1. Types of thermocouples

Type Material                                        Voltage generated (mV)
B    Platinum-Rhodium 30% vs. Platinum-Rhodium 6%    0 to 10.094
R    Platinum-Rhodium 13% vs. Platinum               0 to 16.035
S    Platinum-Rhodium 10% vs. Platinum               0 to 13.155
J    Iron vs. Constantan                             –7.89 to 39.130
K    Nickel-Chromium vs. Nickel                      0 to 41.269
T    Copper vs. Constantan                           –5.60 to 14.86
E    Nickel-Chromium vs. Constantan                  –9.83 to 53.11

– Thermoresistances (RTDs) work by varying their resistance with temperature.
Their sensitive elements are based on metallic conductors, which change their
electrical resistance as a function of temperature. The most common devices are
built with a platinum resistor, called PT100, PT1000, etc.
The temperature-resistance relation of platinum wire is so reproducible that
the platinum thermoresistance is used as the international temperature standard
from −260 ◦ C to 630 ◦ C; other materials such as nickel, nickel-iron, copper and
tungsten are also used. Typically, they have a resistance between 20 Ω and
20 kΩ. Their most important advantage is that they are linear within the
temperature range between −200 ◦ C and 850 ◦ C (Table 2).

Table 2. Types of thermoresistances depending on the material

Material       Temperature range (◦ C)   Coefficient of variation (%/◦ C at 25 ◦ C)
Platinum       –200 to 850               0.39
Nickel         –80 to 320                0.67
Copper         –200 to 260               0.38
Nickel-Copper  –200 to 260               0.46
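For reference, converting a measured PT100 resistance back to a temperature is commonly done with the standard IEC 60751 (Callendar–Van Dusen) relation, which underlies the coefficient quoted for platinum above. A minimal sketch, valid for temperatures at or above 0 ◦C (below 0 ◦C an additional C term is needed):

```python
import math

# IEC 60751 Callendar-Van Dusen coefficients for a standard PT100.
R0 = 100.0          # resistance at 0 deg C, ohms
A = 3.9083e-3
B = -5.775e-7

def pt100_temperature(resistance_ohms):
    """Invert R(T) = R0 * (1 + A*T + B*T^2) for T in deg C (T >= 0)."""
    c = 1.0 - resistance_ohms / R0
    return (-A + math.sqrt(A * A - 4.0 * B * c)) / (2.0 * B)

print(round(pt100_temperature(138.5055), 1))  # ≈ 100.0 deg C
```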

– Thermistors are much more sensitive. Composed of a sintered mixture of
metal oxides, the thermistor is essentially a semiconductor that behaves like
a “thermal resistor”. They can be found on the market denoted as NTC
(negative temperature coefficient, i.e., the resistance decreases with
temperature) and PTC (positive temperature coefficient, i.e., the resistance
increases with temperature). They are much easier to measure than the
previous ones; a simple voltage divider is enough.
In some cases, the resistance of a thermistor at room temperature can decrease
by up to 6% for every 1 ◦ C increase in temperature. This high sensitivity to
temperature variations makes the thermistor very suitable for accurate
temperature measurements, and it is widely used for control and compensation
applications in the range of 150 ◦ C to 450 ◦ C.
NTCs are manufactured from a mixture of oxides of Mn, Ni, Co, Cu and Fe
and are molded into ceramic bodies of various sizes; they typically have a
resistance between 50 Ω and 1 MΩ at 25 ◦ C and a sensitivity of 4%/◦ C at
25 ◦ C. The negative coefficient effect can result from an external
change in the ambient temperature or from internal heating due to the Joule
effect of a current flowing through the thermistor. The thermistor curve can
be linearized with a resistor mounted in parallel with the NTC.
PTCs are resistances mainly composed of barium and strontium titanate.
The addition of dopants gives the semiconductor component its characteristic
resistance-temperature behavior, although they are rarely used.

There are other sensors with which the temperature of the foot could be
measured, but they would be uncomfortable to wear and very bulky
to assemble.
– Programmable electronic devices: these sensors are very new and are
integrated circuits in which the temperature variation is measured
electronically, as in diodes, through the variation of voltage and current in the
PN junction of semiconductors. Their great difference is that they come
encapsulated in very small packages and communicate directly with a
microprocessor to report the temperature around them. There is a great
variety of models, typical examples being the MAX30205, the Si7006,
the AD590, etc.

Thermocouples would be very interesting for foot measurement,
since they are very small and could easily be implemented in a sock or shoe
insole; their problem lies in the measurement itself, since the small voltage
variation is not always detected. Hence, we would need very precise measurement
equipment, which would lead to a cumbersome system to carry in the sock or in the insole.
Thermistors are devices available in a large variety of sizes, which also makes them
interesting for temperature measurement. They would perhaps be the sensors best
suited to implementing a measurement system in the feet of diabetics; the problem
is their size. If very small thermistors could be used, the problem would be solved:
current commercial sensors are approximately the size of a grain of rice, or maybe
somewhat smaller, but that would still bother the sole of the foot. Their measurement
system would be easy to implement and, in addition, their communication with a
wireless system such as a mobile phone would be ideal.
On the other hand, we have the thermoresistances, which are ideal to measure
the temperature in the foot due to their very small size, and they can easily be
integrated with the fabric of the socks. Their problem lies in the measurement
system: because their resistance variations are very small, of the order of 0.39 Ω/◦ C
in the PT100, the control system must be based on four-wire measurements.
This means that four wires would be needed per sensor, which would make it
excessively complicated to use many sensors in the feet. Also, more circuitry
would be needed than in the case of thermistors. From the point of view of
size, they would be the best sensors to implement, although their measurement
system would be complicated.
Finally, there are programmable devices with a smaller size and all the
power of a programmable circuit, which makes them ideal for these applications.
In summary, the best sensors currently would be the programmable ones,
followed by thermoresistances and thermistors. The latter are getting
smaller and are therefore better suited to a sock or an insole, allowing
several measurements to be taken.
In any case, and depending on the type of sensor, it could be complicated
to place many sensors (not more than 4 or 5), if we also want to measure other
variables such as pressure or humidity. Hence, it is necessary to know the most
important points where temperatures should be calculated.

3 Statistical Analysis
The processing of thermal images of the sole of the foot is a new topic and
has not been investigated deeply, so there is a lack of information on thermal
patterns of the behavior of the diabetic foot [10]. This reason, together with
the main conclusions obtained in the previous section, motivates the statistical
analysis of this work.
We took temperatures in both feet (on the sole and also on the dorsum) before and
after walking 100 m, at nine points on the sole and eight corresponding points on the
dorsum. These points are shown in Fig. 1. These are the usual points considered in the
scientific literature, and they are related to areas where a foot ulcer is very likely
to occur according to some studies, see [4]. These areas are illustrated
on the right side of Fig. 1. Detecting problems in these zones is of great interest.
The smallest areas at risk are roughly circles of 1 cm in diameter. This
characteristic is very important for systems that are built to detect problems in
the diabetic foot.

Fig. 1. On the left side, we show a scheme with the points where the temperature data
are studied; the plantar and dorsal areas correspond to the same positions, except for
number one, which is only on the sole: (1) heel, (2) medial midfoot, (3) lateral midfoot,
(4) first metatarsal head, (5) central metatarsal heads, (6) fifth metatarsal head, (7)
first finger, (8) central fingers, (9) fifth finger. On the right side, there is an illustration
with areas at risk on the foot, taken from [4].

We have carried out a study to verify the influence of temperature variation
on the foot of diabetic patients after a short walk of 100 m. We have taken
different measurements in diabetic and non-diabetic individuals so that we can
compare the results. A total of 93 people participated in this study: 23
diabetic women, 30 non–diabetic women, 21 diabetic men, and 19 non-diabetic
men. This distribution is quite homogeneous in terms of the number of data, as
can be appreciated in Fig. 2. The temperature data were taken with a FLIR E60bx
thermal camera with the following specifications: resolution: 320 × 240 pixels; total
pixels: 76,800; thermal sensitivity: <0.045 ◦ C; accuracy: ±2 ◦ C or ±2% of reading;
temperature range: –4 ◦ F to 248 ◦ F (–20 ◦ C to 120 ◦ C).

[Pie chart: men 43%, women 57%; diabetic men 23%, non-diabetic men 20%, diabetic
women 25%, non-diabetic women 32%.]

Fig. 2. Pie chart with data distribution between diabetic and non–diabetic women,
and diabetic and non–diabetic men.

Apart from the temperature at 9 points on the sole and 8 on the dorsum, other
variables, such as sex, weight, height, age, blood pressure, etc., were also recorded
(Fig. 1).
This study is intended to determine whether there is any variation in the
temperature of the foot when walking in diabetic patients. We focused on the sole of
the foot, where there will be a greater variation in temperature, since it is the
part that is most stressed during the walk. In future studies we will examine the
dorsum and the differences in temperature between the indices of the right and left
feet, as well as before and after the walk.
A basic statistical analysis has been carried out including all indices (from 1
to 9) on the sole of the foot (denoted later with SOLE or S) and dorsum
(denoted with the letter D), left and right feet (denoted by L and R, respectively),
before (denoted with PRE) and after (POST) the short walk. This was done
for diabetic and non-diabetic women and men.
In all cases (men and women, diabetic or not), the indices with the highest
average value are those of point 2 on both the right and left foot, both before
and after the walk, and those with the lowest average value are those of the fingers
(7, 8 and 9), on both feet and both before and after the walk. However, we found
the opposite situation in terms of the variation of the data with respect to the
mean (standard deviation): the indices with the most variation before and after the
walk are 7, 8 and 9, and index 2 is the one with the least variation.

Table 3. Variation, maximum and minimum values of some basic statistics, in diabetic
and non–diabetic men (MD and MND , respectively).

MD MND
Var Max Min Var Max Min
mean 4.18 30.42 26.25 2.98 32.02 29.03
sd 2.68 5.28 2.60 1.98 3.94 1.96
se (mean) 0.58 1.15 0.57 0.46 0.90 0.45
IQR 6.40 10.20 3.80 2.85 4.55 1.70
skewness 0.75 0.14 −0.61 2.53 0.38 −2.15
kurtosis 1.74 0.01 −1.72 5.72 5.74 0.02
Min 0% 6.00 24.90 18.90 7.10 26.70 19.60
1stQu 25% 7.20 28.50 21.30 3.80 31.15 27.35
Median 50% 5.20 30.70 25.50 3.20 31.80 28.60
3rdQu 75% 2.30 32.80 30.50 1.90 32.85 30.95
Max 100% 1.80 35.00 33.20 3.50 36.30 32.80

Tables 3 and 4 show a summary with the maximum and minimum values
and the variation of each of the following statistics: mean, standard deviation,
standard error of the mean, interquartile range, skewness, kurtosis and quartiles.
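The statistics summarized in Tables 3 and 4 can be reproduced from raw temperature readings. A minimal sketch on synthetic, illustrative values (sample standard deviation, moment-based skewness and excess kurtosis, and a crude no-interpolation quantile rule — the study may have used slightly different conventions):

```python
import math

def summary_stats(xs):
    """Mean, sample sd, standard error, IQR, skewness and excess kurtosis,
    i.e. the basic statistics summarized in Tables 3 and 4."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    m2, m3, m4 = (sum((x - mean) ** k for x in xs) / n for k in (2, 3, 4))
    q = lambda p: sorted(xs)[int(p * (n - 1))]  # crude quantile, no interpolation
    return {"mean": mean,
            "sd": sd,
            "se": sd / math.sqrt(n),
            "IQR": q(0.75) - q(0.25),
            "skewness": m3 / m2 ** 1.5,
            "kurtosis": m4 / m2 ** 2 - 3.0}     # excess kurtosis

temps = [26.3, 28.1, 29.4, 30.7, 31.2, 32.8, 27.5, 30.1]  # illustrative deg C values
print({k: round(v, 3) for k, v in summary_stats(temps).items()})
```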

Table 4. Variation, maximum and minimum values of some basic statistics, in diabetic
and non–diabetic women (WD and WND , respectively).

WD WND
Var Max Min Var Max Min
mean 3.16 30.25 27.09 4.28 30.82 26.54
sd 2.24 4.22 1.98 2.42 4.48 2.05
se (mean) 0.47 0.88 0.41 0.44 0.82 0.38
IQR 4.85 7.05 2.20 4.55 7.10 2.55
skewness 0.88 0.52 −0.36 1.16 0.57 −0.59
kurtosis 2.91 1.69 −1.22 1.76 0.36 −1.40
Min 0% 7.80 27.10 19.30 7.00 27.10 20.10
1stQu 25% 4.85 28.70 23.85 5.98 29.43 23.45
Median 50% 3.60 30.30 26.70 5.20 30.75 25.55
3rdQu 75% 2.70 31.85 29.15 2.53 31.98 29.45
Max 100% 3.40 36.00 32.60 3.10 36.80 33.70

Table 3 shows that diabetic men reach the highest interquartile range (IQR)
and the lowest value is reached by non–diabetic men. There is a small differ-
ence between diabetic and non–diabetic women for this variable (see Table 4).


Fig. 3. Violin plot for non-diabetic men: SLPRE1 (sole, left, before the walk, point 1)
on the left and SRPRE1 on the right.

Comparing diabetic and non–diabetic men, we observe that the indices show
greater asymmetry (skewness) and kurtosis in the case of non–diabetics. In this
group (MND ), the index with the greatest asymmetry in absolute value is –2.15 and
the one with the greatest kurtosis is 5.74, which is SRPOST5 and corresponds to the
central part of the metatarsus of the right foot.
The kurtosis values closest to zero are those of SLPRE1 and SRPRE1. We can see
them represented with a violin plot in Fig. 3. This graph corresponds to the heel of
the right and left feet. The lowest skewness value occurs in SLPRE8 and SLPRE9
(–0.22). Moreover, the values with less kurtosis also have a low skewness, less than
0.4. In SLPRE1 there is an outlier with a value of 35.4.
When we look at these distributions, we appreciate that the average value
(red dot) matches the median (central line). We have represented a heat
map of diabetic men before and after the walk (right sole in this case). The
range of values is the same, between 20 and 34 degrees, but if we look at
the dendrograms (upper part of Fig. 4) we see that after the walk index 2
becomes the most important.

Fig. 4. Heat map (men) of the right sole before the walk (left), and heat map (men)
of the right sole after the walk (right).

Fig. 5. Heat map (women) of the right sole before the walk (left), and heat map
(women) of the right sole after the walk (right).

Although in diabetic women there is a higher temperature after the walk, we
found similar results in the heat maps (Fig. 5).
However, if we compare the dendrograms for men (Fig. 4, left) and for
women (Fig. 5, left), both before the walk, we see that in the women’s case there
is a clear separation between the heel and midfoot and the rest of the points. For
men, the metatarsal indices go on one side and the rest on the other. After the
walk, in both cases, as already mentioned, index 2 is the most important one.

3.1 Dendrograms
The main goal of this paper is to provide some strong reasons about how many
sensors we need, what they should measure and which type of sensors we can use. In
this work, we found some results supporting that temperatures may be very
important, but other factors, such as humidity and pressure, may also have an effect.
We need tools and procedures that allow us to reduce the number of
necessary sensors.
For this reason, we also studied the dendrograms of the temperatures in
the control group (individuals without diabetes) and also for diabetic patients,
separately. We also developed a similar study before and after the walk.
Results are always quite similar: indices 7, 8 and 9 are usually strongly related
(especially 7 and 8). Something similar happens between indices 4, 5 and 6
(especially 4 and 5). Index 2 is also clearly separated from the others. Indices 1
and 3 are strongly related too. We consider that these results can be explained
by the different postures of the feet. We did not find big differences when we
separated into different groups: people with or without diabetes, men or
women.
Most of the ulcers appear in the sole, however, we also repeated the study in
the dorsal part of the foot. As a curiosity, in this case index 2 is more connected

with zone 3. The rest of the areas are related in a similar way as it was described
above for the sole: indices 7, 8 and 9 between them; indices 4, 5 and 6; and
2 with 3.

4 Conclusions
Nowadays, diabetes is one of the most important diseases in the world. The number of
patients with diabetes is growing, as is its cost. In this paper, we analyzed
the temperature in the feet as one of the main variables to control complications
in patients with diabetic foot. In the future, we would like to develop a smart
sock able to monitor measures such as temperature or humidity and tell
the patient if any problem is appearing, and to use this new smart sock to analyze
these factors in more people, and more continuously.
As one of the first steps, we studied the best sensors to take the temperature
in the feet, and we utilized dendrograms to obtain some conclusions about the
best places where sensors should be placed. For example, if only 4 sensors could
be employed, the best zones would be: one sensor at point 2, another at 1 or
3, another one in one of the fingers, and another one in the metatarsal heads
(points 4, 5 or 6).
In the future, we would like to go deeper into the variables with the highest
correlation with the temperature in the feet, and to obtain linear regressions of
the temperatures depending on these variables. In this way, we may detect in
advance when an increase in temperature is not explained by the model,
and therefore when there might be a complication. At the same time, we continue
monitoring our diabetic patients to observe which features can be used to
forecast possible ulcers.

Acknowledgments. This research was funded by Fundación Samuel Solórzano grant


number FS/18-2018.

References
1. International Diabetes Federation: IDF Diabetes Atlas (2013). https://www.idf.org/e-library/epidemiology-research/diabetes-atlas/19-atlas-6th-edition.html
2. International Diabetes Federation: IDF Diabetes Atlas (2015)
3. International Diabetes Federation: IDF Diabetes Atlas (2017). https://www.idf.org/e-library/epidemiology-research/diabetes-atlas/134-idf-diabetes-atlas-8th-edition.html
4. Apelqvist, J., Bakker, K., van Houtum, W., Schaper, N.: Practical guidelines on
the management and prevention of the diabetic foot: based upon the international
consensus on the diabetic foot (2007) prepared by the international working group
on the diabetic foot 24(Suppl 1), S181–S187 (2008)
5. Astasio-Picado, A., Martı́nez, E.E., Nova, A.M., Rodrı́guez, R.S., Gómez-Martı́n,
B.: Thermal map of the diabetic foot using infrared thermography. Infrared Phys.
Technol. 93, 59–62 (2018)

6. Bagavathiappan, S., Philip, J., Jayakumar, T., Raj, B., Rao, P.N.S., Varalakshmi,
M., Mohan, V.: Correlation between plantar foot temperature and diabetic neu-
ropathy: a case study by using an infrared thermal imaging technique. J. Diab.
Sci. Technol. 4(6), 1386–1392 (2010)
7. Chatchawan, U., Narkto, P., Damri, T., Yamauchi, J.: An exploration of the rela-
tionship between foot skin temperature and blood flow in type 2 diabetes mellitus
patients: a cross-sectional study. J. Phys. Ther. Sci. 30, 1359–1363 (2018)
8. Gatt, A., Falzon, O., Cassar, K., Camilleri, K.P., Gauci, J., Ellul, C., Mizzi, S.,
Mizzi, A., Papanas, N., Sturgeon, C., Chockalingam, N., Formosa, C.: The applica-
tion of medical thermography to discriminate neuroischemic toe ulceration in the
diabetic foot. Int. J. Lower Extremity Wounds 17(2), 102–105 (2018)
9. Gatt, A., Falzon, O., Cassar, K., Ellul, C., Camilleri, K.P., Gauci, J., Mizzi, S.,
Mizzi, A., Sturgeon, C., Camilleri, L., Chockalingam, N., Formosa, C.: Establishing
differences in thermographic patterns between the various complications in diabetic
foot disease. Int. J. Endocrinol. 2018, 1–7 (2018). Article ID 9808295
10. Kaabouch, N., Hu, W.-C., Chen, Y., Anderson, J.W., Ames, F., Paulson, R.: Pre-
dicting neuropathic ulceration: analysis of static temperature distributions in ther-
mal images. J. Biomed. Opt. 15(6), 1–6 (2010)
11. Macdonald, A., Petrova, N.L., Ainarkar, S., Allen, J., Plassmann, P., Whittam,
A., Bevans, J.T., Ring, F., Kluwe, B., Simpson, R.M., Rogers, L., Machin, G.,
Edmonds, M.: Thermal symmetry of healthy feet: a precursor to a thermal study
of diabetic feet prior to skin breakdown. Physiol. Meas. 38(1), 33–44 (2017)
12. Petrova, N.L., Whittam, A., MacDonald, A., Ainarkar, S., Donaldson, A.N.,
Bevans, J., Allen, J., Plassmann, P., Kluwe, B., Ring, F., Rogers, L., Simpson,
R., Machin, G., Edmonds, M.E.: Reliability of a novel thermal imaging system for
temperature assessment of healthy feet. J. Foot Ankle Res. 11(1), 1–22 (2018)
13. Shi, Y., Hu, F.B.: The global implications of diabetes and cancer. Lancet
383(9933), 1947–1948 (2014)
14. Skafjeld, A., Iversen, M., Holme, I., Ribu, L., Hvaal, K., Kilhovd, B.: A pilot study
testing the feasibility of skin temperature monitoring to reduce recurrent foot ulcers
in patients with diabetes - a randomized controlled trial. BMC Endocr. Disord. 15,
55 (2015)
15. Vos, T., Flaxman, A.D., Naghavi, M., Lozano, R., Michaud, C., Ezzati, M.,
Shibuya, K., et al.: Years lived with disability (ylds) for 1160 sequelae of 289
diseases and injuries 1990–2010: a systematic analysis for the global burden of
disease study 2010. Lancet 380(9859), 2163–2196 (2012)
Sensitive Mannequin for Practicing the Locomotor Apparatus Recovery Techniques

Cosmin Strilețchi¹ and Ionuț Dan Cădar²,³

¹ Technical University of Cluj-Napoca, Barițiu Street 26, 400027 Cluj-Napoca, Romania
² “Iuliu Hațieganu” University of Medicine and Pharmacy Cluj-Napoca, Victor Babeș 8, 400012 Cluj-Napoca, Romania
³ Clinical Recovery Hospital, Viilor 46-50, Cluj-Napoca, Romania

Abstract. This paper presents the theoretical concepts and the practical approaches involved in constructing a mannequin (dummy) used for teaching and practicing the recovery techniques specific to the different injuries that can affect the human locomotor apparatus. The dummy consists of a hardware system that models the human upper and lower limbs. The bones, joints and muscular tissue are replicated so that the dummy's movements closely resemble the actual movements of the human body. The mannequin is equipped with software-controlled movement sensors. A computer that monitors the data received from the sensors registers the parameters of the correct recovery procedures performed by a recovery specialist doctor (trainer). Students who want to learn the procedures can practice the same maneuvers on the dummy. The control system analyzes the movement parameters, compares them with the correct ones produced by the teacher and immediately assists the trainees by providing automatic feedback reflecting the correctness of their actions. This controlled environment takes the pressure off the students and also spares injured patients the inherent mistakes made involuntarily while learning the recovery procedures.

Keywords: Sensorial mannequin · Control system · Safe teaching environment

1 Introduction

Nowadays, computer-assisted systems are used in various fields, and healthcare is one of them. For good practical training, and in accordance with ethical principles, medical schools and healthcare providers use dummies, which help students acquire proper and sufficient practical skills specific to each medical field.
In the recovery process of the human body segments, the physical therapist must know all the movement angles and the exact points where the necessary force must be applied for facilitating or regaining the correct movements. Physical therapists treat patients with fractures, tissue reconstructions, wounds or burned tissue, segments without sensitivity or incapable of voluntary movements, and stiff joints.

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 307–313, 2020.
https://doi.org/10.1007/978-3-030-45697-9_30

From an ethical point of view, all the recovery techniques and maneuvers that new practitioners have to learn cannot be taught directly on patients. Moreover, the majority of patients are reluctant to cooperate with students because they are not confident in their abilities.
The mannequin described in this paper will assist physical therapy students in their learning process [1]. Currently, physical therapy students learn all the necessary maneuvers on themselves, but the physiology of a healthy body does not respond in the same manner as a damaged one.
The proposed dummy will also make a significant contribution to helping physical therapists maintain their abilities and to avoiding malpractice or further injury of real patients [2, 3].
The dummies are designed to provide real-time feedback concerning the techniques used and to give the student or practitioner a sense of safety by eliminating the concern about injuring a living being, thus facilitating the learning process [4].
If a therapist is trained in a controlled environment and becomes aware of tissue feedback, the risk of aggravating a pathology through incorrect joint manipulations is eliminated. This can also lead to shorter recovery times [5, 6].

2 State of the Art of Bio-Medical Mannequins

At the moment, there are devices for simulating laparoscopic surgeries; body segments made of different composite materials for simulating orthopedic, abdominal, chest and heart surgeries and endoscopic examinations; dummies for obstetrics, gynecology and pediatrics; samples of artificial tissues for learning surgical techniques; and dummies for learning intensive care maneuvers [7–9].
Existing CPR dummies cover two elements, chest compressions and rescue breaths; for training purposes, a good CPR dummy should support both. Beyond this bare minimum, newer CPR dummies provide audio and visual feedback to quickly teach trainees the proper compression depth, hand placement, rescue breaths, etc.
Currently, there are no dummies able to reproduce the feedback of damaged tissues and joints. Our proposed mannequin will be able to signal the errors that occur while the rehabilitation maneuvers are performed. The alarm thresholds will depend on the lesion type (wounds, burns, fractures, inflammations, stiff joints, etc.) and on the time elapsed since the lesion occurred.
The current dummies lack electro-hydraulic joints and also provide limited feedback to the practitioner [10]. The joint capsules have to be programmable so that they reproduce joint restrictions caused by ligament, muscle and fascia tensions.
The current means of feedback (usually acoustic) have to be developed further to provide extensive information in order to forewarn the therapist about the tissue tension that appears during joint and segment mobilizations. In addition, the feedback should provide specific information about the part of the procedure that was problematic (applied pressure, rotation degree, duration, etc.) [11].

3 System Description

The implemented system is composed of physical, electronic and software modules that work together as a teaching environment specific to the recovery procedures of the human locomotor system.
For each physical procedure type, the system stores several sets of valid information produced by the professor. This data is used for matching against the datasets obtained from the students performing the same procedure.

3.1 Component Modules

Physical Dummy Components. The dummy body parts (arms, legs, shoulder or pelvic joints, etc.) are made of a metallic structure covered with rubber/plastic coatings, and they model the human skeleton parts and the muscle/skin tissue that surrounds them. The joint capsules model the physical limb articulations.
Data Acquisition System. The dummy components have wireless electronic sensors inserted into special pockets located inside the rubber parts. The sensor system is responsible for registering and transmitting all the acquired data reflecting the physical movements of the dummy parts (Fig. 1).

Fig. 1. The teacher produces the valid procedures' datasets (pipeline: normalization → preprocessing → storage in the valid DB)

For each procedure, the monitored parameters reflect:

– the applied pressure
– the orientation of the dummy parts
– the rotation degrees
– the duration
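The four parameters above can be grouped into a per-sample record. The sketch below is illustrative only: the class and field names are assumptions, not taken from the authors' implementation (which the paper states is written in Java).

```java
// Illustrative container for one acquired sample of a recovery procedure.
// Class and field names are assumptions, not the authors' actual code.
public class ProcedureSample {
    public final double pressure;     // applied pressure
    public final double orientation;  // orientation of the dummy part, in degrees
    public final double rotation;     // rotation, in degrees
    public final long durationMs;     // duration, in milliseconds

    public ProcedureSample(double pressure, double orientation,
                           double rotation, long durationMs) {
        this.pressure = pressure;
        this.orientation = orientation;
        this.rotation = rotation;
        this.durationMs = durationMs;
    }
}
```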

Data Processing and Analysis System. During the training phase, the teacher performs a certain procedure on the dummy. Before storing the data, the acquired signal's samples are normalized and prepared for interpretation (preprocessed). A decision system eliminates the irrelevant values and extracts the main characteristics of the acquired information, thus preparing it for future use.
The information acquired while the students perform the same procedure follows the same route (normalization and preprocessing), and the result is matched against the valid datasets created by the tutor. The result is returned to the student as generic or detailed feedback.
The generic feedback mechanisms display visual warnings and acoustic signals when a certain procedure is poorly performed, while the detailed feedback provides information about the parameters that were out of their specific range and thus led to the warning signals being generated (Fig. 2).

* normalization

* preprocessing

feedback match?
valid
DB

Fig. 2. The student’s procedures are verified against the valid datasets
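The match step in Fig. 2 can be sketched as a per-parameter range check against the tutor's stored values; all names and the range representation below are assumptions made for illustration, not the system's actual design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of the "match?" step: each acquired parameter is checked against the
// tutor's valid [min, max] range. An empty result maps to positive generic
// feedback; a non-empty result lists the parameters for detailed feedback.
public class FeedbackMatcher {
    public static List<String> outOfRange(Map<String, Double> acquired,
                                          Map<String, double[]> validRanges) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, double[]> e : validRanges.entrySet()) {
            Double v = acquired.get(e.getKey());
            if (v == null || v < e.getValue()[0] || v > e.getValue()[1]) {
                problems.add(e.getKey()); // this parameter triggers a warning
            }
        }
        return problems;
    }
}
```

In this sketch, the generic feedback would simply signal whether the returned list is empty, while the detailed feedback would report each listed parameter.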

3.2 Used Technologies


The physical components of the developed system (the dummy parts used for performing the recovery procedures) require a lot of space to be maneuvered. The movements sometimes have large amplitudes, so wired data acquisition is not practical; this is the reason why wireless sensors were chosen [12, 13].
The data produced by the sensors is transmitted to the corresponding receivers connected to the computer.
The software components that run on the computer do not have any mobility requirements, so no mobile technologies have been involved so far. The data processing and storing modules are written in Java [14] and the database belongs to the SQL family [15].

4 Data Control System

The software modules that control the entire dataflow are divided into several categories, depending on the roles they play in the system. The Human Computer Interaction (HCI) is performed using classical peripheral devices (mouse, keyboard, touch screen) or voice commands.
Some of the modules described below are already implemented, while others still have to be developed, as the system for practicing the locomotor apparatus recovery techniques is still under development.
The system control software modules are responsible for handling the entire developed system. Once started, the Graphical User Interface (GUI) will allow the user to:
– select the functioning mode (tutor or student)
– create or specify a procedure identifier
– enter or exit the data acquisition mode
– start or stop the current acquisition
– confirm the acquisition, store or receive feedback for the current acquired data
The data acquisition software modules begin running once the user decides to use this facility. Because the person performing the physical procedures has both hands occupied with maneuvering the dummy, these modules accept voice commands. They perform the following tasks:
– interrogate the available sensors and open a specific channel for each one of them;
– wait for the spoken “start acquisition” command; once received, the data acquisition process begins;
– collect the data received from the registered sensors until the “stop acquisition” command is pronounced;
– ask the user to validate the current acquisition by pronouncing “YES”/“NO”.
These modules are already implemented, except for the voice command system.
The data normalization and preprocessing software modules work with the acquired data. This process is not controlled by the user and performs its tasks once the current acquisition is finished:
– the starting and trailing noise is eliminated;
– spike sensor data is eliminated;
– a few signal characteristics are computed: maximum and minimum amplitude, duration in milliseconds, general average value and specific average values on signal sequences.
These modules are already implemented.
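The signal characteristics listed above (maximum and minimum amplitude, average value) can be sketched as plain array scans; the method names below are illustrative, not the authors' API.

```java
// Illustrative computation of the signal characteristics named in the text:
// maximum amplitude, minimum amplitude and general average value.
public class SignalFeatures {
    public static double max(double[] s) {
        double m = s[0];
        for (double v : s) m = Math.max(m, v);
        return m;
    }

    public static double min(double[] s) {
        double m = s[0];
        for (double v : s) m = Math.min(m, v);
        return m;
    }

    public static double mean(double[] s) {
        double sum = 0;
        for (double v : s) sum += v;
        return sum / s.length;
    }
}
```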
The database communication software modules store/retrieve the data in/from the database and are currently implemented and functional.

5 Experimental Results and Future Development

So far, all the software modules specific to

– data acquisition
– data transmission and reception
– data normalization and preprocessing
– database storage and retrieval

have been implemented.
The physical mannequin still has to be developed. First, some specific joints and the adjacent simulated ligaments and tissue will be created. Once these components function correctly, an integration will be performed in order to obtain a full-body mannequin. Each sub-ensemble will be able to function and be monitored both individually and integrated in the complete dummy.
The signal comparison that has to be performed in order to detect the differences between the data produced by the teacher and the data obtained from the trainees (students) will be implemented once the system is calibrated for different types of recovery procedures.
The last step will consist of developing the user interface for controlling and receiving feedback from the system. The voice commands will also be implemented.
Both the tutor and the student roles will be implemented, so the system will run as initially intended. The warning signals and the detailed output in case of malpractice will be developed, so the entire system will serve for teaching the locomotor system recovery procedures.
The complex system described in this paper involves a multidisciplinary team consisting of physical therapists, engineers and rehabilitation doctors. The team will have to analyze the software patterns that describe various pathological situations.
The software-controlled dummy will have to successfully imitate tissue resistance and joint stiffness, and to signal pain and tensions that are too high for the tissues, depending on the pathology and the time elapsed since the injury occurred.
The dummy will also have to signal the moment when the therapist mobilizes the segments at non-physiological angles and to react to pain according to each patient's pain level.
In the first phase, the dummy will have to recognize the normal and abnormal stresses exerted on the tissues and joints in cases of joint stiffness and fractures. Later, the mannequin will be extended to recognize the tensions at the musculoskeletal and tissue level. The tension values will also be set according to the associated pathology and other complications.
A research project will be implemented in the form of a collaboration grant between the University of Medicine and Pharmacy from Cluj-Napoca and the Technical University from Cluj-Napoca. Specialists from the medical field will collaborate with software engineers in order to finalize the implementation of the sensitive mannequin that will be used for teaching and practicing the locomotor apparatus recovery techniques.

References
1. Ishikawa, S., Okamoto, S., et al.: Assessment of robotic patient simulators for training in
manual physical therapy examination techniques. PLoS ONE 10, e0126392 (2015)
2. Silberman, N.J., Panzarella, K.J., Melzer, B.A.: Using human simulation to prepare physical
therapy students for acute care clinical practice. J. Allied Health 42, 25–32 (2013)
3. Thomas, E.M., Rybski, M.F., Apke, T.L., Kegelmeyer, D.A., Kloos, A.D.: An acute
interprofessional simulation experience for occupational and physical therapy students: key
findings from a survey study. J. Interprof. Care 31, 317–324 (2017)
4. Boykin, G.L.: Low fidelity simulation versus live human arms for intravenous cannulation
training: a qualitative assessment. In: Duffy, V., Lightner, N. (eds.) Advances in Human
Factors and Ergonomics in Healthcare. Advances in Intelligent Systems and Computing, vol.
482. Springer, Cham (2016)
5. Wells, J.: Development of a high fidelity human patient simulation curriculum to improve
resident’s critical assessment. Ann. Behav. Sci. Med. Educ. 29, 10–13 (2014)
6. Friedrich, U., Backhaus, J., et al.: Validation and educational impact study of the NANEP
high-fidelity simulation model for open preperitoneal mesh repair of umbilical hernia (2019)
7. Shoemaker, M.J., Riemersma, L., Perkins, R.: Use of high fidelity human simulation to teach
physical therapist decision making skills for the intensive care setting. Cardiopulm. Phys.
Ther. J. 20, 13 (2009)
8. Leocádio, R.R.V., Segundo, A.K.R., Louzada, C.F.: A sensor for spirometric feedback in
ventilation maneuvers during cardiopulmonary resuscitation training. Sensors (Basel) 19,
5095 (2019)
9. Heraganahally, S., Mehra, S.: New cost-effective pleural procedure training: manikin-based
model to increase the confidence and competency in trainee medical officers. Postgrad. Med.
J. 95, 245–250 (2019)
10. Anatomical Models and Educational Supplies. http://www.mentone-educational.com.au.
Accessed 04 Nov 2019
11. Kim, Y., Jeong, H.: Virtual-reality cataract surgery simulator using haptic sensory
substitution in continuous circular capsulorhexis. In: 2018 Conference Proceedings IEEE
Engineering in Medicine and Biology Society, pp. 1887–1890 (2018)
12. Monnit. https://www.monnit.com/. Accessed 06 Nov 2019
13. Althen Sensors and Controls. https://www.althensensors.com/. Accessed 06 Nov 2019
14. Java. https://www.java.com/. Accessed 06 Nov 2019
15. MariaDB. https://mariadb.org/. Accessed 06 Nov 2019
Pervasive Information Systems
Data Intelligence Using PDME for Predicting Cardiovascular Predictive Failures

Francisco Freitas, Rui Peixoto, Carlos Filipe Portela, and Manuel Santos

Algoritmi Research Center, Universidade do Minho, Braga, Portugal
[email protected], [email protected], {cfp,mfs}@dsi.uminho.pt

Abstract. In the area of Cardiovascular Diseases (CVD), dyspnea, one of many conditions that can be a symptom of heart failure, is a metric used by the New York Heart Association (NYHA) classification in order to describe the impact of heart failure on a patient. Based on four classes, this classification measures the level of limitation during simple physical activity. With the use of a non-invasive home telemonitoring system called SmartBEAT to retrieve biological data and heart metrics, combined with a data-mining engine called PDME (Pervasive Data Mining Engine), it is possible to obtain a different type of analysis sustained by real-time classification. The connection between the risk factors of CVD and the accuracy levels of the data models is recognizable, and it was continuously reflected in all the scenarios that were created. As soon as the data models used fewer CVD risk-factor variables, they became useless, showing how connected these risk factors are to the disease; this sustains the idea that PDME can be a competent data mining engine in this field of work.

Keywords: Data intelligence · Pervasive Data Mining Engine · Cardiovascular Diseases

1 Introduction

As reported by the World Health Organization, Cardiovascular Diseases (CVD) are the prime cause of death worldwide. In 2016, at least ~29% of deaths were due to ischaemic heart disease and ~10% to stroke; both diseases are deeply connected to CVD, which accounts for 39% of all the 56.9 million deaths in 2016 [1]. The most widely used heart disease treatment protocols and CVD prevention measures are costly and require continuous visits to a healthcare facility, which is a big roadblock for the elderly. These visits can become a big challenge for elderly people as their health continues to decline, especially for those who suffer from chronic heart failure.
In 2009, the direct and indirect costs of CVD and stroke exceeded $475 billion in the US alone; the direct costs include healthcare, hospitals and nursing homes, while the indirect costs are associated with lost productivity, caregiver burden, disability and mortality [2].
Studies showed in 2005 that for at least 82% of adults aged above 65 the principal causes of death were all related to CVD, and these values spike even more as people get older; CVD strikes regardless of race and gender, and at 70 years old the
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 317–326, 2020.
https://doi.org/10.1007/978-3-030-45697-9_31

lifetime risk of having a first Coronary Heart Disease event is 34.9% in men and 24.2% in women [2]. According to data from World Population Prospects: the 2019 Revision, by 2050, one in six people in the world will be over age 65 (16%), up from one in 11 in 2019 (9%). By 2050, one in four persons living in Europe and Northern America could be aged 65 or over. In 2018, for the first time in history, persons aged 65 or above outnumbered children under five years of age globally. The number of persons aged 80 years or over is projected to triple, from 143 million in 2019 to 426 million in 2050 [3].
Therefore, with a growing elderly population in the world and the continuous costs of CVD treatment and prevention, there is a need to slow these tendencies down, and data mining makes that possible. With the improvement of technology, vital biological parameters such as the Electrocardiogram (ECG), heart rate, systolic/diastolic pressure and temperature can be measured accurately and in real time by wearable and mobile sensors and transmitted wirelessly to a gateway device (e.g. smartphone, tablet) [4].
With the collection of this information, it is possible to find tendencies and develop prediction models with the help of analysis from medical experts, making decisions in a faster and more autonomous way. There are already some Data Mining Engines being used in medical or healthcare centers, but most of them cannot be used without a DM specialist [5]. The objective of this project is to determine whether PDME is a reliable tool in a medical study, mainly from a CVD data analysis standpoint, to be introduced to people with less DM knowledge.

2 Background
2.1 Data Mining
The constant evolution of Information Technology (IT) has created a huge number of databases and ever larger amounts of data in various areas. A new approach started to form: the usage and manipulation of that data for further decision making [6].
Data mining is the analysis of factual data or datasets to find unsuspected relationships and to summarize the data in novel ways that are both coherent and fruitful to the data owner [7]. In a DM project the objective is to make discoveries from data; the main goal is to be as confident as possible about the results, which may still lead to conclusions that are not what we intended, mainly because of the uncertainty present in the data. In a DM project we work with samples of data from which we draw conclusions that apply to the whole universe of data that we gather. DM is known as an undisputed language for dealing with data ambiguity [7].

2.2 Smart Beat


In the last decades, non-invasive heart failure telemonitoring (NIHT) has moved from structured telephone calls to remote management systems. In the case of remote management systems, data from telemonitoring devices is linked into a telemedical platform and sent to the telemedical centre, hospitals or primary care provider [8].
With the development of newer and even more accurate technologies, Fraunhofer developed its own NIHT system called SmartBEAT, with the Fraunhofer Portugal Association as the coordinator of the project [9].
SmartBEAT is a smartphone-based HF NIHT system designed to detect congestion through daily monitoring of HF symptoms, weight, peripheral blood oxygen saturation, blood pressure, heart rate, physical activity, and therapy adherence. This solution comprises a monitoring device and a system to collect, analyze, store, manage and transmit data [9].

2.3 PDME
In the present day, the growth of data mining engines and of data mining in general has created a new era of data, where everything can be used to obtain information, tendencies and patterns, but all of these tools require knowledge of DM concepts [5].
The Pervasive Data Mining Engine (PDME) [5] simplifies this by combining the general characteristics of a data mining engine with pervasive computing, which means putting all the technological needs in the “background” so that users just provide the data. Since PDME offers both a fully automatic configuration and manual data mining services, users can choose the one they want; both are simple to use and notoriously logical and intuitive [5]. Providing all the DM tools and their outcomes, mainly as dashboards and probabilities, in real time at any place and time makes PDME a truly powerful tool in data mining in general and among data mining engines in particular [5].

2.4 Risk Factors in CVD


Cardiovascular Diseases (CVD) affect the entire world population, and they continuously need to be researched; the emphasis of that research lies mainly on the risk factors and their impact on predictive ability, mainly for disease prevention [10].
CVD is not just a single type of disease, but a cluster of diseases that affect the cardiovascular system and are mainly caused by atherosclerosis, which in other words is the deposition of fat and calcium plaques inside the arteries that end up hindering blood circulation, mainly in the heart and the blood vessels [11].
The concept of “risk factors” in CVD became public in 1957 through the Framingham Heart Study (FHS), which already showed the correlation of heart diseases with cigarette smoking, high blood pressure and high cholesterol levels; the findings were so influential that they changed the way the field is practiced and how prevention is done:
Diabetes - Diabetes is a condition that affects a person's capability to sustain the amount of glucose in the blood, with two types, type I and type II [12]. Both types can lead to an increased risk of CVD, which is connected with high cholesterol levels, atherosclerosis and hypertension; even insulin resistance is related to CVD, since an adult with diabetes is two to four times more likely to die from heart disease [12];

Obesity - Obesity increases the probability of having CVD even if no other risks are attached; the excess weight increases the strain on the heart, raising blood pressure, cholesterol and triglyceride levels. These factors can increase the risk of atherosclerosis and thrombolytic embolism, which are CVD [12];
Blood pressure - Mainly known as hypertension, it is deeply linked to cardiac diseases; high blood pressure affects the heart by thickening and stiffening it, making it harder for the blood to flow, which can lead to heart attacks and strokes [12].
All these CVD risk factors have symptoms or pathologies that are deeply inherent to the disease, causing conditions in the people affected by them, and it is important to define these symptoms to make diagnosis easier or to associate them with a specific branch of CVD:
Syncope - a self-limited loss of consciousness without the ability to sustain postural tone, followed by spontaneous recovery [13].
Orthopnea - the sensation of breathlessness when lying flat [13].
Dyspnea - shortness of breath, which can occur even in relaxed positions [13].
Edema - swelling caused by excess fluid trapped in the body's tissues [13].

With the relation between the symptoms and the measured results, it is possible to build a classification system that facilitates diagnosis; in our case this is the NYHA classification.

2.5 Related Work


An early warning system for Cardiac Heart Failure (CHF) used biometric data such as weight and blood pressure; the system combined the biometric data with a set of questions, producing in the end a prediction level for CHF [14].
A few years later, with the help of an original set of rules, it became possible to predict and detect low-risk patients with heart failure by applying classification trees to a large dataset. This dataset contained various types of variables, focusing mainly on demographic, clinical, laboratory and radiographic data; the outcome was to predict low-risk patients based on these types of data [15].
Then a blood pressure monitoring system called WANDA was created, providing an automated vital sign monitoring system based on weight and activity track records [16].
Finally, a data analysis system was developed that collects health data from a user's home through the use of various sensors (heart rate, oxygen saturation and body temperature). These sensors send the information wirelessly to the user's mobile phone, and accelerometers placed on the patient are used to monitor potential falls [17].

3 Material and Methods

3.1 Material Used


For this project, the tools used were:
• Microsoft SQL Server 2014, for the transformations of the dataset and the selection of the desired data through the creation of various views.
• PDME (Pervasive Data Mining Engine) - the main tool used in this project, used to produce analyses, predictions and data models.

3.2 Research and Data Mining Methodologies


The research methodology used in this project was Design Science Research (DSR). Communication and understanding are the benchmarks of DSR, mainly because it carries the acknowledgment of Information Systems professionals; DSR also provides the credibility needed in information systems, since it is present in various fields of engineering [18]. The goal of DSR is to create technological solutions; in our case it is to find patterns or tendencies in CVD using data models. The artifact must be evaluated for quality and effectiveness, so it needs to be well executed, providing a clear contribution in the areas of design and methodology, with rigorous methods in the creation and evaluation of the artifact, and using a research method that satisfies the laws of the problem into which it is inserted; the results need to be represented in a scientific manner, meaning through articles, reports or presentations [19]. For the elaboration and development of this data mining process, the approach chosen was CRISP-DM; this process was selected mainly because it can be a complementary process for DSR and it is the standard approach to a DM project [20].

4 Data Study

4.1 Data Understanding


To better understand the variables, we created a data dictionary:

Table 1. Dictionary of data

Data attribute name | Values by day                                | Range of values
BMI                 | Body Mass Index                              | 20 to 41
nvalHR              | Number of heart rate records                 | 12 to 1378
vmaxHRminHR         | Variance between max and min heart rate      | 38 to 175
valBP               | Blood pressure records                       | 45 to 121
diastolicBP         | Diastolic records                            | 47 to 108
systolicBP          | Systolic records                             | 75 to 180
vsysdias            | Variance between the systolic and diastolic  | 15 to 96
Q1                  | Syncope on previous day?                     | 0 (no) or 1 (yes)
Q2                  | Orthopnea on previous day?                   | 0 (no) or 1 (yes)
Q3                  | Dyspnea on previous day?                     | I to IV
Q4                  | Edema on previous day?                       | 0 (no) or 1 (yes)
medicine            | Is the patient taking medication?            | 0 (no) or 1 (yes)
fragility           | Fragility of the heart?                      | 0 (no) or 1 (yes)
diabetes            | Does the patient have diabetes?              | 0 (no) or 1 (yes)
322 F. Freitas et al.

The attributes shown in Table 1 can be used to predict a classification system called the New York Heart Association (NYHA) classification, which allocates patients to one of four categories based on their limitation during simple physical activity. In Table 2 we can see the relation between the categories of the classification and the CV condition type [21].

Table 2. Relation between NYHA classification and Cardiovascular Condition type


NYHA Results CVC Condition
Class I No physical limitations A No heart failure
Class II Small physical limitations B Minor heart failure
Class III Severe physical limitations C Symptomatic heart failure
Class IV Unable to perform any physical activity D Advanced systemic heart failure

4.2 Data Preparation


The data used in this project was obtained via SmartBEAT; Hospital de S. João in Porto was the location of the data extraction. The dataset came as a JSON object and was converted to a .CSV file containing 4 data sheets (Measures, Medications Prescription, Profile, and Questionnaire Responses).
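The conversion step can be sketched in Python; since the paper does not show the SmartBEAT export schema, the layout below (each data sheet as a top-level list of flat records) is an assumption, not the actual format:

```python
import csv
import json

# Assumed export layout: {"Measures": [...], "Medications Prescription": [...],
# "Profile": [...], "Questionnaire Responses": [...]}, each a list of flat records.
def sheet_to_csv(json_path, sheet_name, csv_path):
    """Write one sheet of the JSON export as a .CSV file."""
    with open(json_path) as f:
        export = json.load(f)
    records = export[sheet_name]
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)
```

One CSV is produced per sheet, which matches the four data sheets mentioned above.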

4.3 Modeling
After separating all the tables to work on the data transformations, we first transformed the data to a DATE-type syntax (dd/mm/yyyy). The timeline of the data starts on 15 February and ends on 28 March 2019. The main point was to define the data granularity, since there are various types of data granularity.
After all transformations we have a dataset with 691 rows at a daily granularity; Tables 3 and 4 show the patient data and the dyspnea level records by frequency.
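A minimal sketch of this daily aggregation, assuming raw timestamped heart-rate readings (the derived attribute names follow Table 1, but the input record layout is hypothetical):

```python
from collections import defaultdict
from datetime import datetime

# Readings are assumed to be (timestamp "dd/mm/yyyy HH:MM", heart_rate) pairs;
# the real SmartBEAT records are richer, this only illustrates the granularity step.
def aggregate_daily(readings):
    by_day = defaultdict(list)
    for ts, hr in readings:
        day = datetime.strptime(ts, "%d/%m/%Y %H:%M").date()
        by_day[day].append(hr)
    rows = []
    for day, values in sorted(by_day.items()):
        rows.append({
            "date": day.strftime("%d/%m/%Y"),        # DATE-type syntax used above
            "nvalHR": len(values),                    # number of heart-rate records
            "vmaxHRminHR": max(values) - min(values)  # max/min heart-rate variance
        })
    return rows
```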

Table 3. Patient's data
Patients (n = 42)
Age (years) 54.8 ± 10.7
Female sex 33%
Diabetes 21%
In medication 38%
Fragility in heart 10% (level 1)
BMI in women 29.77 ± 5.22
BMI in men 28.97 ± 3.63

Table 4. Dyspnea levels in % (frequency)
NYHA I 35%
NYHA II 51%
NYHA III 12%
NYHA IV 2%

5 Evaluation

The strategy used was essentially to show the ability of the prediction models to accurately predict the severity levels of dyspnea, exposing the clear relationship between the risk factors and reporting the minimum, maximum, and average accuracy over a 10-fold procedure (10 sets of model training and testing).
Scenario 1 - In this scenario we used metrics that are prevalent in heart failure patients.
Attributes used: q3 (target), age, fragility, BMI
Technique used: Caret_C50 / Results: min: 83.1%, avg: 91.4%, max: 97.6%

Table 5. Accuracy levels in Scenario 1


Levels I II III IV
Accuracy by % 87.5 97.2 100 100

In Table 5 we can see an average accuracy of 91%. We can conclude that, since the data comes from a universe of patients who are already ill, and the attributes used are characteristic of people with this disease, it is normal to have great results, not because of the ability of the model but because of the data.
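The 10-fold strategy can be sketched as follows; `train` and `predict` are placeholders standing in for the C5.0 model invoked through PDME, which is not reproduced here:

```python
import random

# Hedged sketch of the evaluation loop: 10 shuffled folds, reporting the
# minimum, average and maximum accuracy across the 10 train/test runs.
def ten_fold_accuracy(rows, labels, train, predict, k=10, seed=0):
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accs = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        model = train([rows[i] for i in train_idx], [labels[i] for i in train_idx])
        hits = sum(predict(model, rows[i]) == labels[i] for i in test_idx)
        accs.append(hits / len(test_idx))
    return min(accs), sum(accs) / len(accs), max(accs)
```

With a real classifier plugged in, the three returned values correspond to the min/avg/max figures reported for each scenario.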
Scenario 2 - In this scenario we decided it would be important to test all variables in our dataset, not to draw big conclusions from a medical standpoint but to see how the models behave with many variables and which ones stand out.
Attributes used: q3 (target), age, avgHR, diabetes, diastolicBP, fragility, gender, BMI, medicine, q1, q2, q4, vmaxHRminHR, vsystodias.
Technique used: randomUnionForest / Results: min: 82.7%, avg: 91.7%, max: 96.6%

Table 6. Accuracy levels in Scenario 2


Levels I II III IV
Accuracy by % 87.5 100 100 100

Table 6 shows that, with all attributes, the models had no problem predicting, as they again reached about 92% average accuracy. We can take from this that certain attributes are important to the data and the models discard the rest; one way to confirm this is to inspect the weight, or importance, of each attribute.
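A sketch of that inspection step: whatever raw importance scores the model reports are rescaled to the percent-of-maximum weights used in Table 7 (the raw scores themselves come from the model and are not computed here):

```python
# Rescale raw attribute importances so the strongest attribute scores 100,
# matching the "% weight" presentation of Table 7.
def importance_weights(raw_scores):
    top = max(raw_scores.values())
    return {attr: round(100 * score / top) for attr, score in raw_scores.items()}
```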

Table 7. Importance of attributes in Scenario 2


Attribute BMI vmaxHRminHR vsystodias age diastolicBP avgHR fragility
Importance (% weight) 100 83 83 34 33 23 23

Table 7 reveals that the models focus on BMI and use this parameter as the main value to predict the level of dyspnea, which demonstrates the relationship of BMI with heart failure.
Scenario 3 - After the first two scenarios, we decided to explore variables that are common to all people and can be tracked easily. To make this possible, we removed variables that are hard to obtain and also removed the variables with the biggest importance, such as BMI.
Attributes used: q3 (target), age, avgHR, diastolicBP, gender, vmaxHRminHR, vsystodias
Technique used: Caret_C50 / Results: min: 75.6%, avg: 85.1%, max: 92.8%

Table 8. Accuracy levels in Scenario 3


Levels I II III IV
Accuracy by % 83.3 97.2 87.5 100

In Table 8 we can see that, from the moment we remove the most important variables, the accuracy levels start to decrease; in this scenario we have all the data concerning the users' hearts plus only age and gender. The models are still able to correctly demonstrate PDME's capacity and its data versatility, and in our opinion this is the best model. Table 9 gives us the set of rules created by the data model:

Table 9. Set of rules created by the data model in Scenario 3


LEVEL I LEVEL II LEVEL III LEVEL IV
59 < age ≤ 62 42 < age ≤ 48 61 < age ≤ 65 42 ≤ age ≤ 57
gender = FEMALE avgHR ≤ 57.9 gender = MALE gender = FEMALE
avgHR ≤ 61.1 vmaxHRminHR > 71 61 ≤ diastolicBP < 73 64.5 < avgHR ≤ 76.4
diastolicBP ≤ 63 v_systo_dias > 37 v_systo_dias ≤ 39 vmaxHRminHR > 90
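As an illustration of how such a rule set can be applied, the sketch below encodes only the Level I rule; the thresholds are transcribed from Table 9, while the dictionary keys are assumed field names rather than PDME's actual output:

```python
# Level I rule from Table 9, expressed as a predicate over one daily record.
# Keys ("age", "gender", "avgHR", "diastolicBP") are illustrative names.
def matches_level_i(p):
    return (59 < p["age"] <= 62
            and p["gender"] == "FEMALE"
            and p["avgHR"] <= 61.1
            and p["diastolicBP"] <= 63)
```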

Scenario 4 - In this scenario only data about the heart is used; we removed most of the variables to show the volatility of the data when we exclude the heavier variables.
Attributes used: q3 (target), avgHR, diastolicBP, vmaxHRminHR, v_systodias
Technique used: randomUnionForest / Results: min: 14.9%, avg: 31.1%, max: 40.6%

Table 10. Accuracy levels in Scenario 4


Levels I II III IV
Accuracy by % 54.1 88.8 14.2 0

Scenario 4, exhibited in Table 10, has the lowest accuracy of all the scenarios. Here we can see the inability of the models to obtain good results, because this data depends on the patient's age and gender, and the cardiovascular parameters deviate with both.

6 Deployment

The data we obtained came from a group of people with some level of heart failure, so it was natural to see scenarios with great levels of accuracy, because the data is deeply connected to all the cardiovascular risk factors.
Initially, in the first scenarios, PDME showed the capacity to define the most important variables and how they were connected to the NYHA classification; it shows the relationship between them and how they affect the data models via their weight importance in the data mining models. Secondly, the way the models got worse as soon as we started to create scenarios with fewer CVD risk attributes continues to show how important some variables are for the prediction levels. Most importantly, the data models corroborate the science behind this disease, showing that the tendencies and patterns were deeply rooted in the medical concepts. When we had BMI, or even the main risk factors, in the scenarios, the accuracy levels were truly high; as soon as we removed these factors the values started to decline, and without them it is impossible to obtain any sort of good analysis or prediction. Finally, we can say that PDME is a data mining engine competent in this field of study and on this type of dataset.

7 Conclusions and Future Work

We can say that PDME can be used from a cardiovascular disease prediction standpoint, uncovering patterns and tendencies in this area. There are some difficulties in this work, most of them connected to the complexity of CVD; it is important to have a grasp of knowledge in this area to capitalize on all the potential that PDME offers.
Surely, with better knowledge of this disease it is possible to obtain improved clinical significance, and even breakthroughs in this area of medicine, with the possibility of use in real-life environments. This work opens a new research field in medicine applied to the early detection of CVD. Future work will focus on exploring more complex and extensive datasets. A deeper analysis is needed to apply the same concepts using more data and attributes, and after that to create prediction models that can help doctors automatically predict the heart failure level of a patient or that can be used as a prevention tool for this type of disease.

Acknowledgments. This article is a result of the project Deus Ex Machina: NORTE-01-0145-FEDER-000026, supported by Norte Portugal Regional Operational Program (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF). The work has been supported by FCT – Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2019.

References
1. World Health Organization: Global Health Estimates 2016: Deaths by Cause, Age, Sex, by
Country and by Region, 2000–2016. WHO, Geneva (2018)
2. Yazdanyar, A., Newman, A.B.: The burden of cardiovascular disease in the elderly:
morbidity, mortality, and costs. Clin. Geriatr. Med. 25, 563–577 (2009)
3. United Nations: World Population Ageing 2019 - Highlights. United Nations (2019)
4. Lappa, A., Goumopoulos, C.: A home-based early risk detection system for congestive heart,
Patras, Greece (2019)
5. Peixoto, R.D.F.: Pervasive data mining engine, Guimarães (2015)
6. Ramageri, B.M.: Data mining techniques and applications. Indian J. Comput. Sci. Eng. 1,
301–305 (2010)
7. Mannila, H., Smyth, P., Hand, D.: Principles of Data Mining. The MIT Press, Cambridge
(2001)
8. Koudstaal, S., Asselbergs, W., Brons, M.: Algorithms used in telemonitoring programmes
for patients with chronic heart failure: a systematic review. Eur. J. Cardiovasc. Nurs. 17,
580–588 (2018)
9. Cardoso, J., Moreira, E., Lopes, I.: SmartBEAT: a smartphone-based heart, Porto (2016)
10. Sullivan, P.L.: Correlation and Linear Regression. Boston University School of Public
Health. http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Correlation-Regression/
BS704_Correlation-Regression_print.html
11. INS Português Doutor Ricardo Jorge: Doenças Cardiovasculares (2016)
12. Mukerji, V.: Clinical Methods: The History, Physical, and Laboratory Examinations.
Butterworth-Heinemann, Boston (1990)
13. Nason, E.: An overview of cardiovascular disease and research (2007)
14. Su, J.: Developing an early warning system for congestive heart failure using a Bayesian
reasoning network. Doctoral dissertation, Massachusetts Institute of Technology (2001)
15. Auble, T.E.: A prediction rule to identify low-risk patients with heart failure. Acad. Emerg.
Med. 12, 514–521 (2005)
16. Visweswaran, S., Angus, D.C., Cooper, G.F.: Learning patient-specific predictive models
from clinical data, University of Pittsburgh (2010)
17. Varma, D., Shete, V., Somani, S.B.: Development of home health care self. Int. J. Adv. Res. Comput. Commun. Eng. (2015). https://www.ijarcce.com/upload/2015/june-15/IJARCCE%252054.pdf
18. Alturki, A., Bandara, W., Gable, G.: DSR and the core of information systems (2012)
19. Hevner, A.: Design science in information systems research (2004)
20. Chapman, P., Clinton, J., Kerber, R., Khabaza, T., Reinartz, T., Shearer, C., Wirth, R.:
CRISP-DM 1.0. CRISP-DM Consortium, p. 76 (2000)
21. New York Heart Association: Specifications Manual for Joint Commission National Quality
Measures (2016)
Design of a Microservices Chaining
Gamification Framework

Ricardo Queirós(B)

ESMAD, Polytechnic of Porto, CRACS/INESC TEC & uniMAD, Porto, Portugal


[email protected]

Abstract. With the advent of cloud platforms and the IoT paradigm, the concept of microservices has gained even more strength, making the process of selection, manipulation, and deployment crucial. However, this whole process is time-consuming and error-prone. In this paper, we present the design of a framework that allows the chaining of several microservices into a composite service in order to solve a single problem. The framework includes a client that allows the orchestration of the composite service based on a straightforward API. The framework also includes a gamification engine to engage users not only to use the framework but also to contribute new microservices. We expect to have a functional prototype of the framework shortly so we can prove this concept.

Keywords: Microservices · Gamification · Framework

1 Introduction
Nowadays, enterprise systems are mostly based on loosely coupled interoperable
services – small units of software that perform discrete tasks – from separate
systems across different business domains [1]. A crucial aspect in this context is
service composition. Service composition is the process of creating a composite
service using a set of available Web services in order to satisfy a user request
or a problem that cannot be satisfied by any individual Web service [1]. The
service composition can be defined from a global perspective (choreography) or
using a central component that coordinates the entire process (orchestration).
However, despite all the software that implements those concepts, there are few
that can be used, in a very simple way, and focused on the new paradigm of
cloud services, where the JSON specification increasingly assumes a prominent
role in the data exchange formalization within the client-server model.
This paper focuses on the design of a framework that aims to integrate a set
of micro-services to solve a particular problem. In order to engage users to use
the platform based on the framework, a gamification engine is injected which
will perform tasks such as grading micro-services and their aggregation.
The framework is composed of three main components: an authoring tool, a gamification engine, and a Web client engine. The first allows the submission, chaining, sharing, and grading of the micro-services. The gamification engine allows
c The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 327–333, 2020.
https://doi.org/10.1007/978-3-030-45697-9_32

the user to grade micro-services and their aggregation as a Web Service Con-
tainer (WSC). The later, allows developers to iterate over all the micro-services
in a service container, through a simple API. The main advantage of this app-
roach regarding the existent approaches is the simplicity and the separation of
concerns. Firstly, a service container is formalized with a simple JSON schema
that can be loaded from the cloud. Then, developers through a simple API can
manage the execution flow of the process without worrying about HTTP clients
and messages. Secondly, the framework fosters the separation of concerns by
giving to the developer the mission to formalize and submit micro-services and
interact with the engine and to the business analysts the chance to chain the
micro-services and generating containers to address a particular problem. The
chaining process will be very simple, using drag-and-drop techniques, and will
automatically inform of invalid pairings based on the matching of the services’
response/request types.
This work is organized as follows. Section 2 discusses some key concepts regarding service composition, namely choreography and orchestration. In Sect. 3 we present the framework designed to help developers/analysts compose and interact with Web services. Section 4 evaluates the proposed framework through the creation of a prototype for a healthcare case study. Finally, we enumerate the main contributions of this work and future directions.

2 State of the Art
One of the central topics of this paper is service composition. This concept encourages the design and aggregation of services that can be reused in several scenarios. The next section focuses on the two most used techniques: orchestration and choreography.

2.1 Service Composition

The design and aggregation of services can be achieved mostly by two approaches: orchestration and choreography (Fig. 1).

Fig. 1. Service orchestration and choreography



Web service orchestration is a type of service composition where specific web service business processes are controlled by a central component. This component coordinates asynchronous interactions, flow control, and business transaction management [5]. Typically, Business Process Modeling Notation1 (BPMN), maintained by the Object Management Group, is used to define a visual representation of the flow, and the Business Process Execution Language (BPEL) is used to write the code that executes the services.
Another approach, Web service choreography, is a form of service composition in which the interaction protocol between several partner services is defined from a global perspective. This can be mapped to the dance domain, where "dancers dance following a global scenario without a single point of control" [3]. In other words, at run-time, each participant in a service choreography executes its role according to the behavior of the other participants [1]. Several language specifications appeared to model service choreographies: the Web Service Choreography Description Language2 (WS-CDL) and the Web Service Choreography Interface3 (WSCI). Both are XML-based specifications from the W3C for modeling choreographies. BPMN version 2.0 includes diagrams to model service choreographies. Other academic proposals for service choreography languages include Let's Dance [6], BPEL4Chor [2] and Chor4.
A distinction is often made between orchestration (a local view from the perspective of one participant) and choreography (coordination from a global multi-participant perspective, without a central controller). Although service orchestration is the most used and plays an important part in a service-oriented architecture (SOA), Web service choreography is also often used to address the typical single-point-of-failure issues found with the orchestration paradigm [5].

2.2 Related Work

The framework proposed in this paper shares aspects with several automation pipeline tools that exist on the Web. In the field of service automation, several tools have appeared in recent years; the best examples are IFTTT5, Pipes6, Node-RED7 and SOS [4].
IFTTT is a free web-based service mostly used to create chains of simple conditional statements, called applets.
Pipes is a visual programming editor specialized in feeds that provides a UI with blocks that can fetch, create, and manipulate feeds in various ways, such as filtering, extracting, merging, and sorting. The user only needs to connect those

1 http://www.bpmn.org/.
2 https://www.w3.org/TR/ws-cdl-10/.
3 https://www.w3.org/TR/wsci/.
4 http://www.chor-lang.org/.
5 https://ifttt.com/.
6 https://www.pipes.digital/.
7 https://nodered.org/.

blocks with each other so data can flow through such a pipe, from block to block. The final output is a news feed, which can be served to other programs that support open web standards. As input formats, Pipes supports RSS, Atom, and JSON feeds; it can also scrape HTML documents and work with regular text files.
Node-RED is a flow-based development tool for visual programming, developed originally by IBM for wiring together hardware devices, APIs, and online services as part of the Internet of Things. Node-RED provides a web-browser-based flow editor, which can be used to create JavaScript functions. Elements of applications can be saved or shared for re-use. The runtime is built on Node.js, and the flows created in Node-RED are stored using JSON.
Simple Orchestration of Services (SOS) is a pipeline service environment that has only a logical architecture defined (without any functional prototype). The goals of SOS are to free developers from the burden of dealing with bureaucratic HTTP aspects and to centralize a set of services as tasks, allowing their composition into a bigger service that can be used by a Web client.

3 Micro-services Chaining Gamification Framework


This paper presents the design of a gamification framework which will help developers/analysts submit and chain microservices in order to create composite services.

3.1 Architecture
Fig. 2 presents the architecture of the proposed framework.
The architecture is composed of the following components:
– The editor - a Web-based component with a GUI for the submission, chaining, and generation of composite services;
– The manifest builder - the component responsible for the conversion and serialization of the final chain into a downloadable format;
– The gamification engine - a component which retains data related to productivity and challenges. The engine should be able to communicate with a GBaaS (Gamification Backend as a Service);
– The API - an interface exposing all the actions users can perform to interact with a particular instance of the framework;
– The client - a component responsible for the use and potential orchestration of the composite service.

3.2 Editor
The editor is a Web-based component that helps users submit and aggregate microservices. The final result is a new composite service described by a Web manifest that can be stored in the cloud or saved on the user's computer. A user can perform the following operations in the editor:

Fig. 2. Architecture of the proposed framework

– Submit a new microservice;
– Create a new composite service (by aggregating microservices);
– Share and grade microservices and composite services.
After registration and before any submission, developers can search for desirable microservices using tags. For instance, if we want to use a service for weather consumption, we insert the tag "weather" and the editor shows all the microservices with the related tag. If no services are returned, the developer can submit a new microservice. The submission of a microservice can be performed by interacting with the GUI component or by submitting a valid JSON document (we are currently working on the formats).
The aggregation of microservices should be done through the editor using drag-and-drop. The aggregation of two microservices should fail if the microservices are not eligible for pairing. Eligibility is granted only if the two microservices (for instance, microservice A and microservice B) have compatible response and request types. In other words, we cannot pair a microservice A that returns a date with a microservice B that expects an integer as input. In the end, we will have a composite service as a Web manifest that can be stored in the user's private cloud or downloaded to their computer.
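The eligibility rule can be sketched as a single type comparison; the `response_type`/`request_type` keys below are illustrative names, not the framework's actual schema:

```python
# Two microservices can be chained only if the first one's response type
# matches the second one's request type (type names are illustrative).
def eligible(ms_a, ms_b):
    return ms_a["response_type"] == ms_b["request_type"]
```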
Another feature of the editor is the capacity for sharing and grading microservices. This feature allows a user to share previously created microservices in the public space of the community. With the grading feature, developers can score a given microservice taking into account their experience with it. This grading will influence the results list after searching and feed the gamification engine for other purposes, such as badges and rankings.

3.3 Web Manifest


After the chaining process, the editor will produce a manifest (a new JSON document) with all the selected task instances included. The manifest structure is very simple and can be accessed by the SOS engine through a URL endpoint or a local reference. The manifest will have all the information needed by the client to orchestrate the composite service.
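Since the manifest format is still being defined, the structure below is purely illustrative: a composite service described as an ordered list of microservice task instances with typed request/response contracts (all names and endpoints are hypothetical):

```python
# Hypothetical manifest for a "weather report" composite service: each task
# declares the type it expects and the type it returns, so the chain can be
# validated before execution.
weather_report_manifest = {
    "name": "weather-report",
    "tasks": [
        {"id": "geocode", "endpoint": "https://example.org/geo",
         "request": "string", "response": "coordinates"},
        {"id": "forecast", "endpoint": "https://example.org/weather",
         "request": "coordinates", "response": "json"},
    ],
}
```

Each adjacent pair of tasks satisfies the pairing rule described in Sect. 3.2 (the response type of one matches the request type of the next).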

3.4 Web Client


The client is a software component that acts as the orchestrator; its main goal is to manage the execution flow of the composite service and to free the developer from handling HTTP clients and parsing request and response messages.
In practical terms, the client receives a Web manifest (from a local reference or a cloud location), maps the JSON data to a set of objects, and iterates over them. At the same time, it provides a simple API so the developer can manipulate all the microservices in the composite service. The engine will support two run modes:
– Automatic mode - executes all the microservices automatically, picking the response of the current microservice and injecting it into the request of the following microservice. In the end, it returns the response of the last microservice;
– Iterative mode - performs one microservice at a time. For each microservice, the developer can inject values to map with placeholders. This mode is used when the inputs of the app users are crucial for the flow of the service.
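The automatic mode can be sketched as a simple fold over the task list; `call` is a placeholder for the HTTP invocation that the client is meant to hide from the developer, and the manifest shape follows the illustrative structure above:

```python
# Hedged sketch of the client's automatic mode: iterate over the composite
# service, feeding each microservice's response into the next request.
def run_automatic(manifest, call, initial_input):
    payload = initial_input
    for task in manifest["tasks"]:
        payload = call(task["endpoint"], payload)  # response -> next request
    return payload  # response of the last microservice
```

The iterative mode would differ only in yielding control back to the developer between tasks so placeholder values can be injected.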

4 Conclusion
In this paper, we present the design of a framework as a tool for service composition. The main idea is to use a Web editor to aggregate small microservices and chain them into a composite service. These composite services can be loaded into a client library that manages the execution flow of the composite service and abstracts the developer from all the bureaucratic aspects related to HTTP messaging management.
The main contribution of this work is the design of a services chaining framework which includes the interaction of several components.
As future work we intend to:
– Define the formats for the microservices, the composite services, the manifests, and the API;
– Create a prototype by choosing a specific domain and implementing all these components;
– Include, in the editor, visual programming constructs such as conditional and cyclic blocks.

Acknowledgment. "This work is financed by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia within project: UID/EEA/50014".

References
1. Foster, H., Uchitel, S., Magee, J., Kramer, J.: Model-based analysis of obligations
in web service choreography. In: Advanced International Conference on Telecom-
munications and International Conference on Internet and Web Applications and
Services, AICT-ICIW 2006, pp. 149–149, February 2006
2. Leymann, F., Decker, G., Kopp, O., Weske, M.: BPEL4Chor: extending BPEL for
modeling choreographies. In: 2007 IEEE International Conference on Web Services,
pp. 296–303 (2007)
3. McNeile, A.: Protocol contracts with application to choreographed multiparty col-
laborations. Serv. Oriented Comput. Appl. 4(2), 109–136 (2010)
4. Queirós, R., Simões, A.: SOS - simple orchestration of services. In: 6th Sympo-
sium on Languages, Applications and Technologies, SLATE 2017, Vila do Conde,
Portugal, 26–27 June 2017, pp. 13:1–13:8 (2017)
5. Chao, C., Hongli, Y., Xiangpeng, Z., Zongyan, Q.: Exploring the connection of chore-
ography and orchestration with exception handling and finalization/compensation,
pp. 81–96. Springer, Heidelberg (2007)
6. Zaha, J.M., Barros, A., Dumas, M., ter Hofstede, A.: Let’s dance: a language for
service behavior modeling, pp. 145–162. Springer, Heidelberg (2006)
PWA and Pervasive Information
System – A New Era

Gisela Fernandes, Filipe Portela(&), and Manuel Filipe Santos

Algoritmi Research Centre, University of Minho, Guimarães, Braga, Portugal


{cfp,mfs}@dsi.uminho.pt

Abstract. Nowadays, and increasingly, users' demands require applications to be a lot more flexible and adaptable, capable of being executed over different operating systems. This stems from the need to access those applications no matter where, when, or with what device. This is the basis of the concept of Pervasive Information Systems (PIS). But how can such a complex theme be handled? How can an application be developed for global usage? A new development methodology has been arising, the Progressive Web Application (PWA), mixing web pages with the mobile application world. So, in a nutshell, PWAs appear as a concretization of the PIS concept. This article aims to explore this theme and provide a few insights on what a PWA is and what its strengths, weaknesses, opportunities, and threats are.

Keywords: Progressive Web Applications (PWA) · Pervasive Information System (PIS) · Multi-platform

1 Introduction

For a long time now, it has been possible to see the growth in the variety of devices that people use on a daily basis. That became even more intense with the implementation of the Internet of Things (IoT) concept, described as a model that allows different devices (the "things") to be connected as part of the internet [1]. These "things" are used to capture several kinds of data related to the environment they are in or to the human beings that use them. In 2018, around 23.14 billion devices were part of this IoT universe worldwide, and it is estimated that by 2025 this value might triple, reaching 75.44 billion [2]. Taking this as the starting point for this article, the enormous variety of user interfaces for which many applications must be prepared becomes clear.
With this and the paired technology evolution, context-awareness becomes necessary [3], along with the need to develop solutions towards their combination with other IoT solutions, which requires them to be flexible and modular but also to have a strong core architecture [4].
All of this meets the Pervasive Information Systems (PIS) concept, since it is based on "non-traditional computing devices that merge seamlessly into the physical environment" [5].
So, by now, it is understandable that thinking in cross-platform development might be the most efficient way of dealing with such variety. Therefore, intersecting mobile

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 334–343, 2020.
https://doi.org/10.1007/978-3-030-45697-9_33

apps and the web becomes a good way of achieving that [4]. This paper intends to discuss how mobile computing, intersected with software engineering, helps to materialize the PIS concept, in this case using PWAs, demystifying and clarifying the differences between the usual Desktop Information Systems (DIS) and Pervasive Information Systems (PIS). Further on, the article includes a SWOT analysis of PWA towards PIS.
The present paper is subdivided into five main sections, starting with the present Introduction, which provides a few insights on the field of study along with an overview of what is expected from this paper. Next comes the Background section, which presents the concepts needed to understand the theme discussed here (Information Systems and Pervasive Computing), followed by a section that crosses these two concepts. Then comes a section entirely dedicated to explaining Progressive Web Applications (the idea, and why to use this approach based on its characteristics). At the end of the paper, a set of conclusions on the studied theme is presented (in the Conclusions section).

2 Background

Before getting deeper into this theme, there are some basic concepts that must be fully understood, such as Information Systems, Pervasive Computing, and their merged result, Pervasive Information Systems. The Information Systems concept can be easily understood if thought of as the result of four components: Input, Process, Output, and Feedback, as shown in Fig. 1.

Fig. 1. Information systems (adapted from [6]).

In a nutshell, an Information System can be seen as a flow in which raw data is
collected (gathered and captured) by the input component, submitted to some kind of
manipulation, i.e. conversions or transformations (the process component), and then
stored and disseminated (the output component), with the aim of triggering
corrective reactions towards a predefined goal (the feedback mechanism) [6].
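The four-component flow above can be sketched as a small program. This is only an illustration of the model cited from [6]; the function name and the particular transformation are invented for the example.

```javascript
// Illustrative sketch of the Input -> Process -> Output -> Feedback flow
// described above; all names and operations are hypothetical, not from [6].
function runInformationSystem(rawData, goal) {
  const input = rawData;                               // gather and capture raw data
  const processed = input.map((x) => x * 2);           // some conversion/transformation
  const output = processed.reduce((a, b) => a + b, 0); // store and disseminate a result
  const feedback = output >= goal ? "goal met" : "adjust input"; // corrective reaction
  return { output, feedback };
}

const result = runInformationSystem([1, 2, 3], 10);
console.log(result); // { output: 12, feedback: 'goal met' }
```

The feedback value closes the loop: it tells the organization whether the next round of input or processing needs adjusting towards the predefined goal.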
According to Kurkovsky [7], Pervasive Computing can be defined as the idea that
“computing equipment will grow smaller and gain more power; this would allow small
devices to be ubiquitously and invisibly embedded in the everyday human
surroundings and therefore provide easy and omnipresent access to a computing
environment”. Understanding this ideal makes it easy to recognize the main challenges
336 G. Fernandes et al.

associated with this research field: minimizing the impact that these systems might
have on the user’s perception, and making a system invisibly embedded in the environment.
Regarding Pervasive Information Systems, it is crucial to define the concept clearly
before proceeding. According to Kourouthanassis, P. and Giaglis, G. [8], a pervasive
information system can be defined as “Interconnected technological artefacts diffused
in their surrounding environment, which work together to sense, process, store, and
communicate information to ubiquitously and unobtrusively support their users’
objectives and tasks in a context-aware manner”.
In other words, a PIS is a highly embedded system, to the point that users might not
notice they are using it; it has a constant presence through different types of devices
(which have no formal obligation to participate in a concrete network, requiring the
system to support spontaneous networks); and it receives stimuli from the
environment, not necessarily from the user (context-awareness) [5, 9].
Table 1 clarifies some of the differences between Information Systems and Pervasive
Information Systems. This way, it becomes easier to understand the additional
complexity of PIS when compared to IS, and the reason for some of the problems and
questions it raises.

Table 1. Information Systems vs. Pervasive Information Systems.


Dimension | Information Systems | Pervasive Information Systems
User profile | Known, trained, office clerk | Unknown, untrained, citizen
Devices | Personal computer | Multiple artifacts
Context | Perceive user input | Perceive context information
Services | Stationary services | Support mobility
User task | Specified | Generic
Interaction medium | Localised, homogeneous, point-and-click interaction | Nomadic, heterogeneous, multimodal interaction
Interaction space | Cybernetic, fixed | Physical, mobile
Time of use | Reactive | Proactive

3 PIS vs DIS

As mentioned before, the PIS concept appears as the conciliation of Pervasive
Computing and Information Systems. This makes clear that PIS has a set of
characteristics that makes it quite different from the usual Desktop Information
Systems (DIS). These are systems that must include devices other than the usual
desktop (and inputs other than the common mouse and keyboard), and in which
machine events might be triggered according to the perception that the machine has
of the environment in which it is inserted, instead of being triggered by humans [5].
Some dimensions can be defined by identifying the differences between PIS and
DIS [5, 8] (Table 2):

Table 2. Dimensions comparison.


User (the type of people who will use the system). DIS: people whose job is to work with the system; they use it regularly, are known, and are trained to work with it. PIS: common citizens, who might use the system only once, are complete strangers to it, and have no training to handle it.
Task (the type of tasks the system has). DIS: generic tasks, focused on utility and productivity. PIS: specific tasks, focused on results delivery and user experience.
Medium (the type of medium the system uses). DIS: constant localization, homogeneous components, and a “point and click” paradigm (based on user-specific requests). PIS: “everywhere”, continuous presence through different (heterogeneous) devices, a multimodal paradigm, and fluid interaction.
Space (“where” the system is held). DIS: interactions happen only in a cybernetic context. PIS: brings the physical world into the system.
Product (the type of system outputs). DIS: the output is virtual. PIS: the output is tangible and virtual.
Time (“when” the system plays its role). DIS: reactive, executes something when asked to do so. PIS: proactive, perceives events in its environment and acts accordingly.

Considering all these differences, it is understandable that the modelling of this kind
of system must be adjusted accordingly. The main difficulty in this matter is related
to the context-awareness property underlying the PIS concept.
In [10], some characteristics of context information are introduced as crucial to
handle when designing the system requirements: imperfection (the information is
affected by a large number of variables, which can introduce errors into it); temporal
characteristics (the information has an intimate relationship with a timeline);
alternative representations of similar information, due to the variety of components
(devices); and a high degree of interrelation, in other words, many relationships can
be established between people, devices, and communication channels. All these
context characteristics must be addressed carefully while modelling a PIS, with all
its users, devices, channels, associations, and dependencies, increasing the modelling
complexity when compared to a DIS.
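Under these characteristics, a single piece of context information might be modelled as a record like the one below. This is our own illustrative sketch of the four characteristics listed from [10]; none of the field names come from that work.

```javascript
// Hypothetical context-information record capturing the four characteristics
// discussed above: imperfection (a confidence value), temporal scope
// (a timestamp), alternative representations, and interrelations.
function makeContextRecord(value, source, confidence, representations, relatedTo) {
  return {
    value,                  // the sensed information itself
    source,                 // which device or channel produced it
    confidence,             // 0..1, models the imperfection of sensed data
    observedAt: Date.now(), // temporal characteristic: when it was sensed
    representations,        // alternative encodings for different devices
    relatedTo,              // links to people, devices, and channels
  };
}

const reading = makeContextRecord(
  21.5, "sensor-42", 0.9,
  { celsius: 21.5, fahrenheit: 70.7 },
  ["user-7", "room-3"]
);
console.log(reading.confidence); // below 1: the reading is imperfect
```

A PIS model would then have to reason over many such interrelated records, which is precisely what makes it more complex to model than a DIS.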

4 Progressive Web Applications

Now that what a pervasive information system consists of has been clarified, it is
time to understand how it can be concretized, that is, how to turn the concept into an
actual usable system. To answer this question, a new way of combining the web
world with the applications world in the best possible way appeared: Progressive
Web Applications (PWA) [11].

Looking at the fragments of this approach’s name makes it easier to understand its
goals and coverage. By being progressive, it is safe to assume that these applications
evolve over time according to their usage; the web part indicates that they are built
using web models; and the app part indicates that they count on the typical app
features [11].
So, PWAs are optimized, reliable web apps accessible on the web, counting on its
best parts, such as wide reach, instant access and updates, and easy shareability. On
the other hand, they also include offline storage and access to native features, making
them indistinguishable from native apps most of the time [12].
In a nutshell, the main goal of PWAs is to provide users a similar experience
whether they use the web app through the browser or as a mobile application [13].
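As an illustration of the “web models” on which PWAs are built, a PWA is usually described by a web app manifest. The member names below come from the W3C Web App Manifest specification; the values are placeholders, not taken from any application discussed in this paper.

```json
{
  "name": "Example PWA",
  "short_name": "Example",
  "start_url": "/",
  "display": "standalone",
  "background_color": "#ffffff",
  "theme_color": "#0066cc",
  "icons": [
    { "src": "/icon-192.png", "sizes": "192x192", "type": "image/png" }
  ]
}
```

The "display": "standalone" member is what lets the PWA open without browser UI, supporting the app-like experience described above.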

4.1 Why Use Progressive Web Apps?


With the PWA concept explained, it is important to understand why it is profitable to
use it compared with other approaches, and what its main characteristics are. Before
exploring these topics, it must be realized that this approach, like any other, has its
ups and downs, and its usage must be considered according to the reality in which it
will be inserted. Therefore, another question must be asked: “When should PWAs be
used?”. They should be considered when a website is being delivered with a high
expectancy of being used on a phone or tablet on the move, with the possibility of
losing network connectivity. Their implementation must be cautiously thought
through when the application is expected to be installed on an iOS system, due to the
limited support and access [13]. The fact that PWAs provide their users a seamless
app experience no matter which device they use is the main reason why they are
becoming a trend so quickly, and why it is predicted that by 2020 roughly 50% of the
available applications will be PWAs [12, 14].
Characteristics. To better understand what is involved when talking about PWAs,
this subsection presents a set of characteristics [15]:
• Progressive: this core feature guarantees that the application will work no matter
where the user is located and which platform they are using;
• Responsive: PWAs are designed to fit any kind of device (phones, tablets,
desktops, and so on);
• Application feel: due to its design, type of interactions, and navigation, it gives
users the feeling that they are using an application;
• Independent of connectivity: due to the existence of service workers, the
application can work even without a network connection (after the first use);
• Installation: they can be kept on the home screen of any device without needing to
be downloaded from an application store;
• Discoverable: they are recognized as applications, and the service worker, along
with the W3C manifest, allows them to be easily found;
• Engageable: the application lives on users’ home screens and can handle native
functions, such as push notifications;
• Safe: they are served via HTTPS, ensuring the need for authentication to access
them;
• Fresh: always the latest version, due to the service worker update process.
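The service-worker behaviour behind the “independent of connectivity” and “fresh” points above can be sketched as a cache-first fetch strategy. Real service workers use browser-only APIs (fetch event listeners and the Cache Storage interface), so the sketch below is a simplified, Node-runnable simulation that uses a plain Map as the cache and a stubbed function as the network; all names are illustrative, not from [15].

```javascript
// Simplified sketch of the cache-first strategy a service worker typically
// implements: answer from the cache when possible, otherwise fetch from the
// network and store the response so it is available offline later.
function makeCacheFirstFetch(network) {
  const cache = new Map(); // stand-in for the browser Cache Storage API
  return function handleFetch(url) {
    if (cache.has(url)) {
      return { from: "cache", body: cache.get(url) }; // works offline
    }
    const body = network(url); // may throw when there is no connectivity
    cache.set(url, body);      // store for later offline access
    return { from: "network", body };
  };
}

// Stub network: succeeds while "online", then the connection is lost.
let online = true;
const fetchUrl = makeCacheFirstFetch((url) => {
  if (!online) throw new Error("offline");
  return `contents of ${url}`;
});

console.log(fetchUrl("/index.html").from); // "network" (first visit, online)
online = false;
console.log(fetchUrl("/index.html").from); // "cache" (still works offline)
```

A real service worker would additionally revalidate cached entries against the network when connectivity returns, which is what keeps the application “fresh”.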

Comparison. Having presented the PWA concept along with its main characteristics
and benefits, it is reasonable to compare it with native applications and standard web
applications, since it aims to merge them towards a better solution leveraging
cross-platform technology.
Before going deeper on this topic, it is important to clarify what a cross-platform
app is: in simple terms, an app whose main goal is to support multiple platforms
using just one code base. This can be divided into two paradigms: runtime
environments (which imply providing a native app for each supported platform) and
generative approaches (which imply generating the apps from a unique code base) [16].
Hereupon, Table 3 compares the different types of applications over a set of seven
relevant parameters: installation, updates, size, offline access, user experience, push
notifications, and discoverability.

Table 3. Comparison of the different types of applications (adapted from [15]).


Installation. Native app: download from the App Store or Play Store. PWA: click a button to add to the home screen. Standard web app: not required.
Updates. Native app: need to submit to the store and then be downloaded by the user. PWA: instant.
Size. Native app: mostly heavy; can take time to download to users’ devices. PWA: small and fast.
Offline access. Native app: available. PWA: needs to be used once online; from then on it can use cached information. Standard web app: not available.
User experience. Native app: excellent if well designed. PWA and standard web app: confusing due to double menus (app menu and browser menu).
Push notifications. Native app: yes. PWA: yes. Standard web app: yes, if using third-party services.
Discoverability. Native app: not good; requires hard work on the App Store and/or Play Store. PWA: good; to appear in search results it needs to be optimized for SEO (Search Engine Optimization). Standard web app: not required.

4.2 Status Quo


Hereupon, it may be relevant to take stock of how PWAs are being accepted by
reputed companies. To do so, this section presents a status quo of PWA adoption,
based on a set of reputed companies.

Back in 2017, at Google’s developer conference, Google I/O, PWAs were introduced
and discussed in areas ranging from user experience and technical frameworks to
performance testing and migration. They are being pushed as the new era of user
experience in both the mobile and web worlds. Accordingly, PWAs started being
implemented by several reputed companies, such as Forbes, Financial Times, Lyft,
Expedia, AliExpress, Tinder, Flipkart, Housing.com, Twitter, and OLA. As it is
possible to infer, this approach has extensive coverage across very different niches
and markets, denoting that PWAs can fit no matter the context [16].
As an example of why such companies have been satisfied with the adoption of this
approach, and underlining the stated benefits of PWA adoption, two particular cases
are considered here: Twitter (a social network used worldwide) and OLA (India’s
largest ride-hailing app). Both applications have seen their size drastically reduced
when compared with their native versions (Table 4).

Table 4. Size applications comparison.


Application | Android | iOS | PWA
Twitter | 23 MiB | 100 MiB | 0.6 MiB
OLA | 60 MiB | 100 MiB | 0.2 MiB

Regarding OLA, another aspect deeply affected by PWA adoption relates to the fact
that PWAs do not need an internet connection to work properly. This is important to
OLA because they operate over three distinguishable areas in terms of internet
connection (the first tier is considered the one with the best connection, and the third
is associated with low connectivity). After implementing PWAs, they saw 68% usage
growth in the third tier, and the conversion rate in this tier increased by 30%. In the
other tiers, the values were similar to the native applications [16].

4.3 SWOT Analysis


Thus, there is a set of advantages associated with this approach that makes it so
appealing. To clarify them, a SWOT analysis was carried out, based on some of the
advantages and disadvantages presented in [11–13, 17], and on some other
characteristics identified.
Strengths
• Quick Installs: installing this sort of application is simplified to accepting a
prompt displayed when opening the app in the browser;
• Freshest Version: there is no need for update requests; the update happens
instantly whenever the user accesses the site;
• Progressive: no matter which device users are using, they will see the best version
of the application that their device can handle;
• Leaner App Size: compared to native applications they are much smaller, using
less disk space and, therefore, being more efficient;
• Better and Faster Performance: this type of application is optimized for speed
and content delivery on demand, following web orientation. Furthermore, the
client-side caching system, which allows offline interaction, puts PWAs ahead of
both native applications and web applications;
• Increased Conversion Rates: this means increasing numbers of subscriptions,
bookings, engagement, retention rate, loyalty rate, and so on. PWAs improve the
user experience (faster page loads, easier installs, instant updates, smaller
websites), and that is highly converted into profit.
Weaknesses
• iOS Implementation: the main weakness associated with this type of application
is that access to native functions on iOS is quite limited;
• Battery Usage: since PWAs are based on high-level programming languages,
CPU usage increases, which leads to higher battery usage;
• Mobile Technical Requirements: specific technical requirements, like fingerprint
authentication, are available only on native platforms (not on the web).
Opportunities
• Everything is Discoverable, Sharable, Linkable, and Rankable: users can
easily share links with each other, helping the user base grow and making things
more convenient for users;
• App Stores are Optional: since the download is made directly from the browser,
all the bureaucracy associated with uploading the application to an app store, and
updating it there, is avoided;
• Reduced Development Costs: both resources and time are reduced, because just
one application is developed and demands subsequent maintenance, instead of
four (web, Android, iOS, and Windows);
• Better User Adoption: PWAs are much easier to install and give the user the
possibility of testing the application (in the browser) before installing it.
Threats
• New Approaches: the appearance of new approaches that might solve today’s
PWA problems, such as the limitations related to iOS and some technical
requirements;
• Legacy Systems: some older systems and browsers, less used nowadays, do not
deal as well as they should with this new approach, which can lead to the
abandonment of the approach by some users;
• Accessibility: once again, features that JavaScript cannot access reduce the
possibilities of connecting with devices, such as sensors, which are widely used in
the emerging Internet of Things world.

5 Conclusions

This paper served the purpose of contextualizing the problem at hand, the
concretization of Pervasive Information Systems along with its problems, and of
presenting a recent approach, Progressive Web Applications, that aims to solve, at
least in some cases, those same problems.
To do so, a substantiated explanation of what Pervasive Information Systems are
was provided, making clear their coverage, complexity, and imminence in our lives.
With that, the problems involved in trying to implement such a system were exposed.
Therefore, towards the concretization of the Pervasive Information System, it is
possible to see, through this paper, that Progressive Web Apps come to address its
questions, at least regarding the most imminent problem associated with this matter:
experience uniformization across the different devices used by common users. This
means that the PWA approach is not specialized to solve every sort of problem that
comes along with PIS, but it does boost the possibilities around user experience
regardless of the users’ environment conditions.
In a nutshell, this article aims to be an easy explanation of how PWAs appear as a
concretization of the premises of a PIS, after explaining the concepts necessary to
fully understand the real need at hand, along with a proper solution.
By the end of this paper, it is expected that the goal has been met of alerting the
community to a new way of development that accelerates it, while solving different
problems concerning PIS implementation by developing just one application.
Moreover, the main contribution associated with this article is the SWOT analysis of
PWAs, which can be seen as a starting point for whoever wants to develop different
and transverse solutions.
In a future article, it is expected to take this matter to another level, giving it a
stronger background, allowing this question to be explored more deeply, along with
an easy-to-follow guide towards a simple PWA development. In other words, an
extended version of the present paper is going to be produced with the goal of
providing some guidelines to develop a PWA.

Acknowledgements. The work has been supported by FCT – Fundação para a Ciência e a
Tecnologia within the Project Scope: UID/CEC/00319/2019.

References
1. Simmhan, Y., Perera, S.: Big data analytics platforms for real-time applications in IoT. In:
Big Data Analytics: Methods and Applications, India. Springer, Heidelberg (2016)
2. Columbus, L.: https://www.forbes.com/sites/louiscolumbus/2016/11/27/roundup-of-internet-
of-things-forecasts-and-market-estimates-2016/#130f4040292d, 27 November 2016
3. Majchrzak, T.A., Schulte, M.: Context-dependent testing of applications for mobile devices.
J. Web Technol. 2(1), 27–39 (2015)

4. Gronli, T.-M., Biorn-Hansen, A., Majchrzak, T.A.: Software development for mobile
computing, the internet of things and wearable devices: inspecting the past to understand the
future. In: Proceedings of the 52nd HICSS Hawaii (2019)
5. Kourouthanassis, P.E.: A Design Theory for Pervasive Information Systems, pp. 1–2 (2006)
6. Stair, R., Reynolds, G.: Principles of Information Systems. Cengage Learning, Boston
(2014)
7. Kurkovsky, S.A.: Pervasive computing: past, present and future, January 2008
8. Kourouthanassis, P.E., Giaglis, G.M.: Pervasive Information Systems. M.E. Sharpe, New
York (2008)
9. Fernandes, J.E., Machado, R.J., Carvalho, J.Á.: Model-Driven Methodologies for Pervasive
Information Systems Development, 9 May 2004
10. Henricksen, K., Indulska, J., Rakotonirainy, A.: Modeling Context Information in Pervasive
Computing Systems, 21 August 2002
11. Upplication: Progressive Web Applications (2018). www.upplication.com
12. Ionic. The Architect’s Guide to PWAs. Whitepaper (2018)
13. Walker, H.: 10 reasons why you should consider Progressive Web apps (2018)
14. Wong, J.: Gartner Blog Network, Gartner, 24 March 2017. https://blogs.gartner.com/jason-
wong/pwas-will-impact-your-mobile-app-strategy/. Accessed 4 Nov 2019
15. Tandel, S.S., Jamadar, A.: Impact of progressive web apps on web app development.
IJIRSET 7(9), 9439–9444 (2018)
16. Gronli, T.-M., Biorn-Hansen, A., Majchrzak, T.A.: Progressive web apps: the definite
approach to cross-platform development? In: Proceedings of the 51st Hawaii International
Conference on System Sciences (2018)
17. Warcholinski, M.: What Are the Advantages and Disadvantages of Progressive Web Apps?
Brainhub. https://brainhub.eu/blog/advantages-disadvantages-progressive-web-apps/.
Accessed 17 Nov 2019
Inclusive Education through ICT
Young People Participation in the Digital
Society: A Case Study in Brazil

Everton Knihs1 and Alicia García-Holgado2(&)


1 Universidade Presbiteriana Mackenzie, São Paulo, Brazil
[email protected]
2 GRIAL Research Group, Computer Sciences Department,
Research Institute for Educational Sciences, University of Salamanca,
Salamanca, Spain
[email protected]

Abstract. Young people are key drivers of new behaviors and understandings.
Their participation in society allows the integration of their ideas and
constructive analysis to foster policies and innovative solutions in which technology
is an intrinsic element. Citizen science can be used to give voice to children
and young people through the development of citizenship and an approach to
scientific debate. In the European context, the WYRED project has developed a
methodological framework to support the participation of young people in the
Digital Society through social dialogues and the support of a technological
ecosystem to allow internationalization. This work aims to lay the groundwork
to transfer the WYRED framework to the Brazilian context through a case study
conducted in the Universidade Presbiteriana Mackenzie. The study has allowed
the identification of the key topics that concern Brazilian young people in relation
to desired social change: tolerance of different cultures/opinions; mental wellbeing;
necessary changes in education (e.g. future-oriented education); self-image and
self-confidence; and Internet safety & privacy.

Keywords: Digital Society · Young people · Digital participation · Citizenship · Citizen science

1 Introduction

The participation of young people in society through digital resources is increasingly
based on intense interaction and discussion of knowledge about the future, a
reflection of the interests that young people define as priorities for life in society. At
a time when communication focuses on constructive debate, enhanced by concern
with cultural and scientific development and change, there is a need to give young
people a voice; listening to these young people becomes an important dimension for
citizen science and for a critical analysis of the present, with current human actions
influencing everyone’s future life.
Communication can be considered one of the central factors in the construction of
citizenship [1]. Nowadays, communication through digital networks redesigns
sociability between individuals, with interaction and
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 347–356, 2020.
https://doi.org/10.1007/978-3-030-45697-9_34
348 E. Knihs and A. García-Holgado

continuous reinvention of the established mode of organization in society [2]. The
participation of young people in society, in order to integrate their ideas and
constructive analysis, can be an innovative way of thinking together about the future,
placing these young people as protagonists in the construction of the collective future.
Citizenship can be exercised in many ways in modern society, reconfigured by
values indicated for the construction of social behavior, with self-reflection as an
excellent reference [3].
Social transformation, through youth movements, can lead to significant changes
that alter activities, the way of viewing what is structured, and the important social
issues, especially because society is increasingly technological and connected.
Social projects and debates emerge as a way of realizing ideas, and they help in the
growing process of reflection on the important actions of young people who care
about society. The development of citizenship and the approach to scientific debate
are strong characteristics of citizen science, which contributes to knowledge and
environmental concern through the involvement of citizens [4].
It is important to encourage science and open innovation in young people to create
cycles of transfer [5–7] and co-creation of knowledge [8, 9] between research-oriented
institutions and the productive fabric [10].
In the European context, the WYRED project has developed a methodological
framework to support the participation of young people in the Digital Society through
social dialogues focused on the development of research projects based on the
concept of citizen science [11–13]. The project was coordinated by the GRIAL Research
Group of the University of Salamanca [14, 15] and had the participation of partners from
Austria, Belgium, Israel, Italy, Turkey, and the United Kingdom. Although the project
finished in October 2019, the team continues applying and improving the framework
through the WYRED Association (https://association.wyredproject.eu) (Table 1).

Table 1. WYRED project details


Title netWorked Youth Research for Empowerment in the Digital Society
Acronym WYRED
Funding entity European Union
Call Horizon 2020. Europe in a changing world – inclusive, innovative and
reflective societies (HORIZON 2020: REV-INEQUAL-10-2016:
Multi-stakeholder platform for enhancing youth digital opportunities)
Reference 727066
Project leader Francisco José García-Peñalvo
Coordinator University of Salamanca (Spain)
Partners Oxfam Italia (Italy)
PYE Global (United Kingdom)
Asist Ogretim Kurumlari A.S. - Doga Schools (Turkey)
Early Years (United Kingdom)
Youth for exchange and understanding international (Belgium)
MOVES - Zentrum für Gender und Diversität (Austria)
Boundaries Observatory CIC (United Kingdom)
Tel Aviv University (Israel)
Budget 993.662,50€
Start date 01/11/2016
End date 31/10/2019
Web https://wyredproject.eu

The nine partners have guided over 1500 children and young people between 7 and
30 years old over three years to ask questions and carry out research about themes
and ideas that affect and shape their interactive, performative and communicative
worlds. The framework is composed of the methodology and the WYRED Ecosystem
[16–18], a technological ecosystem to facilitate not only the interaction between
young people from the participant countries but also the interaction of children and
young people with stakeholders and decision-makers. The ecosystem works as a
catalyst to give voice to children and young people, so their ideas and projects can
have an impact on the decision-making processes related to the Digital Society.
The WYRED project was applied only in the participant countries, although it is
possible to apply the framework in other countries, because the methodology and the
main software components of the WYRED Ecosystem are available in several
languages (English, Spanish, Hebrew, Italian, and Turkish).
In this context, this work aims to lay the groundwork to transfer the WYRED
framework to the Brazilian context. The first phase of the WYRED methodology is a
set of social dialogues among children and young people. It is necessary to identify the
key themes that concern children and young people in relation to the desired social
change in order to initiate the dialogue process. For this reason, the case study
described in this work is focused on identifying the key themes for Brazilian youth
and on conducting a set of dialogues, both face-to-face and through the WYRED Ecosystem.
The case study was conducted in the Universidade Presbiteriana Mackenzie (Sao
Paulo, Brazil) in the bachelor’s degree in Information Systems and Computer Science
courses as an activity inside the discipline of Science, Technology, and Society in
Mathematics and Computing. Four different groups were involved, with a total of 95
students between 18 and 33 years old.
The rest of the paper is organized as follows. Section 2 describes the methodology
used to conduct the study. Section 3 presents the case study. Section 4 describes the
main results of the survey. Finally, Sect. 5 discusses the results and summarizes the
main conclusions of this work.

2 Methodology

The methodology used in this work began with a process of reflection on the
importance that young people may have regarding certain subjects, and on what
concerns them about important themes in the Digital Society. Reading, analysis, and
discussion among the authors on the subject led to research of

the theoretical foundation and a literature review, together with the search for a
database of information and data that made it possible to contextualize the study.
The WYRED methodology is composed of four phases: preparation, dialogue,
research, and evaluation. The case study is focused on the first two phases. First, in
the preparation phase, stakeholders, children, and young people express their views
on the most important issues that concern young people in the Digital Society. Then,
in the second phase, children and young people engage with each other in dialogues
to define the exact questions they would like to focus on [19]. The dialogues can be
conducted face-to-face (local dialogues), fully online (international dialogues), or in
a blended version in which the dialogue starts in person and continues online to
interact with people from other places.
Both phases were adapted to be applied in a Brazilian higher education institution
as a participatory activity, in order to obtain the necessary data to form the sample
set and establish a clear picture of the current situation regarding the importance of
youth participation and their opinion.
First, some key topics were drawn from a survey previously applied in Spain, Italy,
Belgium, Austria, the United Kingdom, Israel, and Turkey. Then, it was proposed to
reflect on the main opinions and themes chosen by the young people, consolidating
the importance of each theme. Thus, the steps developed in this study were:
• Application of a survey based on the instrument applied in the WYRED project to
identify the key topics for young people in the Digital Society. The WYRED project
carried out a Delphi study to identify and prioritize key areas of interest for young
persons. According to the Delphi technique, the survey involves two rounds to
achieve a greater consensus on a topic. In this case study, the survey was used as an
exploratory instrument to get an overview of the key topics in the Brazilian context.
• Each class was divided into groups according to the selected key topics.
• Identification of the main topics chosen by young people.
• Preparation of the WYRED Platform, one of the software components of the
WYRED Ecosystem, to support the dialogues among the students’ groups focused
on the same topic and in different classes. The WYRED Platform is based on
communities, so one community was created for each selected topic.
• Study of the chosen topic and participation in the community previously indicated, for integration, expression of opinions, and interaction among young people who share the same thematic concern, through the forums provided in the communities inside the WYRED Platform.

3 Case Study

The case study was based on young people’s perception of the main topics of interest in
the Digital Society. The goal is to give young people a voice so that their opinions are
taken into account when making technology-related decisions.
Young People Participation in the Digital Society 351

3.1 Approach to the Digital Society in the Discipline of Science, Technology, and Society in Mathematics and Computing
The subject Science, Technology, and Society in Mathematics and Computing is offered in the first semester of the undergraduate course and is a compulsory part of the pedagogical project of the Information Systems and Computer Science courses at a private university located in the city of São Paulo, Brazil. This activity was carried out
in four classes, two Information Systems and two Computer Science classes, which
were integrated into the discussions of the topics chosen for this case study using the
WYRED Platform.
This course consists of theoretical classes with two face-to-face hours per week in
total. Theoretical classes provide the basis for group discussions or presentations of
individual assignments. The subject of this course covers the following topics: science,
history and philosophical conception, science and the modern world, history of tech-
nological development, contemporary aspects in technology, the concept of society,
culture and social aspects, society and technological development, the social dimension
and human and technological aspects, technology and work.
The discipline of Science, Technology, and Society in Mathematics and Computing
is focused on the study of the development of science and technology, their interfaces
with society, and their reciprocal influences on mathematics and computing. In particular, it covers the epistemological foundations of science and technology; the reflection on non-neutrality in science; the analysis of scientific facts as conditioned by their social context of origin and development; and the study of how scientific discoveries and their technological applications interrelate with the human social dimension and with the man-machine relationship.
The student must be able to present the fundamental structures and concepts of science, technology, and society; understand the aspects related to scientific discoveries and technological development; and know the history of science and the cultural, technological, and social aspects relevant to information technology professionals, together with aspects of technological development.
Evaluation in this discipline is based on individual and group written work, with a formal presentation.

3.2 Key Topics for Young People


A quantitative survey based on the instrument defined in the Delphi study of the WYRED project was adapted to the Brazilian context: it was not only translated, but some questions were also modified. The purpose of this survey is to quantify which topics young people consider most important regarding the Digital Society. The first survey
applied in the WYRED Delphi asked about 12 items and left an open field for young
people to indicate the topics they considered important. In the second survey that was
applied, a list was created with 15 items based on those identified during the first round;
these are the items used in the instrument adapted to the Brazilian context. The survey
consists of nine questions, as presented in Table 2, which were applied in Portuguese
using Google Forms. To distinguish the groups, each class was identified with an ID: 1G, 1J, 1N, 1X.

Table 2. Survey based on [20, 21].


Q1. What are the issues related to young people that you consider most important and
you think that our project should deal with? (Likert scale: 1-Not important, 2-Slightly
important, 3-Moderately important, 4-Important, 5-Very important)
1. Self-image, self-confidence
2. Tolerance to different cultures/opinions
3. Necessary changes in education
4. Causes of stress among young people
5. Employment prospects
6. Cyber-bullying, shaming
7. Internet safety & privacy
8. Gender stereotypes/discrimination
9. Integration of migrants/refugees in schools and in the society
10. Adult misunderstandings of young people
11. Reliability of information on the Internet and social media
12. Roles of parents, friends and peer groups
13. Environmental problems (e.g. pollution)
14. Crime
15. Mental wellbeing
Q2. Gender: Female; Male; Not mentioned above; No answer
Q3. Which is your year of birth?
Q4. What country were you born in?
Q5. Color skin/race: White; Black; Pardo; Asian; Indigenous
Q6. Who is the person who brings the most income to the family?
(You; Your father; Your mother; Other relative; Other person)
Q7. What is the highest level of education attained by the person who contributes the
most income to the family unit?
No studies
Elementary School I
Elementary School II
High school
Technologist, Degree, Bachelor’s Degree
Postgraduate (Specialization, Master, Doctorate)
Don’t know/Don’t answer
Q8. What is the employment status of the person with the highest income in the
household?
(Employed person; Self-employed person; Unemployed)
Q9. In which of the following groups does the person who contributes the most income to
the family unit work or worked?
Managers and directors
Specialists in intellectual and scientific activities
Technicians and mid-level professions
Administrative employees
Service and sales personnel
Skilled agricultural, forestry and fishing workers
Workers, craftsmen and similar workers
Plant and machinery operators and assemblers
Unskilled workers/Elementary occupations
Military professions
Other

3.3 Participants
Four different groups were involved, with a total of 95 students between 18 and 33
years old, 13 women (13.68%), 78 men (82.11%), and 4 others (4.21%). The survey
was answered by 88 students: 12 women (13.64%), 73 men (82.95%), 2 who indicated a non-binary gender (2.27%), and 1 who preferred not to answer (1.14%). Regarding the
skin color/race of the sample, 66 are white (75%), 14 are pardo (15.91%), 7 are Asian
(7.95%), and 1 is black (1.14%).
Through the answers obtained in the survey (Table 2), it was possible to divide the
students according to their work interests. The second phase was conducted with the whole population. The students established dialogues on the selected topics through a set of
communities in the WYRED Platform and during a face-to-face session in each class.

4 Results

As a first step in the analysis process, the descriptive statistics of the answers of the
students were calculated (Table 3). Furthermore, the results were calculated per student
group.

Table 3. Results of the descriptive analysis


N Min Max Mean sx
Self-image, self-confidence 88 1 5 4.28 0.83
Tolerance to different cultures/opinions 86 1 5 4.55 0.762
Necessary changes in education 87 3 5 4.43 0.676
Causes of stress among young people 87 1 5 3.91 1.052
Employment prospects 87 2 5 4.1 0.836
Cyber-bullying, shaming 87 1 5 3.92 1.154
Internet safety & privacy 88 1 5 4.15 1
Gender stereotypes/discrimination 87 1 5 4.11 1.104
Integration of migrants/refugees in schools and in the society 87 1 5 3.74 1.253
Adult misunderstandings of young people 86 1 5 3.14 1.031
Reliability of information on the Internet and social media 87 1 5 4.08 1.059
Roles of parents, friends and peer groups 88 1 5 3.98 1.039
Environmental problems (e.g. pollution) 87 2 5 4.08 1.037
Crime 87 1 5 4.09 0.923
Mental wellbeing 87 1 5 4.66 0.679
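The descriptive statistics reported in Table 3 (N, min, max, mean, sample standard deviation sx) can be reproduced with Python's standard `statistics` module. The answers below are hypothetical Likert responses used only to illustrate the computation; they are not the actual WYRED data.

```python
import statistics

# Hypothetical Likert-scale answers (1-5) for one survey item,
# for illustration only; the real responses are not reproduced here.
answers = [5, 4, 5, 3, 4, 5, 4, 2, 5, 4]

n = len(answers)
mean = statistics.mean(answers)
sd = statistics.stdev(answers)  # sample standard deviation (the "sx" column)

print(f"N={n}  Min={min(answers)}  Max={max(answers)}  "
      f"Mean={mean:.2f}  sx={sd:.2f}")
```

Note that `statistics.stdev` computes the sample (n-1) standard deviation, which is the usual choice when, as here, the respondents are treated as a sample of a larger population.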

The results indicate the five topics of greatest interest, namely:
• Mental wellbeing.
• Tolerance to different cultures/opinions.
• Necessary changes in education.
• Self-image, self-confidence.
• Internet safety & privacy.

There are some differences among the most valued topics in the per-group results. In class 1G, "Crime" and "Environmental problems" appear among the five most valued topics. In class 1J, "Gender stereotypes/discrimination" appears instead of "Internet safety & privacy." In class 1N, "Reliability of information on the Internet and social media" replaces "Self-image, self-confidence." Finally, in class 1X, "Roles of parents, friends and peer groups" emerges instead of "Internet safety & privacy."

5 Discussion and Conclusions

The need for youth participation in discussion and analysis promotes a contemporary format of transformative involvement in addressing key issues of the Digital Society. Including young people in the process of turning skills, discussion, and creative initiatives into knowledge and abilities can foster protagonism and drive positive outcomes for the future. This paper aims to promote youth protagonism, enable discussion and interaction on relevant aspects of the Digital Society, and show the importance of implementing and consolidating participatory and analytical activities directed at young people, increasing the weight of their opinions and of the debates in which they participate.
The case study transfers an experience based on the results of a project funded by the European Union to the Brazilian context. In particular, it adapted the WYRED methodological framework in order to identify the main topics that concern Brazilian young people with respect to desired social change.
Regarding the most important topics rated by the young people in Brazil, the results
are similar to those obtained in Europe (Austria, Belgium, Israel, Italy, Spain, Turkey,
United Kingdom). In the European survey, 355 children and young people answered the survey in full, although 632 respondents submitted answers to part of the questions, namely (in most cases) full answers to the topic ratings [21]. The European sample was
composed of young people between 15 and 30 years old, 48.7% women and 51.3%
men. The most valued topics in Europe were: necessary changes in education; tolerance
to different cultures and opinions; mental wellbeing; self-image, self-confidence; and
gender stereotypes/discrimination.
Even though the samples are different according to the size and gender balance, the
results show a high degree of similarity. Four of the five topics are the same in Brazil
and Europe. In Brazil, “Internet safety & privacy” was better rated, while in Europe,
there is a particular interest in “gender stereotypes/discrimination”.
It is also important to highlight that the topics were rated higher in Brazil than in Europe. However, this difference may be related to the fact that the survey in Brazil was applied in a single socio-economic and cultural context, whereas in Europe it was applied in heterogeneous contexts across different countries and regions.

Acknowledgments. With the support of the EU Horizon 2020 Programme in its “Europe in a
changing world – inclusive, innovative and reflective Societies (HORIZON 2020: REV-
INEQUAL-10-2016: Multi-stakeholder Platform for enhancing youth digital opportunities)”
Call. Project WYRED (netWorked Youth Research for Empowerment in the Digital society)

(Grant agreement No. 727066). The sole responsibility for the content of this webpage lies with
the authors. It does not necessarily reflect the opinion of the European Union. The European
Commission is not responsible for any use that may be made of the information contained
therein.

References
1. Kunsch, M.M.K., Kunsch, W.L.: Relações Públicas Comunitárias: A comunicação numa
perspectiva dialógica e transformadora. Summus, São Paulo, Brazil (2007)
2. Felice, M.: As formas digitais do social e os novos dinamismos da sociedade contemporânea.
Relações Públicas Comunitárias: A comunicação numa perspectiva dialógica e transfor-
madora. Summus, São Paulo, Brazil (2007)
3. Santos, M.E.V.M.: Cidadania, conhecimento, ciência e educação CTS. Rumo a “novas”
dimensões epistemológicas. Revista Iberoamericana de Ciencia, Tecnología y Sociedad 6,
137–157 (2005)
4. Mamede, S., Benites, M., Alho, C.J.R.: Ciência Cidadã e sua Contribuição na Proteção e
Conservação da Biodiversidade na Reserva da Biosfera do Pantanal. Revista Brasileira de Educação Ambiental (RevBEA) 12, 153–164 (2017)
5. Bueno Campos, E., Casani, F.: La tercera misión de la Universidad. Enfoques e indicadores
básicos para su evaluación. Econ. Ind. 366, 43–59 (2007)
6. García-Peñalvo, F.J.: La tercera misión. Educ. Knowl. Soc. 17, 7–18 (2016)
7. Vilalta, J.M.: La tercera misión universitaria. Innovación y transferencia de conocimientos
en las universidades españolas. Studia XXI. Fundación Europea Sociedad y Educación,
Madrid (2013)
8. García-Peñalvo, F.J., Conde, M.Á., Johnson, M., Alier, M.: Knowledge co-creation process
based on informal learning competences tagging and recognition. Int. J. Hum. Cap. Inf.
Technol. Prof. (IJHCITP) 4, 18–30 (2013)
9. Ramírez-Montoya, M.S., García-Peñalvo, F.J.: Co-creation and open innovation: systematic
literature review. Comunicar 26, 9–18 (2018)
10. Etzkowitz, H., Leydesdorff, L.: Universities and the Global Knowledge Economy. A Triple
Helix of University-Industry-Government Relations. Pinter, London (1997)
11. García-Peñalvo, F.J., Kearney, N.A.: Networked youth research for empowerment in digital
society: the WYRED project. In: García-Peñalvo, F.J. (ed.) Proceedings of the Fourth
International Conference on Technological Ecosystems for Enhancing Multiculturality
(TEEM 2016), Salamanca, Spain, 2–4 November 2016, pp. 3–9. ACM, New York (2016)
12. García-Peñalvo, F.J.: WYRED project. Educ. Knowl. Soc. 18, 7–14 (2017)
13. García-Peñalvo, F.J., García-Holgado, A.: WYRED, a platform to give young people the
voice on the influence of technology in today’s society. A citizen science approach. In:
Villalba-Condori, K.O., García-Peñalvo, F.J., Lavonen, J., Zapata-Ros, M. (eds.) Proceed-
ings of the II Congreso Internacional de Tendencias e Innovación Educativa – CITIE 2018,
Arequipa, Perú, 26–30 November 2018, pp. 128–141. CEUR-WS.org, Aachen (2019)
14. Grupo GRIAL: Producción Científica del Grupo GRIAL de 2011 a 2019. Grupo GRIAL,
Universidad de Salamanca (2019)
15. GRIAL Group: GRIAL Research Group Scientific Production Report (2011–2017). Version
2.0. GRIAL Research Group, University of Salamanca (2018)
16. Durán-Escudero, J., García-Peñalvo, F.J., Therón-Sánchez, R.: An architectural proposal to
explore the data of a private community through visual analytic. In: Dodero, J.M., Ibarra
Sáiz, M.S., Ruiz Rube, I. (eds.) Proceedings of the 5th International Conference on

Technological Ecosystems for Enhancing Multiculturality (TEEM 2017), Cádiz, Spain, 18–
20 October 2017, Article 48. ACM, New York (2017)
17. García-Peñalvo, F.J., Vázquez-Ingelmo, A., García-Holgado, A.: Study of the usability of
the WYRED Ecosystem using heuristic evaluation. In: Zaphiris, P., Ioannou, A. (eds.)
Proceedings of 6th International Conference on Learning and Collaboration Technologies.
Designing Learning Experiences, LCT 2019, Held as Part of the 21st HCI International
Conference, HCII 2019, Orlando, FL, USA, 26–31 July 2019, Part I, pp. 50–63. Springer,
Cham (2019)
18. García-Peñalvo, F.J., Vázquez-Ingelmo, A., García-Holgado, A., Seoane-Pardo, A.M.:
Analyzing the usability of the WYRED Platform with undergraduate students to improve its
features. Univers. Access Inf. Soc. 18(3), 455–468 (2019)
19. WYRED Consortium: WYRED Research Cycle Infographic. WYRED Consortium (2017)
20. Hauptman, A., Soffer, T.: WYRED Delphi Study. Results Report (2017)
21. Hauptman, A., Kearney, N.A., Raban, Y., Soffer, T.: WYRED Second Delphi Study Results
Report (2018)
Blockchain Technology to Support Smart
Learning and Inclusion: Pre-service Teachers
and Software Developers Viewpoints

Solomon Sunday Oyelere1(&), Umar Bin Qushem1,


Vladimir Costas Jauregui2, Özgür Yaşar Akyar3, Łukasz Tomczyk4,
Gloria Sanchez5, Darwin Munoz5, and Regina Motz6
1 University of Eastern Finland, Joensuu, Finland
{solomon.oyelere,umarbin}@uef.fi
2 Universidad Mayor de San Simón, Cochabamba, Bolivia
[email protected]
3 Hacettepe University, Ankara, Turkey
[email protected]
4 Pedagogical University of Cracow, Kraków, Poland
[email protected]
5 Universidad Federico Henríquez y Carvajal, Santo Domingo, Dominican Republic
{gsanchez,dmunoz}@ufhec.edu.do
6 Universidad de la República, Montevideo, Uruguay
[email protected]

Abstract. In support of an open ecosystem for lifelong and smart learning, this
study evaluates the perception of educational stakeholders such as pre-service
teachers and blockchain developers about the feasibility of the blockchain
technology in addressing the numerous gaps in the implementation of smart
learning environment. This research was designed within the international
project, Smart Ecosystem for Learning and Inclusion (SELI). A total of 491 pre-
service teachers and 3 blockchain developers from these countries participated
in the study. The study data was collected through a questionnaire and interviews. Descriptive statistics and content analysis were performed on the collected data. Results from this study indicate that blockchain technology in the educational field is rarely known and its frequency of use is quite low. The pre-service teachers surveyed are, for the most part, unaware of the degree of effectiveness of blockchain technology in education. The blockchain developers are of the opinion that blockchain is still new to many people and that resources for education-based applications are very rare; even where such resources exist, few are open-source.

Keywords: Smart learning environment · Open learning ecosystem · Inclusion

1 Introduction

The application of blockchain technology in education has been gaining popularity recently. Many institutions across the globe have started initiatives to develop blockchain-based solutions that will address pedagogical gaps [19]. Most contemporary blockchain
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 357–366, 2020.
https://doi.org/10.1007/978-3-030-45697-9_35
358 S. S. Oyelere et al.

solutions in education have focused on transcripts, badges, and records of achievement.
For example, Arenas & Fernandez [18] present a decentralized verification of academic
credentials based on blockchain. The blockchain stores compact data proofs of digital
educational credentials for easy verification. Similarly, Ocheja, Flanagan, & Ogata [19]
studied a blockchain based approach for connecting learning data across several
learning platforms, institutions, and organizations. In fact, the advancement of technology and the current capabilities of digital devices have opened up an alternative to traditional educational records and transcripts, one comprising additional security and more comprehensive information. However, some important blockchain solutions remain unimplemented, such as descriptions of the skills attained, competencies, soft skill sets, levels of skill mastery, access-rights management, and extracurricular activities that support the learner's development [13, 16, 19]. Furthermore, students do not have direct access to their educational history and transcripts, but have to rely on their institutions, which may not be the optimal solution for sharing records.
In support of an open ecosystem for lifelong learning, and smart learning, this study
evaluates the perception of educational stakeholders such as teachers, pre-service
teachers and blockchain developers about the feasibility of the blockchain system in
addressing the numerous gaps in the implementation of smart learning technology. The
overall aim of this research is to present the preliminary findings about the application
of Blockchain technology in the educational settings from the point of view of pre-
service teachers and technical developers.

2 Literature Review

Nowadays, some universities and institutes have applied blockchain technology in education, most of them using it to support academic degree management and summative evaluation of learning outcomes [1]. Many applications and developments are occurring in the technical industry through the integration of blockchain technology, aiming to strengthen the open ecosystem for learning by securing collaborative learning environments, protecting learning objects, identifying the necessary technologies and tools, enhancing students' interactions with educational activities, and providing pedagogical support for lifelong learning [2]. The blockchain is thus a public transaction log, or ledger, shared by all nodes in the network [16].

2.1 Technologies to Support Open Education: Existing Blockchain Solutions in Education
To begin with a full-scale system, one must certainly address the MIT Media Lab's blockchain education credentialing system 'Blockcerts', the only fully functional blockchain-based system supporting open education. Blockcerts began as an incubator project at the MIT Media Lab in 2015, operated in collaboration with Learning Machine, a Cambridge-based software company. Blockcerts is capable of delivering digital certificates of achievement and serves as middleware between issuing and retrieving certificates online, which
Blockchain Technology to Support Smart Learning and Inclusion 359

eliminates the risk of falsified certifications offered on the market by many unlicensed issuers [3]. To integrate Blockcerts, MIT used a blockchain wallet, which solves the problem of public and private keys in securing Bitcoin blockchain transactions, though the growing Bitcoin network raises the question of additional fees for stakeholders. Following the success of Blockcerts, the University of Nicosia was the first higher education institution to adopt the distribution of academic certificates through the Bitcoin blockchain [4, 5], with Malta the first European country to follow the lead [6].
In the wave of new technological needs in the education system, a decentralized autonomous credit system is an ideal approach to digitizing the sector [7]. EduCTX, a blockchain-based higher education credit platform, was invented to fill this gap, especially in Europe, where the European Credit Transfer and Accumulation System (ECTS) is used as a common academic credit standard [7]. Turkanovic, Holbl, Kosic, Hericko and Kamislac's proposed global blockchain-based higher education credit platform took advantage of ARK [8], an open-source blockchain platform, to build a unified, simplified, and globally ubiquitous higher education credit and grading system that supports various stakeholders, including HEIs, in their activities related to students and organizations and provides a gateway for fraud detection and early prevention. It also enables future employers to track students' academic achievements in a transparent way through a peer-to-peer network and proof of work [9, 10].
Building digital trust in cyberspace is a risk judgement among stakeholders. Having already proven a safe environment for many financial institutions, blockchain is now ready to bring that trust to the education sector as well, for example by validating processes applied to documents such as certificates, course assessments, and evaluations of student competencies. Bandara, Loras and Arraiza [11] argued this point quite nicely and proposed a blockchain-secured Digital Syllabus. This infrastructure reduces interdependency and allows the Digital Syllabus to be stored on a public database (the blockchain network) through a hash function before being validated, producing a validated syllabus [11]. The overall process encourages more openness in education and sets a great example of how to have an impact on society.
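The hash-based validation just described can be sketched in a few lines. This is a minimal illustration assuming SHA-256 over a canonical JSON serialisation; the function name, field names, and sample record are hypothetical and do not reproduce the scheme of [11].

```python
import hashlib
import json

def syllabus_fingerprint(syllabus: dict) -> str:
    """Return a SHA-256 digest of a syllabus document.

    Serialising with sorted keys makes the digest independent of key
    order, so identical content always yields the same fingerprint.
    """
    canonical = json.dumps(syllabus, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

# A hypothetical syllabus record.
syllabus = {"course": "CTS in Mathematics and Computing",
            "credits": 4,
            "topics": ["science and society", "technology and work"]}

# The digest is what would be stored on the public ledger.
published = syllabus_fingerprint(syllabus)

# Later, anyone holding the document can re-hash it and compare with
# the published value; any edit changes the digest and fails validation.
assert syllabus_fingerprint(syllabus) == published
tampered = dict(syllabus, credits=6)
assert syllabus_fingerprint(tampered) != published
```

Because only the digest goes on-chain, the syllabus itself can stay private while remaining verifiable by any third party holding a copy.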

3 Research Design and Context

This research was designed within the international project, Smart Ecosystem for
Learning and Inclusion (SELI) [15]. The main objective of this study was to investigate
the conditions related to the integration of Blockchain technology in ICT-supported
learning, teaching and educational inclusion. These goals are primarily diagnostic but
they will also enable comparative analyses of the selected European and Latin
American countries. While conducting the research among pre-service teachers, we
answer the following questions:
How often is blockchain used in the school environment and among pre-service teachers?

What is their subjective evaluation of blockchain used to support learning, teaching, and digital inclusion?
What is the level of interest in new online trainings focused on the development of blockchain in learning, teaching, development support, and digital inclusion?
The research data was collected using a questionnaire and interviews. The technique used was the diagnostic survey, and the tool was an online or printed questionnaire. The research was conducted among pre-service teachers between May and September 2019 and covered Uruguay, Poland, Bolivia, and Turkey. A total of 491 pre-service teachers and 3 blockchain developers from these countries participated in the research.

4 Results

4.1 Perception of Pre-service Teachers to Blockchain in Education


The percentage of pre-service teachers who have never used blockchain technology exceeds 75% in Uruguay, Poland, and Bolivia. Turkey is an exception: about half of the respondents have used this technology. Frequent and widespread use of this technology stands at 11.3% for Turkey, 4.31% for Uruguay, and 4% for Poland. Occasional use is 38% among respondents in Turkey, 20.67% in Poland, and 17% in Uruguay. Bolivia is a country with frequent use below 1% and occasional use that barely reaches 7.8% (see Fig. 1). Blockchain technology in the educational field is thus very little used, and the frequency of use is quite low. Except for Turkey, which reaches 38% occasional use, the other countries show occasional use at or below 20.67%, Bolivia being the country with the fewest respondents using blockchain.

Fig. 1. Usage of the blockchain technology

The pre-service teachers surveyed, for the most part, are unaware of the degree of
effectiveness of blockchain technology in education. In Uruguay and Bolivia, lack of
knowledge of the degree of effectiveness is high: 75.86% in Uruguay and 77.92% in Bolivia. In Poland, 40.67% of respondents and 45.1% in Turkey state that they
do not know the degree of effectiveness of this technology (see Fig. 2).

Fig. 2. Perception of how effective the blockchain solutions are in education

Considering only the pre-service teachers who have a perception of the degree of effectiveness of blockchain in education, approximately one-third of them tend to evaluate it as acceptable, while in Poland almost half of those respondents (47.19%) give this technology an acceptable rating. The highest percentages of respondents who consider it of low effectiveness are in Bolivia and Turkey; Turkey has a high percentage of respondents who have used this technology (54.9%) and a correspondingly high percentage disenchanted with the experience of using it. Poland is a compelling case, since among the group with experience of using this technology, only 14.61% consider it ineffective (see Fig. 3).

Fig. 3. Pre-service teachers with perception about blockchain solutions in education

In general, one-third of the pre-service teachers surveyed do not know about blockchain technology. Turkey is the country in which they know the most about this
technology. In Bolivia, half of the respondents (51.3%) have an interest in learning
about blockchain for education, followed by Uruguay (44.02%), Poland (37.33%) and
Turkey (36.7%) (see Fig. 4). In the case of Uruguay and Turkey, neutral respondents (who are presumed not to have an interest but could develop one) make up approximately one-fifth of the sample: 17.24% in the Uruguayan case and 22.5% in the Turkish case. Bolivia and Poland have the lowest neutral rates.

Fig. 4. Interest in learning more about the blockchain

It is striking that the greatest interest in learning about blockchain technology is found in Bolivia, the country with the highest percentage of respondents who never used it and also the highest percentage of disenchantment with it. It is likely that its novelty as an applied technology in education has caught respondents' attention and aroused an interest in knowing more about it.

4.2 Perception of Blockchain Developers: Findings from Interviews with the Developers
As part of this study, interviews with the developers were one of the key learning features. The technical development and backend information of a system are often hidden in today's projects. However, as the SELI project is adding a new platform for smart learning, sharing the development experience of our core developers, to amplify the real concerns and opportunities, may be useful for people who would like to take this route, especially with blockchain-based technology, in the near future. Accordingly, our developers, namely Andres Heredia and Mateo Mejia from Ecuador, as well as Alvaro Yapu Cossio from Bolivia, shared their views with us.
Throughout the interviews, a total of 12 questions was asked of each developer regarding blockchain technology and its relevance for education. The discussion was open-ended and expressive. One highlighted point is that most of the developers agreed that blockchain is still new to many people and that resources for education-based applications are very rare; even where such resources exist, few are open-source. However, they all noted that it is gaining popularity in many sectors, including education. Developer Andres pinpointed that the field of secured certificates is not very developed and that few universities and educational institutes have tried to implement it using blockchain. Asked what blockchain can bring to
the education sector, the developers reiterated privacy and security concerns. Using such technology allows in-depth verification without dependence on third parties, and the data structure in a blockchain is append-only; thus, data cannot be altered or deleted easily. In addition, blockchain can establish a token of education: a coin with no monetary value, but an educational value that can be used to subscribe to new courses or to receive job offers.
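The append-only property the developers describe can be illustrated with a toy hash-chained ledger: each entry stores the hash of its predecessor, so altering any past record invalidates every hash that follows. This is only a sketch under stated assumptions, not the SELI implementation (which, per the interview, is built on go-Ethereum); all names are illustrative.

```python
import hashlib
import json

class MiniLedger:
    """Toy append-only ledger: each entry records the hash of the
    previous entry, so editing any past record breaks the chain."""

    def __init__(self):
        self.chain = []

    def append(self, record: dict) -> None:
        prev = self.chain[-1]["hash"] if self.chain else "0" * 64
        payload = json.dumps(record, sort_keys=True) + prev
        self.chain.append({
            "record": record,
            "prev": prev,
            "hash": hashlib.sha256(payload.encode()).hexdigest(),
        })

    def verify(self) -> bool:
        """Recompute every hash and check the links between entries."""
        prev = "0" * 64
        for entry in self.chain:
            payload = json.dumps(entry["record"], sort_keys=True) + prev
            ok = (entry["prev"] == prev and
                  entry["hash"] == hashlib.sha256(payload.encode()).hexdigest())
            if not ok:
                return False
            prev = entry["hash"]
        return True

ledger = MiniLedger()
ledger.append({"student": "A. Lovelace", "certificate": "Course 101"})
ledger.append({"student": "A. Turing", "certificate": "Course 102"})
assert ledger.verify()

# Tampering with an earlier certificate breaks every later link.
ledger.chain[0]["record"]["certificate"] = "Course 999"
assert not ledger.verify()
```

A real blockchain adds distributed consensus on top of this chaining, which is what removes the dependence on a single trusted record-keeper.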
To observe their experiences so far in developing blockchain for the SELI project, we asked the developers about their understanding of this project model and its needs. The developers described it as challenging, but day by day it became clearer and a structure was formed. Moreover, each developer takes a different approach to the development environment regardless of the project and process: Mateo used SCRUM, while Andres chose the framework 'NextJS' to deploy the client part and go-Ethereum to implement the blockchain network through RPC and the Web3 library. Furthermore, the developers noted a few key things to keep in mind for anyone who wants to become a blockchain developer: knowledge of how blockchain works; programming languages that allow deploying a blockchain network, such as go-Ethereum, or Solidity for smart contracts; and some advanced knowledge of nodes, security, and the logic of smart contracts. The developers also shared mixed experiences: for Mateo the work was simple, as he focused on connecting the platform with web services such as REST; Alvaro, on the other hand, highlighted challenges, especially communication and the unstructured methodology of building the application with the available technologies, which required experimentation before execution. Overall, the developers said that working on the SELI project provided new experience and skills that can be used to build more educational applications in the future. Moreover, the SELI system offers a strong alternative to platforms such as Moodle, since it involves the use of blockchain for the issuance of certificates, which was not available on any other platform at the time.

5 Discussion

Although a growing number of studies exist on blockchain use in education, there is a lack of empirical studies that gather data from actual target users regarding the use of blockchain. In this study we collected data and investigated findings from target users as well as developers in order to reveal end-user perceptions of blockchain technology. The first research question aimed to explore how often blockchain is used in the school environment and among students of teaching degrees. According to the results obtained, although pre-service teachers in Turkey are an exception, with about half of the respondents having used this technology, the share of pre-service teachers who have never used blockchain technology exceeds 75% in Uruguay, Poland, and Bolivia. This shows that although many institutions across the globe have started initiatives to develop blockchain-based solutions that address pedagogical gaps (Grech and Camilleri [17]), only a very small percentage of potential users have actually used blockchain in educational settings.
The second research question aimed to explore pre-service teachers' subjective evaluation of blockchain used to support learning, teaching and digital inclusion.
364 S. S. Oyelere et al.

Findings show that the pre-service teachers surveyed are, for the most part, unaware of the degree of effectiveness of blockchain technology in education. In Uruguay and Bolivia, lack of knowledge of its degree of effectiveness is high: 75.86% in Uruguay and 77.92% in Bolivia. In Poland, 40.67% of respondents, and in Turkey 45.1%, state that they do not know the degree of effectiveness of this technology. Considering the research project carried out by researchers from the University of New England, where one of the key problems identified in education was the lack of pedagogical responses to the needs of students [12], it is important to investigate pre-service teachers' understanding of the technology in relation to pedagogy, because today's pre-service teachers are the first generation who can actually use it in their future classes, as the technology is mostly limited to university use cases. Unfortunately, the study reveals that only a few pre-service teachers perceive blockchain as useful, despite the many applications and developments that have occurred in the technical industry with the integration of blockchain technology, which aim to strengthen the effort toward an open ecosystem for learning by securing collaborative learning environments, protecting learning objects, identifying the necessary technologies and tools, enhancing students' interactions with educational activities, and providing pedagogical support for lifelong learning [2]. Perhaps as blockchain use cases increase in universities, such as the Blockcerts case of the University of Nicosia, which was the first higher education institute to adopt the distribution of academic certificates through the Bitcoin blockchain [4, 5], and Malta, the first European country to follow that lead [6], pre-service teachers' awareness of the use of blockchain in education may grow. Their awareness could be further enlarged by a blockchain-based higher education credit platform in which the European Credit Transfer and Accumulation System (ECTS) is used as a common measure of academic credit [8] or, similarly, by a blockchain-based approach for connecting learning data across several learning platforms, institutions and organizations, as studied in [20]. In this way blockchain may have an inevitable influence on teachers' careers, as it enables future employers to track students' academic achievements in a transparent way through a peer-to-peer network and proof of work [10].
The third research question aimed to explore interest in new online trainings focused on the development of blockchain in learning, teaching, development support and digital inclusion. According to the results, half of the respondents in Bolivia (51.3%) showed interest in learning about blockchain for education, followed by Uruguay (44.02%), Poland (37.33%) and Turkey (36.7%). Blockchain is considered a potential technology to support the pedagogy of professional education, such as nursing and health care, through decentralized academic degree management and secured evaluation tools for learning outcomes [1, 14]. However, we can say that users are still not ready to accept the technology. Although some universities and institutes have applied blockchain technology in education, mostly to support academic degree management and summative evaluation of learning outcomes [1], our findings reveal that the level of blockchain technology use remains relatively low.
6 Conclusion and Guidelines for the Stakeholders

Blockchain technology has created a new paradigm in the information society. More and more applications are made every day, including in the education sector. The use of blockchain in education presents a great opportunity to increase agility and transparency in the academic process. However, the use of this technology for education is at an incipient stage, especially in Latin American countries. This situation is an excellent opportunity to revolutionize the way education services are conceived, in terms of academic information systems, recording academic achievements, security of information, collaborative learning environments, learning management systems, and contributing to the reliability of online education. Analysis of blockchain use among pre-service teachers in three of the four studied countries shows that the use of blockchain is very low. Findings suggest that they are not aware of the effectiveness of the technology; nevertheless, more than a third of the respondents in all the countries represented are interested in acquiring competencies in the new technology. This information allows us to state, as a starting point, the following recommendations:
(i) Promote the inclusion of blockchain technology in the different aspects of the education sector. (ii) Develop a capacity-building plan for teachers to use the available technology, such as the SELI platform, to improve educational experiences. (iii) Promote and establish synergies between regulatory institutions and private institutions that provide education services, in order to promote the implementation of blockchain technology. (iv) Create a showcase environment of the possible uses of blockchain technology in education. (v) Promote a legal framework to support and enable the use of blockchain technology in the academic process.

Acknowledgement. This work was supported by the ERANET-LAC project which has
received funding from the European Union’s Seventh Framework Programme. Project Smart
Ecosystem for Learning and Inclusion, ERANet17/ICT-0076SELI.

References
1. Sharples, M., Domingue, J.: The blockchain and kudos: a distributed system for educational
record, reputation and reward. In: European Conference on Technology Enhanced Learning,
pp. 490–496. Springer, Cham (2016)
2. Alammary, A., Alhazmi, S., Almasri, M., Gillani, S.: Blockchain-based applications in
education: a systematic review. Appl. Sci. 9(12), 2400 (2019)
3. Huynh, T.T., Huynh, T.T., Pham, D.K., Ngo, A.K.: Issuing and verifying digital certificates
with blockchain. In: 2018 International Conference on Advanced Technologies for
Communications (ATC), pp. 332–336. IEEE (2018)
4. BlockCerts to be developed in Malta. http://www.educationmalta.org/blockcerts-to-be-developed-in-malta/
5. Sharples, M., Roock, R., Ferguson, R., Gaved, M., Herodotou, C., Koh, E., Kukulska-
Hulme, A., Looi, C.-K., McAndrew, P., Rienties, B., Weller, M., Wong, L.H.: Innovating
pedagogy 2016. Open University innovation report 5 (2016)
6. Case Study Malta Learning Machine. https://www.learningmachine.com/casestudies-malta
7. Li, Y., Liang, X., Zhu, X., Wu, B.: A blockchain-based autonomous credit system. In: 15th
International Conference on e-Business Engineering (ICEBE), pp. 178–186. IEEE (2018)
8. Turkanović, M., Hölbl, M., Košič, K., Heričko, M., Kamišalić, A.: EduCTX: a blockchain-
based higher education credit platform. IEEE Access 6, 5112–5127 (2018)
9. Ark: All-in-One Blockchain Solutions. http://www.ark.io
10. Lizcano, D., Lara, J.A., White, B., Aljawarneh, S.: Blockchain-based approach to create a
model of trust in open and ubiquitous higher education. J. Comput. High. Educ. 32(1), 109–
134 (2019)
11. Bandara, I.B., Ioras, F., Arraiza, M.P.: The emerging trend of blockchain for validating
degree apprenticeship certification in cybersecurity education (2018)
12. Green, N.C., Edwards, H., Wolodko, B., Stewart, C., Brooks, M., Littledyke, R.:
Reconceptualising higher education pedagogy in online learning. Dist. Educ. 31(3), 257–
273 (2010)
13. Jirgensons, M., Kapenieks, J.: Blockchain and the future of digital learning credential
assessment and management. J. Teach. Educ. Sustain. 20(1), 145–156 (2018)
14. Skiba, D.J.: The potential of blockchain in education and health care. Nurs. Educ. Perspect.
38(4), 220–221 (2017)
15. Martins, V., Oyelere, S.S., Tomczyk, L., Barros, G., Akyar, O., Eliseo, M.A., Amato, C.,
Silveira, I.F.: The microsites-based blockchain ecosystem for learning and inclusion. In:
Brazilian Symposium on Computers in Education (SBIE), pp. 229–238 (2019)
16. Oyelere, S.S., Tomczyk, L., Bouali, N., Agbo, F.J.: Blockchain technology and gamification
– conditions and opportunities for education. In: Veteška, J. (ed.) Adult Education –
Transformation in the Era of Digitization and Artificial Intelligence. Andragogy Society,
Prague (2019)
17. Grech, A., Camilleri, A.F.: Blockchain in education. Publications Office of the European
Union, Joint Research Centre (2017)
18. Arenas, R., Fernandez, P.: CredenceLedger: a permissioned blockchain for verifiable
academic credentials. In: IEEE International Conference on Engineering, Technology and
Innovation, pp. 1–6. IEEE (2018)
19. Ocheja, P., Flanagan, B., Ueda, H., Ogata, H.: Managing lifelong learning records through
blockchain. Res. Pract. Technol. Enhanc. Learn. 14(1), 1–19 (2019)
20. Tomczyk, L., Oyelere, S.S., Puentes, A., Sanchez-Castillo, G., Muñoz, D., Simsek, B.,
Akyar, O.Y., Demirhan, G.: Flipped learning, digital storytelling as the new solutions in
adult education and school pedagogy. In: Veteška, J. (ed.) Adult Education – Transformation
in the Era of Digitization and Artificial Intelligence. Czech Andragogy Society, Prague
(2019)
Digital Storytelling in Teacher Education for Inclusion

Özgür Yaşar Akyar¹, Gıyasettin Demirhan¹, Solomon Sunday Oyelere², Marcelo Flores³, and Vladimir Costas Jauregui³

¹ Hacettepe University, Ankara, Turkey
{ozguryasar,demirhan}@hacettepe.edu.tr
² University of Eastern Finland, Joensuu, Finland
[email protected]
³ Universidad Mayor de San Simón, Cochabamba, Bolivia
[email protected], [email protected]

Abstract. In this paper we first share the concept of workshop-based digital storytelling, which we adopted for an international project called SELI (Smart Ecosystem for Learning and Inclusion), and its educational value in terms of inclusion. Secondly, we give information about the context of teacher education in the cases of Turkey and Bolivia, explaining the contexts of two different target groups: physical education teachers, and people from cultures in which written heritage is not important. Finally, we share the architecture of our digital storytelling solution in comparison with the SELI proposal. Many implemented solutions, the SELI proposal included, treat DST (Digital Storytelling) as a simple tool within a learning framework that supports it, versus a proposal with a strong scope of conceptualization and without a framework, which makes multiple uses possible, not only within learning contexts.

Keywords: Digital Storytelling · Teacher education · Physical education · Indigenous collective memory

1 Introduction

There are various ways of using Digital Storytelling (DST) as an educational tool in the
field of education, including pre-school, K-12, higher education and non-formal education. As we discussed in another paper, digital stories can be created by both teachers and students in formal education [1]. Therefore, similar research in the field of higher education provides evidence for the expected results of our study. For example, a study implemented with college students from an Industrial Design program highlighted the benefits of using digital storytelling: authentic learning, polished end products, the engagement of students with the material, decidedly independent learning, and collaborative practice [2]. In another study, researchers developed a digital storytelling system called Digital Storytelling Teaching System-University (DSTS-U) to help college students quickly create stories
System-University (DSTS-U) in order to help college students to quickly create stories

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 367–376, 2020.
https://doi.org/10.1007/978-3-030-45697-9_36
368 Ö. Y. Akyar et al.

with a structural architecture and to enhance the variety of story contents through different story structures. In that study, the researchers see DST not only as useful for skill development but also as providing learning from experience, since it allows listening and sharing together.

2 Digital Storytelling Support for Teachers Who Provide Inclusive Education

Burgess argues that debates about the digital divide, based on the difficulty of access to ICT, have shifted towards concerns about social inclusion and inequality in access to "voice" [3]. We cannot simply expect disadvantaged groups to provide inclusiveness on their own. The institutions that determine educational policy, and the teachers who directly implement this policy, have an important role. In particular, institutions need to ensure that quality teacher education can work with disadvantaged groups.
Hargreaves and Fullan remind us that teachers not only need knowledge and skills but must also be able to create trust-based relations with others and to have judgmental skills [4]. They call the combination of these three capitals professional capital. In order to contribute to the continuing professional development of this multi-faceted capital of the teacher, the opportunities provided by new technologies to create participatory and inclusive learning communities can be utilized, taking into consideration today's rapid changes and developments in teacher education. [4] states that in order to improve teachers and teaching, the conditions in which teachers work, and the communities and cultures of which they are part, should be improved.
Therefore, it can be foreseen that workshop-based DST may contribute to the improvement of the cultures in which prospective teachers and teachers are present, by allowing the creation of a climate of trust and the telling and sharing of experiences with the wider ecosystem. By using DST in teacher education, a positive contribution can be made to prospective teachers' learning and active participation. DST can thus be used as a means of empowering prospective teachers to build their stories based on their specific contexts, so that they can reflect on their own experiences and engage in constructive actions for educational transformation. This corresponds to the autobiographical learning described by Rossiter and Garcia [5] as the third use of DST. In particular, as change agents, prospective teachers may be given the opportunity to experience active participation whose main source of motivation is not the satisfaction of the school principal but directly contributing to the student's life. In this context, DST can provide an empowering resource to enhance the professional capital of the teacher through sharing experiences. In this regard, we can say DST has great potential for supporting teachers' and prospective teachers' learning as an educational empowerment tool.
Digital Storytelling in Teacher Education for Inclusion 369

3 Creating and Sharing Digital Stories in the Context of Active Quality Living Research Guidance and Discovering Roots of Indigenous Cultures

Randall clarifies the strong relationship between story and life by saying that life is never given; it is always partially created, built, re-created, just like a story [6]. When these statements are combined with John Dewey's statement that "education is life itself", we can say that education is an art, and that the teacher is a designer/artist who should use his or her creativity continuously. Therefore, teachers may need to produce innovative and creative educational activities in order to create inclusive learning environments for students with different characteristics. As Randall notes, researchers such as John Dixon and Leslie Stratta, who followed Dewey's philosophy in the field of education, describe narrative as the main action of the mind and telling as "a basic human trait", an indispensable way of making human experience meaningful [6]. His writing on the poetics of learning provides a very important resource for a researcher interested in the professional development of teachers. He argues that the mentoring approach primarily uses a version of the "story" model. The basic assumption here is that consciousness rises, knowledge is created, society is formed, and a perspective with transformative powers is established by sharing personal and public stories. We need to highlight that not only the story, but also the process of story formation, is necessary. Therefore, our basic assumption is that teachers and prospective teachers have an active role in making the education system more inclusive in terms of creating learning opportunities, and that they will contribute to improving the quality of life of individuals with their stories. In addition, digital stories need to be used beyond self-expression and communication; as Hartley reminds us, digital media should be used to create a new target, definition and imagination [7]. It is precisely at this point that the use of DST by physical education teachers and students, in the context of active quality life research, offers the goal of creating a new world design, because education starts with education of the body. Moreover, the majority of physical education and sports activities are based on learning by doing, and each learning process creates a story of its own. On the other hand, it can be said that human life is based on movement, although the cognitive and affective domains cannot be denied. This perspective leads us to serve holistic development. These are the main components of an active and quality life, and this can be achieved through holistic development. The Quechua and Aymara peoples build a collective memory through stories of oral tradition. These peoples do not have a written tradition; they began to recover their stories in writing, with higher intensity, only in the twentieth century. These stories are not considered memories but rather the history and thought of the people. Their conception is the transmission of values and teachings that form the worldview and the philosophical, religious, economic, artistic, technological, and political knowledge of an entire culture. These oral stories also make up the social order in the town. A story that is part of the oral tradition is a complex construction of language and does not require writing. The story demands a skilled narrator, who knows the tradition and guarantees the transmission of the most in-depth ideas present in the story. In these stories, the native peoples create a link
between the past and the future. The past is interpreted and chained to the interpretation of current actions (the present) to project into the future. This projection into the future belongs to the people as a whole. The story is a circular experience for these cultures; it transforms the past into a continuous present. Tradition and its experiences are always current according to the social life of the people. In the narration, the opening and ending of a story break the temporal boundary between when the story happened and the time it is narrated. These two moments have no distance; on the contrary, the events can be going on in the present at the moment of the telling. According to [8], coloniality refers to the unique patriarchal power of Western expansion against the original peoples. This coloniality points to the idea of differentiating races: a superior one and a lower one. Superiority is transferred to all areas, such as knowledge, society, and work. The result is the hegemony of thinking and building from the cultural approach of those who conquered the indigenous peoples. According to [8], "The construction of knowledge is a complex situation that requires the rupture of the dominant culture". At this point, the traditional storytelling of the Aymara and Quechua peoples, in their conception of collective memory, has maintained a space of rupture with the forms of Western thought. According to [9], the native peoples (Quechua and Aymara, among others) in Bolivia have been victims of non-national and non-democratic states, victims in the sense of the freedom to develop a culture. They seek democratization by creating another state approach that includes their history, which flows as oral narration in their villages, so far closed to written sources of Western tradition. According to [10], in the 2001 census, 62.2% of the Bolivian population declared that they belong to one of the original peoples: Aymara, Quechua, among others. In the 2012 census, referred to in [11], the population identified as original inhabitants is 40.6%, a reduction of nearly 20% since the 2001 census. Some people did not declare belonging to the original inhabitants due to the omission of the "mestizo" option (eliminated because it had a pejorative connotation in Bolivia). The population of Bolivia was approximately 10,896,000 inhabitants in the 2015 Household Survey [12]; 31.5% of Bolivians live in rural areas. The inhabitants of the rural area, for the most part, belong to some native people. Native languages are the most spoken in rural areas (46.4% of rural inhabitants declare that they speak Spanish). Quechuas and Aymaras can more readily preserve their stories in a digital format than through technologies that favor writing. This empowerment will improve the chances of the collective oral memory remaining over time. Digital storytelling gives this culture a chance to accept technological help, because its expressiveness is close to their oral experience: they have a preference for the oral transfer of knowledge and history. Oral narrative helped by DST will promote the re-discovery of roots for people living in Bolivia, and will also alleviate the misunderstanding between indigenous people and people living in cities. It will also help the second generation of rural inhabitants who migrate to cities to know about their roots. The empowerment of digital narration as a mechanism of education and inclusion for the original cultures of Bolivia relies on the PRONTIS Program (Programa Nacional de Telecomunicaciones de Inclusión Social) [13], which in its first stage aims to reduce the digital divide in connectivity.
4 Tool Development Based on Workshop Digital Storytelling

The SELI (Smart Ecosystem for Learning and Inclusion) project is based on situative and sociocultural perspectives [14–17] for understanding teacher learning in a digital-storytelling-embedded learning ecosystem. Instead of conceptualizing learning as changes in an individual's mental structure, we consider "learning by individual in a community as a trajectory of that person's participation in the community a path with a past and present, shaping possibilities for future participation" [16], drawing on [18]. Therefore, we prefer implementing a tool based on workshop-based digital storytelling as a process, rather than the various examples of using digital storytelling as a product or tool. Workshop-based digital storytelling practices are used in higher education ecologies as a co-creative process whose six main stages are the phases defined by Lambert [19], see Fig. 1:

Fig. 1. Phases of workshop-based digital storytelling.

We aim to bridge the gap of transferring educational experience into real-life situations through workshop-based digital storytelling, as it is a natural way to exchange experiences between the educator and the student through multimedia, which brings audio and visual communication together. In this sense, a digital storytelling tool can allow both educators and students to learn, in an authentic way, from each other's experiences and lives through stories. In [20], the architecture of SELI shows two main concepts. The first concept is Web accessibility, supported by the "Web Content Accessibility Guidelines" (WCAG 1.0), in order to provide accessibility for anyone; the second refers to the architectural design, the microsite. The concept of the microsite provides the DST with the feature of a self-contained entity. In this way, the tool to carry out the story is an activity inside the microsite.
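A self-contained microsite holding a story activity might be sketched as follows; the object shape and field names are purely illustrative, not the actual SELI schema:

```javascript
// Hypothetical sketch of a microsite as a self-contained entity whose
// activities include a digital story, as described above.
const microsite = {
  id: 'course-101-unit-3',
  accessibility: { wcag: '1.0' }, // accessibility settings per WCAG guidance
  activities: [
    { type: 'reading', resource: 'intro.pdf' },
    {
      type: 'story', // the DST tool runs as an activity inside the microsite
      title: 'My first teaching day',
      scenes: [], // filled in by the storytelling tool
    },
  ],
};

// A microsite is self-contained: everything needed to render it is inside it.
function listActivities(site) {
  return site.activities.map((a) => a.type);
}
```

Here `listActivities(microsite)` would report both the reading and the story activity, illustrating that the story is just one activity among others inside the microsite.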
5 SELI Platform Storytelling Implementation


5.1 First Approach Towards Storytelling Tool
The logical view for the storytelling component is simple and has only two components: the Story component on the client side, and the storytelling tool component on the server side. The client manages the actions related to the creation, editing, publishing, and playing of a Story. The server manages the persistent documents that represent a Story and the document communication with the client. Meteor.js manages DDP (Distributed Data Protocol) and a REST API (by HTTP request-response) for communication between server and client; the former is useful for data related to MongoDB documents (in our case the documents are Stories or Scenes of a Story) (Fig. 2).

Fig. 2. Logical architecture view for first approach.

The class diagram for the digital storytelling design (see Fig. 3) shows a Story composed of a Scene sequence. A Scene has two media-type resources and a text description to represent the author's socio-cultural expression in the story. The concrete implementation is under the Meteor.js framework, using Material-UI and React.js as components for the client side. On the server side, the Story and each Story Scene are JavaScript objects made persistent as MongoDB compound documents.

Fig. 3. Class diagram of first approach.


Digital Storytelling in Teacher Education for Inclusion 373

The classes observed in Fig. 3 represent the design of the Storytelling Tool component in the server-side Meteor.js implementation. The Story is an activity in the platform, and as an activity it is part of a Course.
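The Story/Scene compound document described above might look roughly like the sketch below; field names are illustrative assumptions, not the actual SELI schema:

```javascript
// Illustrative sketch of a Story stored as a MongoDB-style compound
// document: a Story is a sequence of Scenes, each with two media
// resources and a text description.
const story = {
  _id: 'story-001',
  courseId: 'course-101', // a Story is an activity within a Course
  title: 'Discovering my roots',
  published: false,
  scenes: [
    {
      order: 0,
      image: '/uploads/scene0.png', // media resource 1
      audio: '/uploads/scene0.ogg', // media resource 2
      text: "A description carrying the author's socio-cultural expression",
    },
  ],
};

// Publishing makes the story viewable and playable by other users.
function publish(s) {
  return { ...s, published: true };
}

// Playing means iterating the scenes in order, pairing image and voice.
function playOrder(s) {
  return [...s.scenes].sort((a, b) => a.order - b.order).map((sc) => sc.order);
}
```

In this sketch `publish` returns a new document rather than mutating the stored one, mirroring how the server would persist an updated compound document.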
This implementation does not support the upper story-circle stages (story circle, text, and in-group screening). These three upper stages are highly collaborative activities, usually carried out face to face, whereas the lower three stages (voice recording, images, and images and voice) can be done by a single student/editor. The recording of voice, uploading of images, linking of image-voice scenes, and finally publishing of the story are handled by the implemented tool; the publishing action lets other users (students and teachers) view and play the story. The collaborative screening with feedback is missing in this naive approach to digital storytelling in the platform, but it does not prevent a face-to-face meeting for the screening and feedback activities.

5.2 Second Approach for Storytelling Concept


This is a logical architecture view of the storytelling tool that considers three main aspects. A Client module manages requirements from the storyteller, such as creating new stories containing all the new elements: text, audio, video, voice (from a voice device), images, and user events (pause, timing, etc.). A Board Manager manages more complex requirements over the set of stories (test, review, sharing and querying), such as reviewing the syntax and semantics of story sequences, changing or altering the execution sequences of stories, merging or reusing parts of stories or entire stories to design a new one, and sharing or broadcasting to the storytelling community (Fig. 4).

Fig. 4. Logical architecture view for alternative approach.

This allows several storytellers to reuse and add stories, designing story circles, each with their own contribution to the stories, in a collaborative way in a workshop. An Active Repository manages persistence and implements the flow of the transaction behavior (Fig. 5).
Fig. 5. Physical architecture of second approach.

The physical architecture forecasts some physical components in addition to the logical view: the Board Manager can be implemented with a compiler, implementing a specific language to manage the stages of building stories with controlled syntax, sequencing, and semantics of the elements that compose the stories. Both architectures reflect the storytelling concepts. However, there are some differences when answering certain requirements from final users. The first model is the design of a quick development, thought out as a tool for rapid software development. The second is designed in terms of strength instead of velocity. The first was implemented with Meteor (JavaScript) and MongoDB as the persistence tool, and was developed rapidly by the developer community, answering requirements in a visual and functional way notoriously quickly. The second, still at the implementation stage (although both started at approximately the same time), uses several programming languages and technologies, such as Java (client app), JavaScript (Web component) and PostgreSQL for active persistence.
The first approach leaves gaps with respect to user requirements such as: “Can I use parts of stories from others’ stories?”, “Can I reuse primary elements from others’ stories?”, “Can I redesign the execution flow of a story?” These basic requirements, which are not resolved in the first approach, can be managed by the second approach thanks to its architectural basis and its conceptualization of stories through more powerful elements.
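As an illustration of why an element-level conceptualization answers these requirements, consider stories that hold references (ids) into a shared repository instead of embedding their media. The names below (`registerElement`, `reuseElements`, `reorder`) are hypothetical, a sketch of the idea rather than the SELI implementation:

```javascript
// Hypothetical element-reference model: the shared Map plays the role of
// the Active Repository; stories are just ordered lists of element ids.
const repository = new Map();

function registerElement(id, type, payload) {
  repository.set(id, { id, type, payload });
  return id;
}

// Reusing part of another story copies references, not media.
function reuseElements(targetStory, sourceStory, ids) {
  const copied = sourceStory.sequence.filter((id) => ids.includes(id));
  targetStory.sequence.push(...copied);
  return targetStory;
}

// Redesigning the execution flow is a permutation of references.
function reorder(story, newSequence) {
  story.sequence = newSequence.filter((id) => story.sequence.includes(id));
  return story;
}

registerElement('e1', 'text', 'Intro');
registerElement('e2', 'image', 'map.png');
const storyA = { sequence: ['e1', 'e2'] };
const storyB = { sequence: [] };
reuseElements(storyB, storyA, ['e2']); // storyB now references e2
reorder(storyA, ['e2', 'e1']);         // flow redesigned without copying media
```

Under this hypothetical design, reuse and flow redesign are cheap list operations over ids, while the repository (e.g. a relational store such as PostgreSQL) owns the elements themselves.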
The final evaluation of the second approach will be presented in a further investigation into the conceptualization of digital storytelling and the technological approach chosen.

6 Conclusion

In conclusion, we discussed the use of DST in the educational context, both as a pedagogical strategy and as a research method for the training of teachers, who play a very important role in the education system.
The novelty we have added to the discussions in the literature on the training of prospective teachers is the creation of story worlds through workshop-based DST. The first story world, which we call Active Quality Life Research Guidance, allows sharing the stories of teachers and learners regarding active life, relevant as a learning area of physical education. The second story world is Discovering Roots, a form of digital storytelling resembling oral narrative. It will allow stories of the Quechua people regarding their roots and cultures, promoting intercultural learning and moving towards the inclusion of their history and thought in the Bolivian state, together, of course, with many other cultures around the world.
Secondly, we share an architectural view of DST to explain how we aim to handle the digital aspect of the DST process together with its conceptualization and intercultural context. DST has many variants of conceptualization and many technological approaches, and different conceptualizations lead to different approaches. The analysis of different approaches, architectures, and situations shows that it is important to define the conceptualization and cultural approach first, and only then work on the design. A DST tool is not only interesting but also inspiring for teachers to use in their classrooms. Researchers who used DST for literacy teaching discovered that technology can be a game changer in the classroom, as it changed the mood of the students just by modifying the way the activity was done, from physical cards to digital ones [21]. The SELI Project, as a trans-national project, contributes to this process by exploring distinct approaches to a digital storytelling tool in order to enhance inclusion in education. Strong communication and knowledge exchange between the owners of the conceptualization and the developers is key during the development of the SELI DST tool. Among the SELI project's initiatives in the digital storytelling approach for education and inclusion, the presentation of the approach to pre-service teachers and Quechua-speaking teachers validates the acceptance and appropriation of DST for both education and indigenous cultural empowerment.

Acknowledgement. This work was supported by the ERANET-LAC project which has
received funding from the European Union’s Seventh Framework Program. Project Smart
Ecosystem for Learning and Inclusion – ERANet17/ICT-0076SELI.

References
1. Tomczyk, L., Oyelere, S.S., Puentes, A., Sanchez-Castillo, G., Muñoz, D., Simsek, B.,
Akyar, O.Y., Demirhan, G.: Flipped learning, digital storytelling as the new solutions in
adult education and school pedagogy. In: Jaroslav, V. (ed.) Adult Education (2018) –
Transformation in the Era of Digitization and Artificial Intelligence. Česká andragogická společnost/Czech Andragogy Society, Prague (2019). ISBN 978-80-906894-4-2
2. Barnes, V.: Telling timber tales in higher education: a reflection on my journey with digital
storytelling. J. Pedag. Dev. 5(1), 72–83 (2015)
3. Burgess, J.: Hearing ordinary voices: cultural studies, vernacular creativity and digital
storytelling. Continuum 20(2), 201–214 (2006)
4. Hargreaves, A., Fullan, M.: Professional Capital: Transforming Teaching in Every School.
Teachers College Press, New York (2012)
5. Rossiter, M., Garcia, P.A.: Digital storytelling: a new player on the narrative field. New Dir.
Adult Continuing Educ. 126, 37–48 (2010)

6. Randall, W.: Bizi Biz Yapan Hikayeler. Ayrıntı Yayınları, Istanbul (2014)
7. Hartley, J., McWilliam, K.: Story Circle. Wiley-Blackwell, Chichester (2009)
8. Roque, P.: Relato oral en la construccion de saberes y conocimientos de la cultura Aymara.
Escuela Superior de Formacion de Maestros - THEA. http://unefco.minedu.gob.bo/app/dgfmPortal/file/publicaciones/articulos/ae2465defbefd1aa87d17dd4d146b966.pdf. Accessed 4 Dec 2019
9. Estudios Latinoamericanos: La tradicion oral, estudio comparativo indigena Mexico-Bolivia
(2010). https://cidesespacio.blogspot.com/2010/12/la-tradicion-oral-estudio-comparativo.html. Accessed 4 Dec 2019
10. Comision Economica para America Latina y el Caribe: Porcentaje de Poblacion Indigena.
https://celade.cepal.org/redatam/PRYESP/SISPPI/Webhelp/helpsispi.htm#porcentaje_de_poblacionindig.htm. Accessed 3 Dec 2019
11. Centro de Estudios Juridicos e Investigacion Social: Bolivia Censo 2012: Algunas claves
para entender la variable indígena (2013). http://www.cejis.org/bolivia-censo-2012-algunas-claves-para-entender-la-variable-indigena/. Accessed 3 Dec 2019
12. Instituto Nacional de Estadística, INE: Censo de Poblacion y Vivienda 2012 Bolivia,
Características de la Población (2015). https://www.ine.gob.bo/pdf/Publicaciones/CENSOPOBLACIONFINAL.pdf. Accessed 30 Nov 2019
13. Ministerio de Obras Públicas, Servicios y Vivienda, PRONTIS, Bolivia: “Plan Estratégico de
telecomunicaciones y TIC de inclusión social 2015–2025” (2014). http://prontis.gob.bo/infor/PlanEstrategicodelPRONTIS.pdf. Accessed 21 Nov 2019
14. Simsek, B., Usluel, Y.K., Sarıca, H.C., Tekeli, P.: Türkiye’de Egitsel Baglamda Dijital
Hikaye Anlatımı Konusuna Eleştirel Bir Yaklaşım. Egitim Teknolojisi Kuram ve Uygulama
8(1), 158–186 (2018)
15. Greeno, J.G.: Learning in activity. In: Sawyer, R.K. (ed.) The Cambridge Handbook of the
Learning Sciences, pp. 79–96. Cambridge University Press, New York (2006)
16. Greeno, J.G., Gresalfi, M.S.: Opportunities to learn in practice and identity. In: Assessment,
Equity, and Opportunity to Learn, pp. 170–199 (2008)
17. Lave, J., Wenger, E.: Situated Learning: Legitimate Peripheral Participation. Cambridge
University Press, Cambridge (1991)
18. Kang, H.: Preservice teachers’ learning to plan intellectually challenging tasks. J. Teach.
Educ. 68(1), 55–68 (2017)
19. Lambert, J.: Digital Storytelling: Capturing Lives. Creating Community. Routledge,
Abingdon (2013)
20. Martins, V., Oyelere, S.S., Tomczyk, L., Barros, G., Akyar, O.Y., Eliseo, M.A., Amato, C.,
Silveira, I.F.: The microsites-based blockchain ecosystem for learning and inclusion. In:
Brazilian Symposium on Computers in Education (SBIE), pp. 229–238 (2019). ISSN 2316-6533. https://br-ie.org/pub/index.php/sbie/article/view/8727. http://dx.doi.org/10.5753/cbie.sbie.2019.229
21. Flórez-Aristizábal, L., Cano, S., Collazos, C.A., Benavides, F., Moreira, F., Fardoun, H.M.:
Digital transformation to support literacy teaching to deaf Children: from storytelling to
digital interactive storytelling. Telematics Inform. 38, 87–99 (2019)
In Search of Active Life Through Digital
Storytelling: Inclusion in Theory and Practice
for the Physical Education Teachers

Burcu Şimşek and Özgür Yaşar Akyar

Hacettepe University, Ankara, Turkey


{bsimsek,ozguryasar}@hacettepe.edu.tr

Abstract. This paper discusses the potential of implementing workshop-based digital storytelling to make physical education teacher education more inclusive. First, an overview of the uses of digital storytelling workshops in higher education settings is given. Then the process of facilitating a digital storytelling workshop with physical education teachers is detailed. Finally, workshop-based digital storytelling practice with a significant focus on inclusion is explored through the lens of the theory of individual quality of life, to inspire physical education teacher education program developers.

Keywords: Digital storytelling · Inclusion · Physical education teacher education · Professional development

1 Introduction

The digital storytelling “movement” has been around for a long time [3], and digital storytelling workshops have been in use in higher education contexts for teaching, learning, and research purposes worldwide. This paper discusses the potential of using workshop-based digital storytelling for developing an understanding of inclusion in the context of physical education teacher education. First, we give a brief overview of the uses of digital storytelling workshops in higher education settings. Then we provide the details of the digital storytelling workshop that we facilitated with physical education teachers. We then suggest using digital storytelling workshops in the curriculum of the Sport Sciences undergraduate program that trains physical education teachers. Here we connect the discussion to the presence of the inclusion topic in course content, relating it to the program competencies matrix that all higher education programs in Turkey are supposed to meet in line with the Bologna process. In doing so, we take a close look at the current courses that might relate to inclusion issues in particular. In this attempt, we try to draw attention to the importance of inclusion in practice through the circulation of experiences, in this case those of in-service physical education students.
Digital storytelling is in use as an educational tool in various fields of education, including pre-school, K-12, higher education, and non-formal education. Digital stories can be created individually by both teachers and students in formal education, using various multimedia platforms and applications.

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 377–386, 2020.
https://doi.org/10.1007/978-3-030-45697-9_37

On the other hand, workshop-based digital storytelling is used for collecting the self-reflective accounts of doctors in
training and for using such reflections in healthcare reforms, for developing understandings about social care for older people, for conducting intergenerational research and implementations, for engaging communities with disabilities [12], for attracting attention to hotspot issues related to gender [11], and for collecting and circulating the experiences of migrants and refugees [15]. Therefore, similar research in the field of higher education provides evidence for the expected results of our study. There is also a wider set of research that makes use of storytelling in training programs. One study involving pre-service physics teachers examined students' achievement and interest in physics lessons, pointing out that digital storytelling as a distance education tool may bring remarkable contributions [6]. In that study, the researchers see digital storytelling not only as useful for skill development but also as providing learning from experience, since it allows listening and sharing together. Digital storytelling therefore goes beyond being a mere tool: it provides an activity stream for professional development that is not only the acquisition of new skills and knowledge [16] but also empowers teachers to rethink their practice in a community of learners. Looking at these studies from both educational sciences and social sciences perspectives, it is fair to say that digital storytelling practices, both multimedia-based and workshop-based, are in common use for engaging with inclusion issues.

2 Digital Storytelling Workshops for Inclusion and Research

This paper finds its foundation in an interdisciplinary approach, as its researchers come from social sciences and education sciences backgrounds. We take “learning by individual in a community as a trajectory of that person’s participation in the community—a path with a past and present, shaping possibilities for future participation” [1]. Therefore, we value the digital storytelling workshop process as an informal learning opportunity drawing on one another’s experiences and frames of reference, as well as the digital stories that come out of this co-creative process.
Workshop-based digital storytelling practices are used in higher education ecologies as a co-creative process in which the six main stages of the workshop are practiced and tailored according to the theme and purpose of the practice [17]. The digital storytelling workshop is facilitated by a trained facilitator team, with participants willing to share their experiences and produce a digital story from these first-person narratives (Fig. 1).


Fig. 1. The digital story circle: the phases of a digital storytelling workshop [14].

A digital storytelling workshop starts with the story circle, a dialogic phase in which the participants share their stories in a setting facilitated by trained and experienced facilitators, who open up the circle by sharing their own stories. In the story circle, the participants are encouraged to tell a fragment of a personal experience that will later be turned into a digital story, forming its foundation. In this dialogic stage no digital aspect is mentioned; rather, the archaic storytelling practices are called upon and practiced by the participants through the facilitation process. Inclusion is at the core of the digital storytelling workshops, and the facilitators are the moderators of the equal-say principle. The role of the facilitators here is crucial, as some voices are more dominant and willing to speak, whereas others cannot create an opportunity to share their stories and need encouragement to share their ideas. This core element of the facilitation process of workshop-based digital storytelling provides the grounds for using the practice both for exercising inclusion and for sharing stories and learning from each other's experiences and lives. The exercise also gives us the opportunity to listen to various experiences reflected in various forms of narrative, rather than fitted into one form. Once such a circle of listening and sharing is formed, creating an individual digital story through the digitalization stages becomes an experience of collaboration, in which trust and understanding are cultivated, rather than competition. The digitalization of each individual story through the technical phases of the workshop also provides the facilitators the ground to encourage the participants to collaborate and share their feedback on the digital stories of fellow participants. Then, during the in-group screening, the final stage of a workshop, the story circle is completed by watching the final versions of the individual digital stories and sharing thoughts and ideas about each other's work. This process also contributes to building a public conversation ecology. On the other hand, digital storytelling workshops provide social scientists with a rich data set that can be collected with multiple methodologies.
Digital storytelling workshops have been used by the Hacettepe University Faculty of Communication Digital Storytelling Unit in its projects and courses. Through the research linked to digital storytelling workshops, the unit has been connecting the academic sphere with various communities on and off campus, such as LGBTI communities, refugees, and NGOs focusing on gender equality and violence. Interdisciplinary collaboration has arisen through the Digital Storytelling MA course, which also serves as the facilitator training module for interested parties. With such a combination, the Unit connects the theory behind digital storytelling for inclusion with communities, in other words with the field. Close collaboration with educational sciences started with a PhD thesis completed recently by Çıralı, titled “Teachers’ Professional Self-understanding and its Reception by Prospective Teachers through Digital Storytelling”. In this thesis, Çıralı focused on the reflections of teachers’ professional self-understanding and the reception of these reflections by prospective teachers [18]. It is important here to point out that sharing experiences empowers the participants of digital storytelling workshops, as they get the chance to reflect on their own personal experiences as well as to listen to others in a setting where the aim is to complete the task of creating a digital story. Digital storytelling workshops are processes in which the participants are active members of a small community of practice for a day or two, depending on the length of the workshop. This brings up the connection between inclusion and being active members of a community.

3 In the Search of Active Life Digital Storytelling Workshop

This study is part of the ERANET-LAC project titled “Smart Ecosystem for Learning and Inclusion” funded by the European Union. The project lays emphasis on the topic of digital exclusion and the inaccessibility of education for disadvantaged groups. These challenges offer the potential for improving the digital competences of teachers in the LAC and EU regions, and they have led to the extensive participation of citizens who have relatively poor access to the innovative technologies involved in education, training, and inclusion through ICT. The current research aims at empowering physical education teachers through workshop-based digital storytelling, one of the inclusive approaches identified in the project. In October 2019, at the Digital Storytelling Unit at Hacettepe University, seven participants joined a digital storytelling workshop run by a facilitation team formed of Burcu Şimşek, Özgür Yaşar Akyar, Şengül İnce, and Çağrı Çakın. The participants were pre-service or in-service physical education and sports teachers interested in taking part in the workshop: 3 pre-service teachers (Göktuğ, Reyhan, Zeynep), 2 Ph.D. students (Emre, Nehir), 1 graduate student (Eren) studying in the Physical Education and Sports Teaching MA Program, and one experienced physical education teacher (Evren). The participants joined the workshop after filling in participation consent forms.
This digital storytelling workshop was also the first facilitation experience of Özgür Yaşar Akyar (the second author of this paper), who took the Digital Storytelling course given by Burcu Şimşek (the first author) in the Communication Sciences MA Program. The idea for this workshop was developed by Akyar, who holds a master's degree in Sports Technology and continues his Ph.D. studies as a research assistant in the Physical Education and Sports Teaching Department. He offers training courses as part of non-formal education, seeks to improve the quality of life of individuals through digital solutions, and is particularly interested in using digital storytelling as a way of enhancing inclusion. Şimşek's expertise on inclusion and digital storytelling was likewise in use, as she served as lead facilitator and researcher.
The theme of this workshop was agreed to be “active living”, in order to seek what sorts of stories would fill in this wider theme. In the Physical Education and Sports Teaching program in Turkey, one of the core learning areas is called “active and healthy living”. Active living in the context of the physical education and sports course is understood as physical activity, and the curriculum includes mostly physical activity-related outcomes [8]. Physical inactivity is discussed as one of the factors decreasing quality of life [2, 4, 7, 9]. Therefore, it can be said that physical education teachers have an important role in promoting active lifestyles in the early years of education, which can improve the quality of life of individuals through increased physical activity. This brings us to the understanding of quality of life.
Schalock and others conceptualize quality of life as composed of eight core domains, initially synthesized and validated through an extensive review of the international quality of life literature across the areas of intellectual disability, special education, behavior and mental health, and aging [10]. These domains are grouped under the higher-order factors of independence, social participation, and well-being. The independence factor is composed of personal development and self-determination; social participation is composed of interpersonal relations, social inclusion, and rights; and well-being is composed of emotional, physical, and material well-being [13].
Upon completion of the workshop, the digital stories were translated into English and, with English subtitles, uploaded to the official website of the Digital Storytelling Workshop Unit at Hacettepe University at http://www.digitalstoryhub.org/Aktif-Ya-ami-Aramak. The name of the workshop was discussed and agreed on by the participants to be “In the search for active life”.
Taking the limitations of this paper into consideration, we take an overall look at the content of the digital stories told in the workshop titled “In the search for active life”, and we aim to link these active life narratives to the quality of life categories.
The digital stories in this collection remind us of the individualized, diverse nature of one's quality of life, in terms of what the person contributes to bringing about change and what person-centered supports and opportunities exist for an active life. Therefore, the participants' digital stories are discussed through the lens of individual quality of life, based on the eight domains grouped under the three higher-order factors: independence (composed of personal development and self-determination), social participation (composed of interpersonal relations, social inclusion, and rights), and well-being (composed of emotional, physical, and material well-being) [13].

3.1 Material Wellbeing


Material well-being was mentioned only briefly during the story circle phase of the workshop by most of the participants. This factor was pointed out as a quality indicator that would be fulfilled in any case.

3.2 Physical Wellbeing


One of the most important findings of this workshop was that our participants did not think about active life only as being physically active.
“I don’t think that being active is only being physically active. We must be active in cognitive, social or emotional terms, we need to be active citizens.” (Emre)1.
“My active life changed when I combined my 15-year volleyball history with teaching. But the movement always accompanied me.” (Eren)2.
“The comments on my profession like ‘You are a Physical Education teacher, you are always moving, you are playing sports every day’ did not reflect the truth. Because I didn’t do sports regularly every day anymore, I couldn’t go out and walk, run and see my friends. I had to get outside the daily routine and find the light that would change my life that was trapped in the house, work, and traffic.” (Evren)3.
“I’m so used to this intense and active tempo that I would rather run to the nature and go hiking than sleep in the off-season. And this is how I rest my soul actively.” (Zeynep)4.
All of these extracts show us that although the occupations of our participants are related to physical performance as sportspeople, their understanding of active life was not limited to physical wellbeing.

3.3 Emotional Wellbeing


Our participants reflected on emotional wellbeing as an important aspect of their quality of life. Their stories also gave us insight into how they distinguish between several physical activities to fulfill their physical needs and their emotional needs.
“When I came to the city, I felt bad. Because I was stuffed among apartment
buildings. My freedom was restricted.” (Emre).
“I would say that I live an active life with two dimensions in my life.
While Pilates relaxes in physical terms, dancing comforts me emotionally.”
(Reyhan)5.

1 https://vimeo.com/376768915.
2 https://vimeo.com/376770075.
3 https://vimeo.com/376770577.
4 https://vimeo.com/376773905.
5 https://vimeo.com/376772917.

“I believe a hectic life reduces the quality. For this reason, I always try to be
prepared for the places where I’m going. We can also call it the responsibility of being
an athlete.”(Zeynep)6.

3.4 Personal Development


The process of becoming a good person was reflected intensively in one of the stories. We link this strong intention to the value given to personal development.
“I mean, trying to be a good person.”
“I have been physically active as a professional basketball player for a very long
time. I had to quit when I had an injury.” (Eren)
“I play Go to know myself and to understand life. I believe that it will make me a
better person.”(Eren).

3.5 Self-determination
In the stories of Evren, Nehir, and Reyhan, we realised that the distance to self-determination varied depending on the challenges they faced.
“I had lost my independence like a child whose toys were taken away. The com-
ments on my profession like “You are a Physical Education teacher, you are always
moving, you are playing sports every day” did not reflect the truth.”(Evren).
“For me, active life means satisfying and clearing my mind, and participating in
environments where I feel free.”(Nehir)7.
“When I was at High School, I started to have problems with my back pain, and it
caused sensitivity particularly in my back. It also had a negative effect on my waist
flexibility.
Right now, I do Pilates for myself, and I’ve been teaching and included it in my
life.
I had difficulties at first, but in time, I started to feel positive effects on my body.
My Scoliosis levels improved.” (Reyhan).

3.6 Rights
Emre’s story pointed strongly to rights and the conditions of living together. In other words, Emre defined active life as being a responsible citizen who takes action when faced with a public issue. In this respect, Emre’s story contributed to our argument that active life cannot be defined only as being physically active but also has close connections to social aspects.
“When I go jogging in the park, I immediately report the problems I see around me
to the Hello Blue Desk. Now the Blue Desk recognizes me, and they answer my phone
as “Emre Bey”. When I walk down a street and see a blown water valve or some

6 https://vimeo.com/376773905.
7 https://vimeo.com/376771327.

garbage, and when you report these, it is active citizenship, active life. Obviously, it is
our duty.”(Emre).
“In my opinion, active life equals active citizenship.”(Emre).

3.7 Interpersonal Relationship


Most of our participants reflected on active life in relation to interpersonal relationships: their connections to friends and family as well as to other communities. This aspect certainly cuts across emotional wellbeing.
“My active life understanding is that being cognitively, socially and physically
active.”(Emre).
“I play Go, which outweighs the cognitive and social aspects.”(Eren).
“I believe that my active life is no longer my movements during the day, but the
time that is spent with family, friends and outside environment.” (Göktuğ)8.
“Active life is about being in a socially and culturally rich environment in the time after training, for me.” (Zeynep)9.
“I enjoy chatting with my foreign friends and learning their culture while I learn
languages.”(Zeynep).

3.8 Social Inclusion


Looking at all the different ways of being active reminds us that there is not one way of living but many ways of living, in diversity. The stories of our participants provide various narratives of individuals in one occupational group, trained as physical education teachers. Their experiences and their standpoints in life differ from one another.
“With the head of puerperality and the pressure of society, for 1.5 years I lived a life without sports.”
Göktuğ emphasises recognising diversity by saying, “Actually, the active lives of all people should be different.” (Göktuğ).

4 Concluding Remarks: Connecting Active Living to Inclusion in the Education of Physical Education Teachers

The United Nations Sustainable Development Goals10 aim to provide the member states with a frame for developing the lives of their citizens. Goal 3 (Good health and well-being), Goal 4 (Quality education), Goal 5 (Gender equality), and Goal 10 (Reduced inequalities) are directly connected to the wider issue of inclusion. Once we focus on the meanings of active life provided to us by the digital storytellers in our Active Living Digital Storytelling Workshop, we realize that
8 https://vimeo.com/376770986.
9 https://vimeo.com/376773905.
10 https://sustainabledevelopment.un.org/?menu=1300.

gender inequalities might affect women’s engagement with being physically active for themselves, as their lives might be occupied with duties delegated to them due to sex roles. Quality education seems to be the other important connection, as all of our participants consider personal development an important part of their active life. Overall, good health and well-being in relation to active life is seen not only as being physically well but also as being emotionally well. Here it is important to point to the fact that well-being is not a personal matter but a social one. Şimşek [19] points to women’s well-being in relation to political participation and social inclusion. In our case, being a responsible and responsive citizen is one of the significant meanings of an active life.
This research on active life can help physical education teacher education program developers to design inclusive learning settings, inspired by the diverse views of pre-service and in-service teachers. When we closely examined the Sport Sciences Undergraduate Program at Hacettepe University, of which most of our participants are graduates or current students, we found that the courses that can contribute to students’ understanding of inclusion, through their contribution to the National Qualifications Framework of the Higher Education Council, which regulates higher education programs according to learning outcomes, are not directly about inclusion but about communication, social competences, work-related competences, and learning competences. The courses that seem to contribute to inclusion in the program are: Introduction to Education, Instructional Principles and Methods, Training Theory, Outdoor Sports, Drama, Critical Thinking, School Experience, Teaching Practice, and Community Service. However, in none of these courses are digital storytelling workshops used as a tool to ignite an inclusive ecology for sharing learning and teaching experiences that shortens the distance between the parties of education. As Şimşek et al. pointed out in an earlier study, there needs to be more critical research on educational sciences in relation to digital storytelling, where communication sciences can provide alternatives [5].

Acknowledgement. This work was supported by the ERANET-LAC project which has
received funding from the European Union’s Seventh Framework Programme. Project Smart
Ecosystem for Learning and Inclusion – ERANet17/ICT-0076SELI.

References
1. Greeno, J.G., Gresalfi, M.S.: Opportunities to learn in practice and identity. In: Assessment,
Equity, and Opportunity to Learn, pp. 170–199 (2008)
2. Gu, X., Chang, M., Solmon, M.A.: Physical activity, physical fitness, and health-related
quality of life in school-aged children. J. Teach. Phys. Educ. 35(2), 117–126 (2016)
3. Hartley, J., McWilliam, K.: Story Circle. Wiley-Blackwell, Chichester (2009)
4. Heesch, K.C., van Gellecum, Y.R., Burton, N.W., van Uffelen, J.G., Brown, W.J.: Physical
activity and quality of life in older women with a history of depressive symptoms. Prev.
Med. 91, 299–305 (2016)
5. Şimşek, B., Usluel, Y.K., Sarıca, H.Ç., Tekeli, P.: Türkiye’de eğitsel bağlamda dijital hikâye
anlatımı konusuna eleştirel bir yaklaşım. Eğitim Teknolojisi Kuram ve Uygulama 8(1), 158–
186 (2018)

6. Kotluk, N., Kocakaya, S.: Researching and evaluating digital storytelling as a distance
education tool in physics instruction: an application with pre-service physics teachers.
Turkish Online J. Distance Educ. 17(1), 87–99 (2016)
7. Lok, N., Lok, S., Canbaz, M.: The effect of physical activity on depressive symptoms and
quality of life among elderly nursing home residents: Randomized controlled trial. Arch.
Gerontol. Geriatr. 70, 92–98 (2017)
8. MoNE program. http://mufredat.meb.gov.tr/Dosyalar/2018120201950145-BEDENEGITIMIVESPOROGRETIMPROGRAM2018.pdf. Accessed 5 Dec 2019
9. Rödjer, L., Jonsdottir, I.H., Börjesson, M.: Physical activity on prescription (PAP): self-
reported physical activity and quality of life in a Swedish primary care population, 2-year
follow-up. Scand. J. Primary Health Care 34(4), 443–452 (2016)
10. Schalock, R.L., Verdugo, M.A., Gomez, L.E., Reinders, H.S.: Moving us toward a theory of
individual quality of life. Am. J. Intellect. Dev. Disabil. 121(1), 1–12 (2016)
11. Şimşek, B.: Enchancing women’s participation in Turkey through digital storytelling.
J. Cult. Sci. 5(2), 28–46 (2012)
12. Jamissen, G., Hardy, P., Nordkvelle, Y., Pleasants, H.: Digital Storytelling in Higher
Education - International Perspectives. Springer, Cham (2017)
13. Schalock, R.L., Verdugo, M.A.: Handbook on Quality of Life for Human Service
Practitioners. American Association on Mental Retardation, Washington, DC (2002)
14. Şimşek, B.: Hikâye anlattıran, Hikâyemi Anlatan, Kendi Hikâyesini Yaratan Çember. In:
Ergül, H. (der.) Sahanın Sesleri. İstanbul Bilgi Üniversitesi Yayınları, İstanbul (2013)
15. Şimşek, B.: İletişim çalışmaları bağlamında dijital hikâye anlatımı: Kavramlar ve Türkiye
deneyimi. Alternatif Bilişim, İstanbul (2018)
16. Vescio, V., Ross, D., Adams, A.: A review of research on the impact of professional learning
communities on teaching practices and student learning. Teach. Teach. Educ. 24, 80–91
(2008)
17. Lambert, J.: Digital Storytelling: Capturing Lives. Creating Community. Routledge,
Abingdon (2013)
18. Çıralı Sarıca, H.: Öğretmenlerin Dijital Hikâye Anlatımı Üzerinden Mesleki Kendini
Anlayışları ve Öğretmen Adaylarınca Alımlanması (2019). http://openaccess.hacettepe.edu.tr:8080/xmlui/handle/11655/8043. Accessed 5 Dec 2019
19. Şimşek, B.: Digital storytelling for women’s well-being in Turkey. In: Dunford, M., Jenkins,
T. (eds.) Digital Storytelling: Form and Content. Palgrave Macmillan, London (2018)
Accessibility Recommendations for Open Educational Resources for People with Learning Disabilities

Valéria Farinazzo Martins1,2(&), Cibelle Amato2, Łukasz Tomczyk3, Solomon Sunday Oyelere4, Maria Amelia Eliseo1, and Ismar Frango Silveira1

1 Computing and Informatics Department, Mackenzie Presbyterian University, São Paulo, Brazil
{valeria.farinazzo,mariaamelia.eliseo,ismar.silveira}@mackenzie.br
2 Developmental Disorders, Mackenzie Presbyterian University, São Paulo, Brazil
[email protected]
3 Faculty of Social Science, Pedagogical University of Cracow, Cracow, Poland
[email protected]
4 School of Computing, University of Eastern Finland, Joensuu, Finland
[email protected]

Abstract. In order to contribute to the increasing inclusion of people who have
long been left out of society, it is possible to construct accessible didactic
material for specific audiences, such as people with learning disabilities or with
other barriers to the full achievement of their learning processes, for instance
elderly, deaf, and visually impaired people. This paper aims to contribute to the
area of accessibility by presenting a set of recommendations for authors of Open
Educational Resources who are not necessarily specialists in ICT, in order to
help the process of providing more accessibility for people with learning
disabilities.

Keywords: Learning disabilities · Accessibility · Open educational resources · Universal design for learning

1 Introduction

From the earliest years of life, the human being acquires knowledge through learning.
According to [1], learning is a necessary and universal process for the development of
culturally organized and particularly human psychological functions. Regarding formal
education, it has to be pointed out that access to learning is a right for all, regardless of
one’s disabilities.
On the other hand, learning disabilities are related to significant difficulties in the
acquisition and use of writing, speaking, listening, reading and mathematical problem
solving skills [2, 3]. Despite concerns about improving the theoretical foundation and

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 387–396, 2020.
https://doi.org/10.1007/978-3-030-45697-9_38

attempts to increase the quality of teacher education, there are still high rates of
children with learning disabilities who go unattended.
Many children present specific learning disabilities, such as dyslexia, dysgraphia
and dyscalculia. Research by the National Center for Education Statistics (NCES) in
the US indicates that 34% of students aged 3 to 21 have a specific learning
disability [4]. Schools have the mission of bringing knowledge to each child, with a
unique cognitive and genetic profile, maximizing their skills and knowledge. Thus,
children with learning disabilities should receive attention that minimizes their dis-
abilities. Therefore, using Universal Design for Learning [5], combined with Infor-
mation and Communication Technology, seems to be a way to address the issue of
exclusion of people with disabilities.
The different modes of learning show that students have specific needs that must be
met to make learning effective. Universal Design for Learning intends to make the
school curriculum flexible enough to meet the specific learning needs of the students,
i.e. their skills and knowledge, as well as their experiences. It prioritizes a set of
principles intended to provide students with the same opportunities to learn while
focusing on individual differences in skills, needs and interests [5]. With technology as an
ally, it sets out to adopt the most efficient and appropriate materials and methods to
reach all students. The combination of different media in content transmission supports
the development of flexible learning content that can meet the different learning needs
of students.
Besides, the adoption of Open Educational Resources (OER) brings a whole new
scenario of possibilities for adapting already existing content to meet specific
requirements [6]. By relying on open licenses and formats, OER make it possible to
reduce adoption costs and make it more feasible to design and deliver courses that
comply with specific accessibility needs.
In this context, the aim of this paper is to present recommendations for the
construction of accessible OER for people with learning disabilities, targeted at
teachers with or without previous ICT knowledge. When designing accessible OER, it is important to
know the students’ profile and to establish the limitations arising from learning
difficulties.
This paper is structured as follows. Section 2 provides the necessary background
for understanding this paper’s context: learning disabilities, universal design for
learning, accessibility guidelines and related work. Section 3 presents the materials
and methods. Section 4 provides the accessibility recommendations. Finally, Sect. 5
draws some conclusions.

2 Background
2.1 Learning Disabilities
Learning disabilities are relatively common conditions and refer to a heterogeneous
group of disorders that manifest as significant difficulties in the acquisition and use of
writing, speaking, listening, reading and mathematical problem-solving skills [2, 3].
According to the International Classification of Diseases (ICD), a learning disability is
considered a condition of interrupted or incomplete development in cognitive
functioning or adaptive behavior in the developmental period [7]. The Diagnostic and
Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) [8], the main reference
in professional practice and research in this field, includes difficulties in writing,
reading and calculating, as well as difficulties not otherwise specified, in a category
called Specific Learning Disorder. The manual classifies the disorder as mild,
moderate or severe.
It is noteworthy that, according to DSM-5, for a specific learning disorder to be
characterized as such, it is necessary to identify at least one of the symptoms described
in the manual, and that symptom must persist for at least six months despite possible
interventions [8].
These disorders are understood to be intrinsic to the individual, supposedly due to
central nervous system dysfunction; there may be accompanying disabilities (e.g.
intellectual disabilities, severe emotional disorders, and sensory deficits), and they may
occur at any time during a person’s life. However, extrinsic circumstances arising from
an individual’s surrounding context, such as cultural differences, inadequate teaching,
or the presence of comorbid conditions such as ADHD, can have a strong influence on
the diagnosis and progress of learning disabilities [8, 9].
In addition to interfering with learning basic skills such as math, reading and/or
writing, processing problems can interfere with higher-level skills such as attention,
organization, time planning, long or short-term memory. Learning disabilities can affect
an individual’s life beyond academics and can impact relationships with family,
friends, and the workplace [9].
Many studies have been conducted over the last decades to understand the basis of
these neurodevelopmental disorders, leading to the identification of some altered
specific neural networks, although there is still no complete understanding of the
mechanisms involved [10–12].
The lack of consensus on the conceptualization of learning disabilities and the use
of different models to identify them are pointed to as possible reasons for the small
number of studies on the effects of implementing prevention programs. Even in this
scenario, the prevention of learning disabilities is a topic of great
relevance in the clinical and educational areas [13].
Even if there is no consensus on the conceptualization as a difficulty or a disorder,
it is important that intervention proposals help reduce the negative impact on school
performance and on social and emotional aspects beyond learning.

2.2 Universal Design for Learning


The concept of Universal Design for Learning (UDL) emerged when the concept of
Universal Design [14], from the 1980s, was applied to the educational context. UDL is
a proposal that aims to ensure access to content for all students, regardless of their
limitations, so that the goal of education shifts from knowledge acquisition to student
experience.
According to [5], UDL consists of a set of principles that constitute a practical
model in order to maximize learning opportunities for all students. These principles are
based on neuroscience and the use of media to help educators reach all students by
adopting appropriate learning objectives, choosing and developing efficient materials
and methods, and building fair and accurate ways to measure student progress.
In order for students to have access to knowledge, there must be changes in four
aspects of the curriculum: 1) Goals: listing the knowledge and skills students are
expected to reach; 2) Evaluation: monitoring the student’s evolution and proposing
changes whenever necessary; 3) Methods: offering several learning contexts, offering
multiple types of learning resources and keeping the student motivated and proactive
in the task; 4) Content: it should be in accordance with the learning goals. According
to the UDL principles, flexibility of curricula occurs through the ability to provide [5]:
• Multiple modes of presentation: this can be achieved by providing options for
perception, such as options to customize the information display and hearing and
visual alternatives; offering options for language and symbols; and giving options
for understanding, i.e., using strategies related to activating or providing background
knowledge and highlighting interactions, essentials, main ideas and connections.
• Action and expression: use strategies to diversify response methods and paths;
optimize access to tools and assistive technology; and offer options for executive
functions, such as supporting development planning and strategy and options that
facilitate information and resource management.
• Engagement: provide options to encourage student interest by maximizing relevance,
value and authenticity and minimizing insecurity and anxiety; provide options to
sustain effort and persistence, such as options that vary levels of challenge and
support and that foster collaboration and a sense of community; and provide
self-regulation options, using strategies to promote expectations and anticipations
that optimize motivation and developing self-assessment and reflection.
Furthermore, the use of technology is crucial to guarantee access to content, as well
as to allow students to be more independent and autonomous in academic tasks.
Technologies can reduce methodological barriers, providing the same curriculum for
all, but with personalized goals, methods, evaluation and content [15].

2.3 Accessibility Guidelines


Web accessibility barriers have made it difficult for people with disabilities to navigate
the Web. Concerned about these barriers, the W3C launched the Web Accessibility
Initiative (WAI) in 1997. Implementing accessible web pages has been found to benefit
not only disabled people but also other users, as well as devices with limited resources,
such as mobile devices. This initiative drafted and published the Web Content
Accessibility Guidelines 1.0 (WCAG 1.0) in the late 1990s. In order to make web
content accessible to anyone, regardless of the device used (desktop, mobile, etc.),
WCAG 1.0 defines fourteen general guidelines, or principles, for an accessible project.
Each of the guidelines is associated with checkpoints that explain how it should be
applied, providing links to technical documents with examples for implementing such
points. Checkpoints are assigned priority levels, depending on the impact they may
have on accessibility [16]. Meeting the recommendations of each priority level
determines the level of conformance achieved by the website [17]. Priority levels are
numbered from 1 to 3: Priority 1 checkpoints must be satisfied, otherwise it will be
impossible for one or more groups to access the web content; Priority 2 checkpoints
should be satisfied, otherwise some groups will have difficulty accessing the content;
and Priority 3 checkpoints may be satisfied, making it easier for some groups to gain
access [16].
WCAG 1.0 was updated in 2008, resulting in the publication of WCAG 2.0, which
complements it and was designed to apply broadly to Web technologies. These
guidelines are organized around four main principles: perceivable (information and
interface components must be presented so that users can perceive them); operable
(interface and navigation components must be operable); understandable (information
and the operation of the interface must be understandable); and robust (content must be
robust enough to be interpreted reliably by a wide variety of users) [18].
The last update of the W3C accessibility guidelines took place in 2018, with the
publication of WCAG 2.1, which again does not nullify the previous version; on the
contrary, it complements it. The goal of WCAG 2.1 is to improve accessibility
guidance for three major groups: users with cognitive or learning disabilities, users
with low vision, and users with disabilities using mobile devices [19].
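Some WCAG checkpoints lend themselves to automated checking at authoring time. The following sketch, not part of the original paper, illustrates the idea for one "perceivable" checkpoint: images must carry a text alternative. The class name and the handling of empty alt attributes are our illustrative assumptions.

```python
# Illustrative sketch: a minimal automated check for the WCAG requirement
# that images carry a non-empty text alternative (alt attribute).
from html.parser import HTMLParser

class AltTextChecker(HTMLParser):
    """Collects the src of <img> tags that lack a non-empty alt attribute."""
    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            alt = attr_map.get("alt")
            if alt is None or not alt.strip():
                self.violations.append(attr_map.get("src", "<unknown>"))

page = ('<p><img src="brain.png" alt="Diagram of neural networks">'
        '<img src="logo.png"></p>')
checker = AltTextChecker()
checker.feed(page)
print(checker.violations)  # -> ['logo.png']
```

A production validator would of course cover many more success criteria (contrast, captions, link text, and so on); this only shows how one checkpoint maps to a mechanical test.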

2.4 Related Work


When analyzing accessibility issues, it is worth referring to studies related to the design
of user-friendly interfaces and to research results. American researchers testing
Universal Design for Learning (UDL) among people with disabilities using digital
platforms noticed that students with various learning deficits are much more involved
in learning new material through UDL platforms. In addition, the same group found
that the overall results of the final test were higher with UDL than without such
solutions [20]. However, when designing learning support systems for people with
deficits, it is worth taking into account the diversity of disabilities as well as the
specific characteristics of the learner, which makes it possible to focus on the type of
deficit [21].
The way in which a learner engages in a learning process is important to his or her
performance. The UDL overcomes the barriers of deficits and reinforces deep learning
[22]. It is interesting in this context to recall the results of systematic content analysis.
Researchers using this technique to process the available results have drawn some
interesting conclusions. Firstly, the availability of e-learning is an urgent need for
people with cognitive disabilities. So far, there are still too few analyses focused on
cognitive accessibility. Typically, research on e-learning and UDL is focused on
specific disorders/diseases rather than on the cognitive functions of the learner.
One of the factors forcing a change in the approach outlined above is the systemic
transformation resulting from the evolution of higher education. According to British
researchers, the application of UDL has a chance to increase the inclusiveness of many
dimensions of educational activities. UDL is also a way to implement a strategy of
excellence in many institutions dealing with adult education [23]. The metatheoretical
results of analyses related to the learning process show UDL as an intelligent strategy
for the inclusion of people with disabilities in the information society [24]. However,
practitioners designing platforms for people with disabilities draw
attention to several important criteria. First, language is a key element: increasing the
effectiveness of a digital learning environment requires the use of transparent
language, understood by people with different disabilities. This is often supported by
graphic solutions (diagrams, mind maps, pictograms, etc.). Another important element
is the design of effective navigation through the platform content. Both an overload
and a shortage of photos or videos can disrupt the learning process. It is also important
for UDL to include summaries and condensed overviews of knowledge and skills [25].
UDL is a concept with extensive applications. Currently, the literature on the
subject shows that the concept works well in various thematic areas and can be
combined with the increasingly popular VR and AR technologies [26]. Often, authors
point to the possibility of using UDL in the design of applications and platforms that
serve not only a diverse group of students but also a diversity of hardware (including
popular mobile devices) [27].
Inclusion has always been one of the main OER premises. Although many
discussions [28, 29] have centered on the social aspect of inclusion, since granting open
access to high-quality learning content would help to break some important
socioeconomic barriers to education, the case for OER as a key enabler of accessibility
has also been made by other authors, such as [30]. Compliance with the openness
principles [31] is a core aspect of guaranteeing that accessible material can be adapted,
remixed, revised, repurposed and redistributed according to specific learning
requirements, especially those related to learning disabilities.

3 Materials and Methods

From the UDL principles, the W3C guidelines, the OER recommendations, the
features of people with learning disabilities, and the authors’ expertise in building
accessible material, we propose some recommendations for educators as authors of
accessible OER. A methodological cut was made, taking as the target audience of the
OER only people with some learning disability or barrier.
We then generated a list of recommendations for authors of OER for this audience
regarding the care they should take to create or use text, video, images, sounds and
other resources. This list was created from the literature and also from the authors’
know-how in the generation of accessible material. These resources should be used
with the support of an authoring tool for the generation of accessible teaching courses.

4 Recommendations for Authors of Accessible OER for People with Learning Disabilities

The concept of UDL is closely associated with the use of technology; however, UDL is
not just the use of technology in education [32]. It is also about the pedagogical and
instructional practices used with students with or without disabilities.
Thus, to build accessible OER, we can think of two complementary scenarios: the
use of technologies that provide facilitators for students (such as screen readers, font
enlargement, calculators, speech recognition, speech synthesizers, etc.), and the
instructional and pedagogical practices that teachers should adopt to meet the
conditions of their students. As a background, all aspects of openness brought by OER
must be considered in the authoring process [31].
The work presented in [33] already pointed to technologies that could be used to
help people with learning difficulties. The author cites, for example, word processing,
spell checking and proofreading programs, and speech recognition to minimize written
language problems; speech synthesis and optical character recognition systems for
reading problems; personal data managers and free-form databases for organization
and memory issues; and talking calculators for math problems.
Table 1 presents a summary of the main difficulties presented by people with
learning disabilities and of how they can be minimized by technological and/or
pedagogical resources, based on [32, 33] and on the authors’ practice in developing
accessible digital material.

Table 1. Relationship between the difficulties presented by people with learning difficulties and
the computational resources (source: authors).

Difficulties                Technological and pedagogical resources
Reading                     Screen reader; shorter and simpler texts; auxiliary vocabulary; no abbreviations
Writing                     Typed text; spell checker
Calculating                 Calculator; numeric ruler
Attention                   More than one media resource (image, video, text, sound); frequent feedback
Time planning               No time limits on activities, or extra time to complete them
Long- or short-term memory  Videos/images/links
Organization                Index of contents
Cognitive problems          Alternative texts for images and links; simpler texts; videos and other multimedia resources to complement the understanding of texts; tips and a glossary for less common words

Some of the features presented in Table 1 may be provided through digital tech-
nologies to be made available to students. However, there are strategies to be imple-
mented by the authors of teaching materials. In order to guide educators to build
accessible OER for people with learning disabilities, the following recommendations
were generated, divided into general content, non-textual content (video, image, ani-
mation, audio) and exercises/activities. These recommendations can be inserted in an
authoring tool for creating accessible digital material.
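As a sketch of how the Table 1 mapping could be embedded in such an authoring tool, the snippet below encodes each difficulty together with the resources an author should provide, and looks them up for a given learner profile. The dictionary contents and function name are illustrative assumptions, not an interface defined by the paper.

```python
# Hypothetical encoding of the Table 1 difficulty -> resource mapping,
# queried by an authoring tool to suggest what an author should provide.
DIFFICULTY_RESOURCES = {
    "reading": ["screen reader", "short simple text", "auxiliary vocabulary"],
    "writing": ["typed text", "spell checker"],
    "calculating": ["calculator", "numeric ruler"],
    "attention": ["multiple media resources", "frequent feedback"],
    "time planning": ["no time limit", "extra time"],
    "memory": ["videos/images/links"],
    "organization": ["index of contents"],
}

def suggest(profile):
    """Return the union of resources recommended for a learner profile,
    preserving the order in which difficulties are listed."""
    suggestions = []
    for difficulty in profile:
        for resource in DIFFICULTY_RESOURCES.get(difficulty, []):
            if resource not in suggestions:
                suggestions.append(resource)
    return suggestions

print(suggest(["reading", "attention"]))
# -> ['screen reader', 'short simple text', 'auxiliary vocabulary',
#     'multiple media resources', 'frequent feedback']
```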

4.1 General Content

• Use words from the students’ daily life. If you need to use unusual words, create a
glossary with the meaning of these words.
• Do not use color, sound or shape as the sole resource for understanding content and
for feedback.
• Avoid using text in images unless it is essential (examples: trademarks and logos)
or can be customized by the user.
• Do not insert animations of more than 5 s unless they are essential.
• Create an index of the content that will be displayed.
• Avoid using abbreviations.
• Maintain a consistent pattern for the objects that make up the material, such as
titles, content, feedback and image descriptions.
• The contents (textual, video, sound, etc.) must not be too long. This means the
content should be shorter than what you would use for typical students, and more
direct and clear.
• If possible, create materials at different levels of depth. Use the shallowest level to
present the context and a deeper level, such as a “read more”, for details.
• Use images, graphics and videos to help in understanding the content.
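Some of the recommendations above (avoiding abbreviations, keeping content short and direct) can be flagged automatically while the author writes. Below is a hypothetical sketch of such a check; the abbreviation list and the sentence-length threshold are our own assumptions, not values prescribed by the paper.

```python
# Hypothetical authoring-time check for two of the general-content
# recommendations: avoid abbreviations, keep sentences short and direct.
import re

ABBREVIATIONS = {"e.g.", "i.e.", "etc."}   # assumed list, extend as needed
MAX_WORDS_PER_SENTENCE = 20                # assumed threshold

def review(text):
    """Return a list of warnings for content that breaks the recommendations."""
    warnings = []
    for abbr in ABBREVIATIONS:
        if abbr in text:
            warnings.append(f"avoid abbreviation: {abbr}")
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = sentence.split()
        if len(words) > MAX_WORDS_PER_SENTENCE:
            warnings.append(f"sentence too long ({len(words)} words)")
    return warnings

print(review("Use short sentences. Avoid symbols, i.e. special characters."))
# -> ['avoid abbreviation: i.e.']
```

An authoring tool would surface these warnings as suggestions rather than hard errors, since the recommendations are guidance, not rules.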

4.2 Non-textual Content (Video, Image, Animation, Audio)

• Provide information about the meaning of the content, its purpose (what it is for),
and an accessible description of it.
• It is desirable that videos have subtitles in the readers’ mother language.
• If a video has no subtitles, subtitling software can be used, such as Movavi Clips
(https://www.movavi.com/), Wave.video (https://wave.video/), InShot
(https://inshoteditor.br.uptodown.com/android) or Clipomatic
(https://www.apalon.com/clipomatic.html).
• Images should not have too many visual elements, so as not to confuse the student.
• Do not use overly long audio content carrying too much different information.

4.3 Exercises/Activities

• Establish different difficulty levels. Start with the least complex exercises/activities.
• Give feedback on the response to exercises/activities.
• If the exercise/assessment has a time limit, the teacher may set extra time or disable
the use of time.

5 Conclusion

This paper has presented recommendations for creating accessible OER for people
with learning disabilities. These recommendations were based on the W3C guidelines,
the Universal Design for Learning principles, the openness principles and also on the authors’
empirical experience in preparing accessible educational resources. Relying on the
assumption that authors of educational resources will always make them in an
accessible and open way is a big mistake. Teachers often are not aware of accessibility
or openness, and they usually are not properly trained in using tools to create
accessible, open resources, or even simple Web pages, for example. In this sense, this
set of good practices for designing accessible OER can serve as a reference for that
process. Further work points to the generation of recommendations for all types of
disabilities, such as dyslexia, motor disability, deafness or low hearing, among others,
as well as to the design of a computational artifact to support the accessible OER
design process.

Acknowledgment. This work was supported by the ERANET-LAC project which has received
funding from the European Union’s Seventh Framework Programme. Project Smart Ecosystem
for Learning and Inclusion – ERANet17/ICT-0076SELI. The work was also supported by the
Coordenação de Aperfeiçoamento de Pessoal de nível superior - Brazil (CAPES) - Programa de
Excelência - Proex 1133/2019 and Fundação de Amparo à Pesquisa do Estado de São Paulo
(FAPESP) 2018/04085-4.

References
1. Vygotsky, L.S., Luria, A.R., Leontiev, A.: Linguagem, desenvolvimento e aprendizagem
[Language, Development and Learning]. Ícone, São Paulo (1991)
2. National Joint Committee on Learning Disabilities: Operationalizing the NJCLD definition
of learning disabilities for ongoing assessment in schools. Learn. Disabil. Q. 21, 186–193
(1998)
3. Lagae, L.: Learning disabilities: definitions, epidemiology, diagnosis, and intervention
strategies. Pediatr. Clin. North Am. 55(6), 1259–1268 (2008)
4. National Center for Education Statistics (NCES). https://nces.ed.gov/programs/coe/indicator_cgg.asp. Accessed 21 Nov 2019
5. Rose, D.: Universal design for learning. J. Spec. Educ. Technol. 15(3), 45–49 (2000)
6. Baldiris, N., Margarita, S., et al.: A technological infrastructure to create, publish and
recommend accessible open educational resources. Revista Observatório 4(3), 239–282
(2018)
7. National Collaborating Centre for Mental Health UK: Challenging behaviour and learning
disabilities: prevention and interventions for people with learning disabilities whose
behaviour challenges (2015)
8. American Psychiatric Association: Diagnostic and statistical manual of mental disorders.
BMC Med. 17, 133–137 (2013)
9. García, T., Rodríguez, C., et al.: Executive functioning in children and adolescents with
attention deficit hyperactivity disorder and reading disabilities. Int. J. Psychol. Psychol. Ther.
13(2), 179–194 (2013)
10. Moreau, D., Waldie, K.E.: Developmental learning disorders: from generic interventions to
individualized remediation. Front. Psychol. 6, 2053 (2016)
11. Davis, S., Laroche, S.: Mitogen-activated protein kinase/extracellular regulated kinase
signalling and memory stabilization: a review. Genes, Brain Behav. 5, 61–72 (2006)
12. Samuels, I.S., Saitta, S.C., Landreth, G.E.: MAP’ing CNS development and cognition: an
ERK some process. Neuron 61(2), 160–167 (2009)
396 V. F. Martins et al.

13. González-Valenzuela, M.J., Soriano-Ferrer, M., Delgado-Ríos, M.: “How are reading
disabilities operationalized in Spain”. A study of practicing school psychologists. J. Child.
Dev. Disord. 2, 3 (2016)
14. Mace, R.L.: Universal design in housing. Assist. Technol. 10(1), 21–28 (1998)
15. Alnahdi, G.: Assistive technology in special education and the universal design for learning.
Turk. Online J. Educ. Technol.-TOJET 13(2), 18–23 (2014)
16. W3C: Web content accessibility guidelines 1.0 (1999). https://www.w3.org/TR/WAI-
WEBCONTENT/. Accessed 11 Oct 2019
17. Rocha, J.A.P., Duarte, A.B.S.: Diretrizes de acessibilidade web: um estudo comparativo
entre as WCAG 2.0 e o e-MAG 3.0. Inclusão Soc. 5(2), 75–78 (2012)
18. W3C: Web content accessibility guidelines 2.0 (2008). https://www.w3.org/TR/WCAG20/.
Accessed 11 Oct 2019
19. W3C: Web content accessibility guidelines 2.1 (2018). https://www.w3.org/TR/WCAG21//.
Accessed 11 Oct 2019
20. Hall, T.E., et al.: Addressing learning disabilities with UDL and technology: Strategic
reader. Learn. Disabil. Q. 38(2), 72–83 (2015)
21. Kuzmanovic, J., Labrovic, A.J., Nikodijevic, A.: Designing e-learning environment based on
student preferences: conjoint analysis approach. Int. J. Cognit. Res. Sci. Eng. Educ.
(IJCRSEE) 7(3), 37–47 (2019)
22. Hollingshead, A.: Designing engaging online environments: universal design for learning
principles. In: Cultivating Diverse Online Classrooms Through Effective Instructional
Design, pp. 280–298. IGI Global (2018)
23. Martin, N., et al.: Implementing inclusive teaching and learning in UK higher education–
Utilising Universal Design for Learning (UDL) as a route to excellence (2019)
24. Courtad, C.A.: Making your classroom smart: universal design for learning and technology.
In: Smart Education and e-Learning, pp. 501–510. Springer, Singapore (2019)
25. McKeown, C., McKeown, J.: Accessibility in online courses: understanding the deaf learner.
TechTrends 63, 506–513 (2019)
26. Menke, K., Beckmann, J., Weber, P.: Universal design for learning in augmented and virtual
reality trainings. In: Universal Access Through Inclusive Instructional Design: International
Perspectives on UDL, p. 294 (2019)
27. Armstrong, A.M., Franetovic, M.: UX and instructional design guidelines for m-learning. In:
Society for Information Technology & Teacher Education International Conference.
Association for the Advancement of Computing in Education (AACE) (2019)
28. Hockings, C., Brett, P., Terentjevs, M.: Making a difference—inclusive learning and
teaching in higher education through open educational resources. Distance Educ. 33(2), 237–
252 (2012)
29. Teixeira, A., et al.: Inclusive open educational practices: how the use and reuse of OER can
support virtual higher education for all. Eur. J. Open Distance E-Learn. 16(2) (2013). https://
www.eurodl.org/?p=special&sp=articles&inum=5&abstract=632&article=632
30. Navarrete, R., Luján-Mora, S.: Improving OER websites for learners with disabilities. In:
Proceedings of the 13th Web for All Conference. ACM (2016)
31. Silveira, I.F.: OER and MOOC: the need for openness. Issues Inf. Sci. Inf. Technol. 13, 209–
223 (2016)
32. King-Sears, M.: Universal design for learning: technology and pedagogy. Learn. Disabil. Q.
32(4), 199–201 (2009)
33. Raskind, M.H.: Assistive technology for adults with learning disabilities: a rationale for use.
In: Gerger, P.J., Reiff, H.B. (eds.) Learning Disabilities: Persisting Problems and Evolving
Issues, pp. 152–162. Andover Medical Publishers, Boston (1994)
Digital Storytelling and Blockchain as Pedagogy
and Technology to Support the Development
of an Inclusive Smart Learning Ecosystem

Solomon Sunday Oyelere1, Ismar Frango Silveira2,
Valeria Farinazzo Martins2, Maria Amelia Eliseo2,
Özgür Yaşar Akyar3, Vladimir Costas Jauregui4, Bernardo Caussin4,
Regina Motz5, Jarkko Suhonen1, and Łukasz Tomczyk6

1 University of Eastern Finland, Joensuu, Finland
{solomon.oyelere,jarkko.suhonen}@uef.fi
2 Mackenzie Presbyterian University, São Paulo, Brazil
{ismar.silveira,valeria.farinazzo,mariaamelia.eliseo}@mackenzie.br
3 Hacettepe University, Ankara, Turkey
[email protected]
4 Universidad Mayor de San Simón, Cochabamba, Bolivia
[email protected], [email protected]
5 Universidad de la República, Montevideo, Uruguay
[email protected]
6 Pedagogical University of Cracow, Kraków, Poland
[email protected]

Abstract. This study presents the work-in-progress implementation of a smart
learning ecosystem being developed to support learner-centered pedagogy, such
as digital storytelling, and recent technologies, such as blockchain and
microsites. The implementation of the ecosystem follows the design science
research framework and universal accessibility guidelines to provide users with
a smart, accessible and responsive learning environment. Besides helping the
teacher to analyze students' progress through the learning analytics component,
this ecosystem will help students to access learning content irrespective of
their disabilities and other constraints.

Keywords: Smart learning ecosystem · Digital storytelling · Blockchain ·
Inclusion

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 397–408, 2020.
https://doi.org/10.1007/978-3-030-45697-9_39

1 Introduction

In recent times, there has been massive interest in revamping the educational
environment to be open, accessible, trustworthy, and able to meet the expectations
of all stakeholders, including teachers, students, parents, regions, and
governments. These growing demands led to the birth of a joint project, the
Smart Ecosystem for Learning and
Inclusion (SELI, seliproject.org), supported by the European Union, Latin America and
the Caribbean [1, 16]. SELI addresses the crucial gap in 21st-century educational
goals through the design science research framework [2]: identifying the needs and
requirements of different regions; outlining the learning ecosystem and defining its
requirements; designing and developing the ecosystem; and, finally, validating and
evaluating the solution. Emerging pedagogies, methods, strategies and technologies
capable of supporting the seamless implementation of the learning ecosystem were
identified and developed according to universal accessibility standards [3–5]. The
main aspects of the SELI ecosystem include: authoring services; microsites (a small
cluster of web pages that presents a course with all its didactic content,
independent of the authoring tool); learning management system (LMS) and content
management system (CMS) services; digital storytelling pedagogy; learning analytics
services; and blockchain support. An ecosystem can be defined as "a community of
organisms in conjunction with environmental components interacting as a (semi-)
closed system" [6]. Briscoe and De Wilde [7] define a digital ecosystem as "an
artificial system that aims to harness the dynamics that underlie the complex and
diverse adaptations of living organisms in biological ecosystems". Boley and
Chang [8] offered the following definition: "an open, loosely coupled, domain
clustered, demand-driven, self-organizing agent environment, where each agent of
each species is proactive and responsive regarding its own benefit/profit but is
also responsible to its system." Many researchers have been developing digital
ecosystem-based solutions to address different problems in our society. As an
example, Mendoza et al. [9] developed a digital ecosystem for the digital literacy
gap. Before them, Silveira et al. [6] proposed LATIn, a digital ecosystem for open
textbooks in the context of Latin American higher education. More recently, Burns
and Dolan [10] proposed a set of policies, platforms, and systems as an ecosystem
to help include people as participants in the so-called "digital economy". The SELI
ecosystem, in turn, addresses the problem of inclusive education using new
technologies and pedagogy.

2 Blockchain Technology and Digital Storytelling for Inclusion

2.1 Blockchain Technology from an Inclusive Perspective


Inclusion is defined as the degree to which an employee perceives that he or she is
a valued member of the work group or educational community. It is important to
discern that inclusion is not independent of belonging; both are key elements in
company initiatives and, in a similar way, in learning, where learning is perceived
as a collaborative process. Collaborative learning is based on several psychological
currents, among them Vygotsky's sociocultural theory, which conceives the human
being as a product of social and cultural processes. Belonging, from the employee's
point of view, is: "I can be authentic, I matter, and I am essential to my team."
Learning-group diversity is a well-researched topic: the more diverse the learning
group, the more learning is achieved. This means that when we think of an inclusive
learning environment, we are intrinsically thinking of a learning environment
designed for diversity. According to Bourke et al. [19], traditional diversity is
defined by gender, race, nationality, age, and demographic differences, but from a
new perspective, diversity is defined in a broader context, including concepts of
"diversity of thought" that also address people with autism and other cognitive
differences.
In a collaborative, diverse environment, inclusion can then be defined as follows:
the individual is treated as an insider and is also allowed and encouraged to
retain uniqueness within the work group. Within the SELI project, we see the
learning environment as an "ecosystem", that is, the union of individuals or
services with an environment in which different interactions occur. From a
technical perspective, blockchain is the platform that supports these interactions
occurring in a transparent and secure way. From a social perspective, blockchain is
the environment that allows inclusion while preserving individuality (without
intermediaries). Therefore, within the SELI project, blockchain as an inclusive
digital ecosystem is seen from two perspectives [11]: (1) From an infrastructure
perspective, blockchain is a useful tool to support the ecosystem of services (for
example, content authoring services, LMS, CMS, recommendation services, and
learning analytics services). Blockchain provides a distributed platform for
transactions between these services with secure identification. This is the more
traditional use of blockchain, as a secure environment for transactions; several
educational projects follow this line, using blockchain for certificate issuance,
for example. (2) From the social perspective of inclusion, blockchain democratizes
education, giving possibilities, voice and value to each student and teacher. Our
contribution in this direction is to encourage the use of blockchain by supporting
storytelling as a tool for social interaction. Perret-Clermont [20], based on
Piaget's work, focused on the influence of social interactions on cognitive
development, with the assumption that learning takes place within each individual
but depends on social exchanges, assigning interactions a major role in the
cognitive development of the subject. We are of the opinion that digital learning
ecosystems must tear down the boundaries of current education, one of them being
the physical limits imposed on possible interactions. In this sense, a distributed
environment like blockchain can be a great solution.
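The infrastructure perspective above, blockchain as a tamper-evident record of
transactions such as certificate issuance, can be illustrated with a toy
hash-chained ledger. This is a minimal sketch for illustration only; the class and
field names are assumptions, not the SELI implementation:

```python
import hashlib
import json


class ToyLedger:
    """Minimal hash-chained ledger: each record (e.g. a course
    certificate) is sealed by the hash of the previous block."""

    def __init__(self):
        # Fixed genesis block anchors the chain.
        self.chain = [{"index": 0, "payload": "genesis", "prev_hash": "0" * 64}]

    @staticmethod
    def _hash(block):
        # Canonical JSON so the hash is independent of key order.
        return hashlib.sha256(
            json.dumps(block, sort_keys=True).encode()
        ).hexdigest()

    def append(self, payload):
        """Record a new payload, e.g. {"student": ..., "course": ...}."""
        prev = self.chain[-1]
        block = {
            "index": prev["index"] + 1,
            "payload": payload,
            "prev_hash": self._hash(prev),
        }
        self.chain.append(block)
        return block

    def is_valid(self):
        # Tampering with any earlier block breaks every later prev_hash.
        return all(
            self.chain[i]["prev_hash"] == self._hash(self.chain[i - 1])
            for i in range(1, len(self.chain))
        )
```

Because each block embeds the hash of its predecessor, altering any recorded
certificate invalidates every later link, which is what makes such a ledger a
trustworthy shared record without intermediaries.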

2.2 Digital Storytelling as an Active Pedagogy in Inclusive Education


As we discussed in our previous paper, the dialogue among scholars from diverse
disciplines brought to the fore the use of workshop-based digital storytelling
rather than tool-based digital storytelling [12]. Creating a learning environment
and habits of co-creative processes is important for both students and educators in
the SELI ecosystem. We aim to provide this holistic approach by implementing the
following six phases of the well-known workshop-based digital storytelling model
defined by Lambert [13], which were originally designed as face-to-face workshops
(Fig. 1).

Fig. 1. The six phases of workshop-based digital storytelling by Lambert [13].

We use workshop-based digital storytelling to enhance teacher education with
inclusion in mind, as the process provides opportunities for teachers to reflect on
their practices for handling diversity in the classroom. The SELI team supports
teachers' professional development through workshops, allowing teachers to tell
stories based on their experiences to others who share an interest in working with
disadvantaged groups. The SELI ecosystem allows the creation of a community of
practice with a shared interest in inclusion, with relationships built through
discussions as well as stories of practice. The SELI ecosystem also provides a
storytelling tool for students, which allows teachers to transfer their experience
with workshop-based digital storytelling into their classrooms.

3 Design Science Research Methodology

Design science research addresses real-world problems in holistic and innovative
ways. According to Johannesson and Perjons [2], the design science framework
follows a feedback-loop process that includes explicating the problem, outlining
the artifact and defining requirements, designing and developing the artifact,
validating the artifact, and evaluating the artifact (Fig. 2). SELI's design and
creation of the learning ecosystem started by bringing together diverse
stakeholders from the EU and LAC to explicate the challenges of digital exclusion
and the inaccessibility of education for disadvantaged groups. As part of the
requirement definition, SELI discovered the needs and requirements of implementing
and integrating emerging pedagogies, methods and technologies, such as blockchain,
global sharing pedagogy, digital storytelling, flipped learning, and educational
games, through workshops and focus-group sessions with stakeholders and target
groups across the regions. The design and development of the smart learning
ecosystem follows an integrative process of agile, open and co-design approaches,
in which researchers, software developers, students and business experts
collaborate through several online meetings.
At the moment, we are in the validation phase of the design science research,
validating the learning ecosystem through workshops with teachers in different
forums such as conferences, seminars, and other strategic events.

Fig. 2. Design science research framework

4 Smart Learning Ecosystem

The SELI ecosystem provides a solution framework to improve the teaching-learning
process. As shown in Fig. 3, it is divided into four views: service bus, concept,
supporting infrastructure, and philosophical foundations. The service bus comprises
the general services, such as authoring, CMS, LMS, digital storytelling and
collaboration, among others, that support teachers and students in building and
consuming learning material. The concept view concerns the open licenses that
ensure community access to content. The supporting infrastructure comprises the
tools that support innovative educational technologies, such as blockchain to aid
the global sharing of pedagogy, microsites to ensure accessible content, and
analytics to inform the learning process. Finally, the philosophical foundations
are the theoretical background on the emerging pedagogies, methods and technologies
that permeate the SELI project. Some methodological practices in education
presuppose a collaborative perspective to enrich the teaching and learning process.
Pedagogical technology supports strategies capable of mobilizing teachers and
students at different times and in different spaces from a collaborative
perspective, such as shared pedagogy. Universal design sets principles for creating
products, services and spaces that can be used by all people, regardless of age,
size, skills or disability. The idea is not to create specific items but to include
both people with disabilities and those without. In this context, Universal Design
for Learning (UDL) aims to guarantee access to learning for all students in the
school context by offering multiple and varied ways of organizing and making
scientific knowledge available [16]. Universal design is the element of
accessibility behind the services offered by the service bus view.

Fig. 3. Smart learning ecosystem

In the following sections we present the main aspects of the SELI ecosystem.

4.1 Authoring Services and Microsites


The authoring service offers teachers resources for creating and adapting media for
the construction of digital didactic material (with or without accessibility
features). In addition to supporting the teaching material itself, the authoring
tool aids the construction of accessible material for specific disabilities, and
the teacher can verify the accessibility rate of the didactic content built for a
specific disability. It allows teachers to create lesson strategies suited, for
example, to the specific needs of students with disabilities. It is also possible
to choose an instructional design to facilitate and guide the pedagogical strategy.
The teacher can insert different media, such as text, images, links, videos and
audio, created by him/her or downloaded from the internet under open-access
licenses, for example. Each of these media items can meet the accessibility
criteria set by the literature and the W3C [14], so teachers may add accessibility
features such as descriptive text, audio description and sign language to improve
the inclusion of digitally disadvantaged groups. When the teacher creates and
publishes a course, this is done through microsites linked to the CMS, which give
the student access to the course.
The microsite is a web page that presents the course proposed by the teacher with
all its didactic content, independent of the authoring tool; it should execute in
the same way on different architectures and display correctly on diverse devices.
It provides the student with content presentation, such as material selected for
the class (previously selected text readings, video lessons and/or podcasts) about
the concepts to be learned. In addition, it may provide practical activities that
explore skills acquisition, letting students verify how much they have understood
of the subjects presented so far. Seizing the dynamic features offered by
microsites, the content presented will encourage discussion and collaboration among
students through collaborative tools. With the microsite, the teacher can verify
the acquired skills by allowing students to demonstrate them, for example by asking
them to make a storytelling video about the concepts learned.
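The "rate of accessibility" check mentioned above can be sketched as a simple
ratio of media items that carry the features required for a given disability
profile. Everything here (profile names, required features, field names) is an
illustrative assumption, not the SELI authoring tool's actual data model:

```python
# Illustrative sketch: estimate how accessible a lesson's media items are
# for a given disability profile.
REQUIRED_FEATURES = {
    "visual": {"image": {"alt_text"}, "video": {"audio_description"}},
    "hearing": {"video": {"captions", "sign_language"}, "audio": {"transcript"}},
}


def accessibility_rate(media_items, profile):
    """Share of media items that carry every feature required
    for the given disability profile (1.0 = fully accessible)."""
    required = REQUIRED_FEATURES[profile]
    # Only media kinds that need adaptation for this profile count.
    relevant = [m for m in media_items if m["kind"] in required]
    if not relevant:
        return 1.0  # nothing to adapt for this profile
    ok = sum(
        1 for m in relevant
        if required[m["kind"]] <= set(m.get("features", []))
    )
    return ok / len(relevant)
```

A tool reporting this rate per profile lets the teacher see at a glance which
accessibility features are still missing before publishing a course.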

4.2 Learning Management System (LMS) and Content Management System (CMS) Services

The CMS service supports students in gaining access to different courses, at
different learning levels and in different languages, created through the authoring
service and made available by teachers connected to the ecosystem across different
countries and cultures. The LMS service allows the teacher not only to offer
courses but also to provide exercises and activities of different types, such as
quizzes and storytelling, as well as tests and assessments. The teacher can also
track student performance through the learning analytics component.

4.3 Digital Storytelling Service


Although there are several ways of creating digital storytelling, the SELI
ecosystem encourages users to create first-person narratives based on their
experiences, combining their images with recorded voice. We implemented each of the
six phases of the workshop-based digital storytelling framework [13] within the
SELI ecosystem as an innovative solution. The digital storytelling service in the
SELI ecosystem provides a simple story flow in which the user can add scenes and
change their order, as shown in Fig. 4. Users are given the option to make their
stories public; since some stories may involve sensitive content, asking for the
user's consent is essential for ethical reasons. For each scene, users can write a
short description, upload an image, and record a voice-over for that image, as
shown in Fig. 5. Combining these audio and visual elements, the user can preview
the story and share it in social networks and classroom activities.

Fig. 4. Flow of digital story on the SELI ecosystem

Fig. 5. Features of digital storytelling scene
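The scene flow described in this section can be sketched as a small data structure
with an explicit consent flag and a reorder operation. Class and field names are
illustrative assumptions, not the SELI implementation:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scene:
    description: str
    image: Optional[str] = None   # path or URL of the uploaded image
    audio: Optional[str] = None   # recorded voice-over for this scene


@dataclass
class Story:
    title: str
    public: bool = False          # explicit user consent before sharing
    scenes: list = field(default_factory=list)

    def add_scene(self, scene):
        self.scenes.append(scene)

    def move_scene(self, src, dst):
        """Reorder the flow by moving the scene at position src to dst."""
        self.scenes.insert(dst, self.scenes.pop(src))
```

Keeping `public` off by default mirrors the consent step: a story is only shared
once the author opts in.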

4.4 Learning Analytics


The main goal of learning analytics is to support students in the learning process;
moreover, in the SELI ecosystem, the learner and the teacher are living,
interacting entities. Thus, the teacher requires help to support the learner along
the learning path. The tool will help predict risks in the learning process and
give suggestions to improve learning. The process, according to the framework
proposed by Chatti et al. [15], takes three stages: data collection and
preprocessing; analytics and action; and post-processing. These three stages
iterate over time.
During the first stage, the first task is to identify the roles targeted by the
analysis, followed by identifying the indicators to be monitored in order to
evaluate the learning process, assess its effectiveness, and provide feedback to
teachers and students. In the SELI ecosystem, the first stage is the one currently
in progress; in this stage, the roles targeted for analysis are students and
teachers.
In Fig. 6, we present the general architecture of the learning analytics component.
The data is collected from the service bus view components. These components follow
the microsite infrastructure, where events are trapped to feed the ecosystem memory
with raw event data related to each indicator. The Ecosystem Memory concept
comprises the databases across the ecosystem's services and tools. Event capture is
a requirement to be implemented inside each service in order to feed the memory
with data on the user events and behavior detected for each indicator. The capturer
is a client-side JavaScript event handler that follows W3C standards. The data
collector interface gathers all data sent by the service component side (tool) and
feeds the corresponding part of the ecosystem memory. The ecosystem memory is not
as simple as depicted in Fig. 6; it is evolving during the development and testing
process (the current stage of the project). Our memory repository consists of
MongoDB databases and the file system, but it is open to other technologies such as
PostgreSQL in the future. The ETL is implemented with ToroDB, which produces a
database living in PostgreSQL. After the ETL guided by ToroDB, the SELI team cleans
the data manually; however, we are working on automating this task with scripts.
The automation requires maturity in understanding the raw data gathered and the way
ToroDB builds the PostgreSQL database. The analysis techniques will be statistics
and information visualization. Techniques related to classification and clustering
will be discussed and implemented in the future, when the ecosystem has gathered a
large amount of data.

Fig. 6. Learning analytics infrastructure
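The capture-and-collect pipeline described above can be sketched with a toy
in-memory stand-in for the ecosystem memory (the real stack uses MongoDB, ToroDB
and PostgreSQL). The names, roles and event fields below are illustrative
assumptions, not the project's actual schema:

```python
import time
from collections import defaultdict


class EcosystemMemory:
    """Toy stand-in for the raw-event store: events are grouped
    per indicator, as the analytics stage expects."""

    def __init__(self):
        self._events = defaultdict(list)

    def feed(self, indicator, event):
        self._events[indicator].append(event)

    def events_for(self, indicator):
        return list(self._events[indicator])


def capture(memory, role, indicator, payload):
    """Client-side capturer analogue: wraps a UI event with metadata
    (role, timestamp) and sends it to the data collector."""
    memory.feed(indicator, {
        "role": role,          # e.g. "student" or "teacher"
        "ts": time.time(),
        "payload": payload,
    })
```

Grouping raw events by indicator at ingestion time keeps the later statistics and
visualization stages simple: each indicator's time series can be read back
directly.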

5 Discussion and Conclusion

New media have become permanently integrated into the learning and teaching
process. However, this simple statement has many important implications. The
transformation mainly concerns opportunities related to increasing the
effectiveness of learning and social inclusion [17]. Undoubtedly, new technologies
make it possible to cross many borders: not only territorial restrictions, but also
those resulting from disability or belonging to disadvantaged groups. Pedagogy, as
a science of educational ideals, can now use the potential of new technologies to
pursue its highest objectives related to social inclusion. The SELI platform is an
example of such synergy between the social sciences and new technologies.

The digital storytelling and flipped learning used in the SELI ecosystem show the
possibility of symmetry in the transfer of knowledge and skills. The openness of
the platform creates an opportunity to combine fragmented content from both
professional sources and sources outside higher education. Flipped learning also
involves activation methods that use new technologies to allow effective
interaction with everyday content, transferring classical didactic methods into the
digital space to exchange experiences. This is especially important considering
that participation in SELI brings together people with different cultural and
organizational experiences. Therefore, in the text, the authors repeatedly refer to
the concept of "smart". This keyword expresses the flexibility of education,
manifested in openness (not only to integrating different contents into a whole),
the absence of borders for people with disabilities, and the possibilities offered
by the mix of technology and pedagogy.
The SELI platform is a learning environment that exploits the potential of fast
data collection and transfer. The pedagogy of sharing is also exemplified in the
effective use of digital storytelling. This inconspicuous and rarely used technique
has extraordinary potential. The SELI platform has implemented the possibility of
collecting valuable research and teaching material for almost every course, drawing
on the shared experiences of the platform's users. Based on the collected stories
relating to courses such as the prevention of cyberbullying or preparation for
being an educator of excluded people, a powerful database of cases is built up
[18]. Based on the experiences and stories of cyberbullying, it is possible to
redefine the content of online courses or to use archived cases (in the form of
written digital stories or recordings) to learn from other people's biographies.
Besides, digital inclusion cases (e.g., the didactic failures of trainers) provide
an opportunity to combine digital storytelling with the flipped classroom method.
The presented SELI ecosystem has several important perspectives. It offers a
perspective of knowledge, skills and biographical-experience transfer between
selected European and Latin American countries. The SELI ecosystem also makes it
possible to quickly connect and refer to distributed data and to authenticate the
effects of didactic activities (certification through blockchain). The wisdom of
the described solution lies primarily in broadly understood inclusiveness, i.e.,
inclusion regardless of physical, linguistic, national or age restrictions. Within
the SELI ecosystem, there is also a perspective of scientific research, didactic
activities, exchange of experiences, transfer of values and, above all, the
construction of wise solutions, i.e., solutions that keep up with the universal
needs of learners.

Acknowledgement. This work was supported by the ERANET-LAC project which has
received funding from the European Union’s Seventh Framework Programme. Project Smart
Ecosystem for Learning and Inclusion - ERANet17/ICT-0076SELI, including funding from
FAPESP, 2018/04085-4.

References
1. Martins, V., Oyelere, S.S., Tomczyk, L., Barros, G., Akyar, O., Eliseo, M.A., Amato, C.A.
H., Silveira, I.F.: A blockchain microsites-based ecosystem for learning and inclusion. In:
Brazilian Symposium on Computers in Education (Simpósio Brasileiro de Informática na
Educação-SBIE), Brazil, pp. 229–238 (2019)
2. Johannesson, P., Perjons, E.: A Design Science Primer. Springer, Heidelberg (2014)
3. Martins, V.F., Amato, C.A.H., Eliseo, M.A., Silva, C., Herscovici, M.C., Oyelere, S.S.,
Silveira, I.F.: Accessibility recommendations for creating digital learning material for
elderly. In: 2019 XIV Latin American Conference on Learning Objects and Technology
(LACLO). IEEE (2019)
4. Martins, V.F., Amato, C.A.H., Ribeiro, G.R., Eliseo, M.A.: Desenvolvimento de Aplicações
Acessíveis no Contexto de Sala de Aula da Disciplina de Interação Humano-Computador.
Revista Ibérica de Sistemas e Tecnologias de Informação E17, 729–741 (2019)
5. Martins, V.F., Amato, Souza, A.G., Sette, G.A., Ribeiro, G.R., Amato, C.A.H.: Material
digital Acessível Adaptado a partir de um Livro Didático Físico: Relato de Experiência,
Revista Ibérica de Sistemas e Tecnologias de Informação (2020, in Press)
6. Silveira, I.F., Ochoa, X., Cuadros-Vargas, A.J., Casas, A.H.P., Casali, A., Ortega, A.,
Sprock, A.S., Silva, C.H.A., Ordóñez, C.A.C., Deco, C., Cuadros-Vargas, E., Knihs, E.,
Parra, G., Muñoz-Arteaga, J., Santos, J.G., Broisin, J., Omar, N., Motz, R., Rodés, V.,
Bieliukas, Y.C.H.: A digital ecosystem for the collaborative production of open textbooks:
the LATIn methodology. J. Inf. Technol. Educ.: Res. 12(1), 225–249 (2013)
7. Briscoe, G., De Wilde, P.: Digital ecosystems: evolving service-orientated architectures. In:
Proceedings of BIONETICS 2006, the 1st International Conference on Bio-Inspired Models
of Network, Information and Computing Systems. ACM, New York (2006)
8. Boley, H., Chang, E.: Digital ecosystems: principles and semantics. In: Proceedings of the
2007 Inaugural IEEE Conference on Digital Ecosystems and Technologies, pp. 1–6 (2007)
9. Mendoza, J.E.G., Arteaga, J.M., Rodriguez, F.J.A.: An architecture oriented to digital
literacy services: an ecosystem approach. IEEE Lat. Am. Trans. 14(5), 2355–2364 (2016)
10. Burns, C., Dolan, J.: Building a foundation for digital inclusion: a coordinated local content
ecosystem. Innov.: Technol. Gov. Global. 9(3–4), 33–42 (2014)
11. Oyelere, S.S., Tomczyk, L., Bouali, N., Agbo, F.J.: Blockchain technology and gamification
– conditions and opportunities for education. In: Veteška, J. (ed.) Adult Education 2018 –
Transformation in the Era of Digitization and Artificial Intelligence. Andragogy Society,
Prague, ISBN 978-80-906894-4-2 (2019)
12. Tomczyk, L., Oyelere, S.S., Puentes, A., Sanchez-Castillo, G., Muñoz, D., Simsek, B.,
Akyar, O.Y., Demirhan, G.: Flipped learning, digital storytelling as the new solutions in
adult education and school pedagogy. In: Veteška, J. (ed.) Adult Education 2018 –
Transformation in the Era of Digitization and Artificial Intelligence. Czech Andragogy
Society, Prague, ISBN 978-80-906894-4-2 (2019)
13. Lambert, J.: Digital Storytelling: Capturing Lives, Creating Community. Routledge,
Abingdon (2013)
14. W3C: Web content accessibility guidelines 2.1 (2018). https://www.w3.org/TR/WCAG21/.
Accessed 11 Oct 2019
15. Chatti, M.A., Dyckhoff, A.L., Schroeder, U., Thüs, H.: A reference model for learning
analytics. Int. J. Technol. Enhanced Learn. (IJTEL) 4(5/6), 318–331 (2012)
16. CAST: Universal design for learning. http://www.cast.org. Accessed 20 Dec 2019

17. Tomczyk, Ł., Eliseo, M.A., Costas, V., Sánchez, G., Silveira, I.F., Barros, M.J., Amado-
Salvatierra, H.R., Oyelere, S.S.: Digital divide in Latin America and Europe: main
characteristics in selected countries. In: 14th Iberian Conference on Information Systems and
Technologies (CISTI), pp. 1–6. IEEE (2019)
18. Tomczyk, Ł., Włoch, A.: Cyberbullying in the light of challenges of school-based
prevention. Int. J. Cognit. Res. Sci. Eng. Educ. (IJCRSEE) 7(3), 13–26 (2019)
19. Bourke, J., Garr, S., Berkel, A., Wong, J.: Diversity and inclusion: the reality gap-2017
Global Human Capital Trends (2017)
20. Perret-Clermont, A.-N., et al.: La construction de l'intelligence dans l'interaction sociale
(1996)
Aggregation Bias: A Proposal to Raise
Awareness Regarding Inclusion
in Visual Analytics

Andrea Vázquez-Ingelmo¹, Francisco J. García-Peñalvo¹, and Roberto Therón¹,²

¹ GRIAL Research Group, Computer Sciences Department,
Research Institute for Educational Sciences, University of Salamanca, Salamanca, Spain
{andreavazquez,fgarcia,theron}@usal.es
² VisUSAL Research Group, University of Salamanca, Salamanca, Spain

Abstract. Data is a powerful tool for making informed decisions. It can be
used to design products, to segment the market, and to design policies. However,
trusting data too much can have its drawbacks. Sometimes a set of indicators can
conceal the reality behind them, leading to biased decisions that could be very
harmful, for example, to underrepresented individuals. Ensuring unbiased
decision-making processes is challenging because people have their own beliefs
and characteristics and may be unaware of them. However, visual tools can assist
decision-making processes and raise awareness regarding potential data issues.
This work describes a proposal to fight biases related to aggregated data by
detecting issues during visual analysis and highlighting them, in order to avoid
drawing inaccurate conclusions.

Keywords: Data bias · Information visualization · Data visualization ·
Inclusion awareness

1 Introduction

Information has grown in size and relevance over the last years; technology has not
only increased the generation of data but also their accessibility. People with an Internet
connection can consult a wide range of datasets about almost any topic: crime data,
healthcare data, weather data, financial data, etc.
These data can be employed to make informed decisions regarding different
domains. For example, businesses can employ demographic data to create personalized
advertisements or to segment the market. Governments can employ their data to design
new policies. Any person regularly uses data to make informed decisions. A simple
question like “should I get a coat to go out today?” can be answered through data
(made available by weather services) to make an informed decision that, in the end,
seeks some kind of benefit (in this case, the benefit of avoiding hypothermia).
However, delegating decisions solely to data might turn out to be a double-edged
sword. Data can not only be wrong or false but also incomplete, and making
decisions based on wrong data leads to wrong decisions. There are several cases in
which relying on the wrong data has provoked undesired results, mostly because of
data bias or even algorithmic bias [1–3].

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 409–417, 2020.
https://doi.org/10.1007/978-3-030-45697-9_40
It seems clear, then, that if the data used to make decisions are not well suited to
the problem, the resulting decisions will not be either. But how can people avoid such
pitfalls? Bias is generally introduced unconsciously, and it can be hard to detect our
own biases and remain aware of them while collecting data. For these reasons, data
should be thoroughly examined to identify gaps or inconsistencies before being used
in decision-making processes.
One of the most widely used approaches to easing the analysis and exploration of
datasets is visual analytics [4, 5]: through information visualizations, users can interact
with and explore datasets via visual marks that encode certain information [6]. However,
visualizations can hide data issues by shifting attention from the analysis of the raw
data to the discovered patterns. Patterns can be seen as shortcuts that tell us properties
of the data, for example, whether there are correlations among the visualized variables [7].
But visual analysis should not be reduced to identifying patterns and trusting them
blindly, because patterns can likewise lead to wrong conclusions [8].
This work describes a proposal for raising awareness during visual analysis,
helping users make informed decisions that take into account the flaws or potential
issues of their datasets. Specifically, it targets issues related to data aggregation,
which can be very harmful in data-driven decision-making processes. The main goal is
not only to improve decision-making but also to address inclusion problems when
dealing with data, as data biases can lead to decisions that (involuntarily or not)
discriminate against individuals.
The rest of this paper is organized as follows. Section 2 introduces some issues
related to data analysis and data aggregation. Section 3 describes the methodology
followed to design the proposal. Section 4 presents a proposal to raise awareness
during visual data analysis. Section 5 discusses the proposal, followed by Sect. 6, in
which the conclusions derived from this work are outlined.

2 Background

The outcomes of decision-making processes are actions that affect the context in which
decisions are being made. When deciding which action to take, the decision-maker has
an assumption about how the action will affect that context, looking for a benefit or a
pursued result. The critical fact, however, is that assumptions can be very personal and
can vary depending on the person's beliefs, background, domain knowledge, etc.
Even when decision-makers support their decisions with data (embracing data-driven
decision-making [9]), problems remain. As introduced before, data is not the holy
grail of decision-making: just as personal traits can influence the decision-maker, the
collected data and the performed analyses can be influenced by other harmful factors,
such as data biases [10] or poor analysis.
There are specific fields of study, like uncertainty visualization, that try to find
methods to visualize uncertain data, thus warning users regarding the uncertain nature
of the results they are consuming through their displays [11, 12]. However, uncertainty
visualization is complex, and several concepts, such as probabilities or densities, could
be difficult for non-technical or non-statistical audiences to understand, resulting in
users ignoring or misinterpreting uncertainty [13].
On the other hand, the data being visualized can present issues that are concealed and
not considered in information visualizations, such as excessive (or inappropriate)
aggregation levels, which can result in wrong conclusions.
Summary statistics condense a set of observations into a collection of values that
simplify the comprehension of a dataset. But this simplification comes at a price: while
computing these summaries, a lot of information can be lost. One of the most famous
examples of this drawback is Anscombe's quartet [14], in which datasets that tell very
different stories share the same mean and variance. Anscombe highlighted the
usefulness of graphics [14] in avoiding these issues (Fig. 1).

Fig. 1. Anscombe's quartet. The four datasets have the same mean and variance for the
variables represented on the X and Y axes.
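Anscombe's demonstration is easy to reproduce. The sketch below, using only Python's standard library, hard-codes two of the four datasets from his 1973 paper and confirms that their summary statistics are nearly identical, even though a plot reveals a noisy linear trend in one case and a smooth curve in the other (the variable names are ours, not from the paper):

```python
from statistics import mean, variance

# Two of Anscombe's four datasets (values from the 1973 paper).
# They share the same x values and, as printed below, near-identical
# summary statistics despite telling very different visual stories.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

for label, y in (("I", y1), ("II", y2)):
    print(f"dataset {label}: mean(y)={mean(y):.2f}  var(y)={variance(y):.2f}")
# Both datasets report mean(y) = 7.50 and var(y) = 4.13.
```

The same exercise works for the remaining two datasets of the quartet; only a plot exposes the structural differences.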

Aggregated data thus ease the analysis process, but they can lead to a loss of
information. Aggregated data can also be vulnerable to phenomena like the ecological
fallacy and Simpson's paradox [15].
Inferring individual behavior from aggregated data is a common extrapolation
mistake, where analysts might conclude that the behavior of a group also accurately
explains the behavior of the individuals within that group [16, 17].
Simpson's paradox is also related to the data aggregation level. In this case, there
might exist lurking variables that could entirely "change" the conclusions derived from
aggregated data [18, 19].
These aggregation-related issues can be very harmful if not taken into account [20],
especially if the audience is biased or not statistically trained (or both).
Some works have tried to address these aggregation drawbacks through detection
algorithms [21, 22], but few have tried to address them during visual exploration [23].
3 Methodology

The proposal focuses on how to draw attention to potential aggregation biases and
fallacies during visual analysis. A simple workflow has been devised to automatically
search for aggregation issues in the data being presented to the user, specifically issues
involving Simpson's paradox and the underrepresentation of categories.
Each categorical variable is considered a potentially influencing variable. Of
course, as will be discussed, this methodology is limited to the variables available
within the dataset: if the whole dataset has only a small set of categories, the results
will not be as useful as they could be with a richer dataset.
The workflow follows a naïve approach to detect Simpson's paradoxes [23]:
1. Every possible grouping, at every possible level of the categorical variables, is
computed to obtain a set of potential disaggregation variables.
2. When the user visualizes data, the current aggregation level is retrieved (i.e., the
categorical columns used to group the data).
3. These data are then grouped by the variables identified in the first step.
4. The results of the performed disaggregation are sorted and compared with the
trend of the original scenario (i.e., the aggregated data values).
5. If the disaggregation results differ from the originally aggregated results, Simpson's
paradox is flagged for the disaggregated attributes (a threshold can be defined to
specify what proportion of values need to differ from the original trend for the
paradox to be considered).
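The five steps above can be sketched in a few lines of Python. This is a simplified illustration of the naïve approach, not the authors' implementation: candidate disaggregation variables are passed in explicitly rather than enumerated from the dataset (step 1), the "trend" is reduced to the ordering of the groups' success rates, and the names `rate` and `simpson_candidates` are hypothetical.

```python
from collections import defaultdict

def rate(rows, key, outcome):
    """Success rate of the binary `outcome` per value of the categorical `key`."""
    hits, totals = defaultdict(int), defaultdict(int)
    for r in rows:
        totals[r[key]] += 1
        hits[r[key]] += r[outcome]
    return {k: hits[k] / totals[k] for k in totals}

def simpson_candidates(rows, group, outcome, candidates, threshold=0.5):
    """Flag each candidate variable under which the aggregate ordering of
    `group` rates reverses in at least `threshold` of the subgroups
    (steps 2-5 of the workflow above)."""
    agg = rate(rows, group, outcome)
    order = sorted(agg, key=agg.get)  # aggregate ordering, lowest rate first
    flagged = []
    for cand in candidates:
        levels = {r[cand] for r in rows}
        reversed_levels = 0
        for level in levels:
            sub_rates = rate([r for r in rows if r[cand] == level], group, outcome)
            # Only comparable when every group appears within the subgroup.
            if set(sub_rates) == set(order) and sorted(sub_rates, key=sub_rates.get) != order:
                reversed_levels += 1
        if levels and reversed_levels / len(levels) >= threshold:
            flagged.append(cand)
    return flagged

# Toy Berkeley-like data: women have the higher admission rate inside each
# department, but the lower rate once departments are aggregated away.
rows = []
def add(gender, dept, admitted, total):
    rows.extend({"gender": gender, "dept": dept, "admitted": 1} for _ in range(admitted))
    rows.extend({"gender": gender, "dept": dept, "admitted": 0} for _ in range(total - admitted))

add("male", "easy", 60, 80)    # 75% admitted
add("female", "easy", 16, 20)  # 80% admitted
add("male", "hard", 2, 20)     # 10% admitted
add("female", "hard", 16, 80)  # 20% admitted

print(rate(rows, "gender", "admitted"))                          # male 0.62 vs female 0.32
print(simpson_candidates(rows, "gender", "admitted", ["dept"]))  # ['dept']
```

Here the detector flags `dept` because both of its levels reverse the aggregate ordering, exceeding the default 0.5 threshold.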
However, even when visualizing the data disaggregated by the attributes identified in
the fifth step, aggregation issues may persist if the data are in turn summarized by a
function such as the mean, the mode, or ratios, since these functions can also distort
the reality of the data.
To avoid relying on aggregation functions, when the detected Simpson's paradoxes
are inspected, a sunburst diagram complements the display, giving information about
the raw sample sizes behind the disaggregated values.
Sunburst diagrams are usually employed to represent hierarchies; in this context,
they are useful for displaying how the number of observations of the variable being
inspected varies across the different disaggregation levels.
The primary purpose is to offer another perspective on the data, drawing attention
to potential underrepresentation or overrepresentation in datasets.
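The counts backing such a sunburst amount to a hierarchical group-by: one ring per disaggregation level, sized by raw observation counts. A minimal sketch in Python (the helper name `sunburst_counts` and the toy records are our own illustration; rendering the actual chart would be delegated to a visualization library):

```python
from collections import Counter

def sunburst_counts(rows, path):
    """Count raw observations at each ring of a sunburst, i.e. at every
    prefix of the disaggregation path (here: dept, then dept+gender)."""
    return [
        Counter(tuple(r[k] for k in path[:depth]) for r in rows)
        for depth in range(1, len(path) + 1)
    ]

rows = [
    {"dept": "A", "gender": "female"}, {"dept": "A", "gender": "male"},
    {"dept": "A", "gender": "male"},   {"dept": "B", "gender": "female"},
]
inner, outer = sunburst_counts(rows, ["dept", "gender"])
print(inner[("A",)], inner[("B",)])  # inner ring sizes: 3 1
print(("B", "male") in outer)        # False: a missing category surfaces
```

Because the rings carry raw counts rather than aggregated values, a category that is underrepresented, or absent altogether, becomes visible as a thin or missing wedge.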

4 Proposal

A simple proof of concept has been developed to illustrate the proposal. The test
data come from one of the most famous cases involving Simpson's paradox: the
UC Berkeley student admissions case reported in 1975 [24]. This dataset holds the
following information about each student: gender, the department to which the
application was submitted, and the result of the application (admitted or rejected).
If these data are aggregated by the gender variable, the results yield a significant
gender bias against women: only 35% of women were admitted, in contrast with 44% of
male applicants. These figures could lead decision-makers to design new policies to
address the discovered gender bias.
However, this high-level aggregation hides part of the picture. If the data are, in
turn, disaggregated by the department to which the application was submitted, a
different scenario emerges: the majority of the departments showed higher admission
rates for women than for men. What was happening is that women applied to more
competitive departments, while men submitted the majority of their applications to
departments with high admission rates (resulting in higher overall admission rates
among male students).
This case is a famous example of Simpson's paradox, but misleading conclusions
can arise in any context if these potential issues are not accounted for during data
analysis. For this reason, the interface presented in Fig. 2 is proposed.

Fig. 2. Interface proposal for detecting aggregation issues.

When the user is exploring her dataset, the Simpson's paradox detector searches
for potentially influential groupings that change the trend of the currently displayed
variables. If any grouping changes the trend, the categorical variables identified are
displayed (top section of Fig. 2).
The user can then click on each detected grouping to explore how the disaggregation
affects the value she was examining, alongside a sunburst diagram that shows the
distribution of occurrences of each observation under the selected grouping (bottom
section of Fig. 2). In this specific example, the user can observe that women applied
less to departments with high admission rates (like department A) and submitted more
applications to more competitive departments, obtaining a complete view of the
examined data.

5 Discussion

Aggregating data is useful for summarizing observations, but it can overlook crucial
aspects of the data, such as the underrepresentation or overrepresentation of samples.
Raising attention to this matter is essential, especially when studying behavioral data
or data that involve human beings.
How can this approach benefit decision-makers regarding inclusion-related issues?
Our biases can blind us and make us less prone to asking skeptical questions about the
analyzed data. If the data confirm something we believe, we might trust the results
without carrying out further analyses [25].
This approach forces analysts (or any kind of audience) to take a more in-depth
look at aggregated data, which can sometimes conceal underlying patterns or trends.
A deeper look is crucial when analyzing data in inclusion-related research contexts,
because it makes it possible to see whether aggregated results are due to the
overrepresentation of certain categories and to identify whether any category is
missing or not represented at all.
Not considering aggregation issues can strengthen the belief that "one size fits all",
which can lead to (involuntary) discrimination. If you design a product (an object, an
algorithm, a policy, a treatment, etc.) for "people" and you use data that represent only
a particular portion of people, or that do not bring attention to their differing
characteristics, you end up with a product personalized for a segment. There is nothing
wrong with personalized products; what is wrong is to think that this unconsciously
personalized product is universal and should fit every individual.
The underrepresentation of certain categories depends, of course, on the data
context. For example, in the Berkeley dataset, the underrepresentation of women's
applications to some departments is due to the students' preferences for specific
departments. But there are other cases in which underrepresentation is due to selection
bias or non-representative sampling of the population. It is essential to take this into
account to avoid data bias against minorities (or even against non-minorities, like
women [20]).
A proposal for visually identifying aggregation issues (especially those related to
Simpson's paradox) has been developed. Of course, this proposal does not try to
replace statistical methods; rather, it delivers a visual tool to better understand our
datasets.
The proposal has focused on raising awareness of how disaggregating data can
change the patterns identified during the analysis of aggregated data. It could also be
used as an educational tool, teaching people through a friendly interface about the
underlying issues of data aggregation and their dangerous effects on decision-making
processes.
Educating people in data skepticism and about potential biases is important
because data visualizations can be very persuasive and can influence people's beliefs.
Relying on data visualization tools to raise awareness can be powerful, given the
possibility of presenting information in understandable ways and of enabling
individuals to freely interact with the data [26, 27].
The methodology searches for subgroups that "change" the original scenario (i.e.,
the trends identified in the aggregated data). It is important to mention that statistical
significance has not been considered here, because the main goal was to draw attention
to changes in visual patterns, no matter how small. However, complementing this
methodology with the computation of statistical significance could be more powerful
in some contexts [23].
Statistically trained audiences might be aware of these issues. However, other
audiences could reach wrong insights about the data if attention is not drawn to the
potential issues, thus distorting the decision-making process without even noticing.
For example, when dealing with policies that affect individuals, it is crucial to rely
on disaggregated data to avoid ignoring the needs of minorities [28–30].
But when talking about disaggregated data, there are some limitations to take into
account. Demographic variables are meaningful for inclusion-related research contexts,
but they are also sensitive. Some of these variables can be difficult to collect because of
privacy policies or privacy concerns.
In fact, for some activities, such as hiring, having such data available could
introduce the risk of biasing the decisions made during some phases of the process
[31, 32]. Analysts and decision-makers must therefore understand the level of analysis
and its goals in order to anonymize or omit these attributes accordingly.
To sum up, it is important to foster critical thinking and a degree of skepticism
toward data. When dealing with information about individuals, accounting for data
gaps is a responsibility, because the decisions made can have a high impact in the
context of application, and sometimes that impact is not beneficial for everyone.

6 Conclusions

This work presents a proposal for raising awareness in decision-making processes
through visual analysis. Relying on inappropriate data can lead to wrong decisions,
but identifying flaws in data is not a trivial task: bias, beliefs, and uncertainty can show
up both at data collection time and at analysis time, resulting in distorted insights.
Through the detection of existing Simpson's paradoxes and the disaggregation of the
displayed data, the presented proposal tries to draw attention to issues such as excessive
or inappropriate aggregation levels and the potential overrepresentation or
underrepresentation of data attributes or categories.
Future work will involve the evaluation and refinement of the proposal to improve
its effectiveness, with the aim of obtaining a tool that raises awareness about inclusion
in different fields.

Acknowledgments. This research work has been supported by the Spanish Ministry of Edu-
cation and Vocational Training under an FPU fellowship (FPU17/03276). This work has been
partially funded by the Spanish Government Ministry of Economy and Competitiveness
throughout the DEFINES project (Ref. TIN2016-80172-R) and the Ministry of Education of the
Junta de Castilla y León (Spain) throughout the T-CUIDA project (Ref. SA061P17).

References
1. Sweeney, L.: Discrimination in online ad delivery. arXiv preprint arXiv:1301.6822 (2013)
2. Garcia, M.: Racist in the machine: the disturbing implications of algorithmic bias. World
Policy J. 33, 111–117 (2016)
3. Hajian, S., Bonchi, F., Castillo, C.: Algorithmic bias: from discrimination discovery to
fairness-aware data mining. In: Proceedings of the 22nd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, pp. 2125–2126. ACM (2016)
4. Keim, D.A., Andrienko, G., Fekete, J., Görg, C., Kohlhammer, J., Melançon, G.: Visual
analytics: definition, process, and challenges. In: Kerren, A., Stasko, J., Fekete, J., North, C.
(eds.) Information Visualization, pp. 154–175. Springer, Heidelberg (2008)
5. Thomas, J.J., Cook, K.A.: Illuminating the path: the research and development agenda for
visual analytics. National Visualization and Analytics Center, USA (2005)
6. Munzner, T.: Visualization Analysis and Design. AK Peters/CRC Press, Boca Raton (2014)
7. Harrison, L., Yang, F., Franconeri, S., Chang, R.: Ranking visualizations of correlation using
weber’s law. IEEE Trans. Visual Comput. Graph. 20, 1943–1952 (2014)
8. O’Neil, C.: On Being a Data Skeptic. O’Reilly Media, Inc., Newton (2013)
9. Patil, D., Mason, H.: Data Driven. O’Reilly Media Inc, Newton (2015)
10. Shah, S., Horne, A., Capellá, J.: Good data won’t guarantee good decisions. Harvard Bus.
Rev. 90, 23–25 (2012)
11. Bonneau, G.-P., Hege, H.-C., Johnson, C.R., Oliveira, M.M., Potter, K., Rheingans, P.,
Schultz, T.: Overview and state-of-the-art of uncertainty visualization. In: Scientific
Visualization, pp. 3–27. Springer, Heidelberg (2014)
12. Brodlie, K., Osorio, R.A., Lopes, A.: A review of uncertainty in data visualization. In:
Expanding the Frontiers of Visual Analytics and Visualization, pp. 81–109. Springer,
Heidelberg (2012)
13. Uncertainty visualization explained. https://medium.com/multiple-views-visualization-research-explained/uncertainty-visualization-explained-67e7a73f031b
14. Anscombe, F.J.: Graphs in statistical analysis. Am. Stat. 27, 17–21 (1973)
15. Pollet, T.V., Stulp, G., Henzi, S.P., Barrett, L.: Taking the aggravation out of data
aggregation: a conceptual guide to dealing with statistical issues related to the pooling of
individual-level observational data. Am. J. Primatol. 77, 727–740 (2015)
16. Kramer, G.H.: The ecological fallacy revisited: aggregate-versus individual-level findings on
economics and elections, and sociotropic voting. Am. Polit. Sci. Rev. 77, 92–111 (1983)
17. Piantadosi, S., Byar, D.P., Green, S.B.: The ecological fallacy. Am. J. Epidemiol. 127, 893–904 (1988)
18. Blyth, C.R.: On Simpson’s paradox and the sure-thing principle. J. Am. Stat. Assoc. 67,
364–366 (1972)
19. Wagner, C.H.: Simpson’s paradox in real life. Am. Stat. 36, 46–48 (1982)
20. Perez, C.C.: Invisible Women: Exposing Data Bias in a World Designed for Men. Random
House, New York (2019)
21. Alipourfard, N., Fennell, P.G., Lerman, K.: Can you trust the trend?: discovering Simpson’s
paradoxes in social data. In: Proceedings of the Eleventh ACM International Conference on
Web Search and Data Mining, pp. 19–27. ACM (2018)
22. Xu, C., Brown, S.M., Grant, C.: Detecting Simpson’s paradox. In: The Thirty-First
International Flairs Conference (2018)
23. Guo, Y., Binnig, C., Kraska, T.: What you see is not what you get!: detecting Simpson's
paradoxes during data exploration. In: Proceedings of the 2nd Workshop on Human-In-the-
Loop Data Analytics, p. 2. ACM (2017)
24. Bickel, P.J., Hammel, E.A., O’Connell, J.W.: Sex bias in graduate admissions: data from
Berkeley. Science 187, 398–404 (1975)
25. Nickerson, R.S.: Confirmation bias: a ubiquitous phenomenon in many guises. Rev. Gen.
Psychol. 2, 175–220 (1998)
26. Hullman, J., Adar, E., Shah, P.: The impact of social information on visual judgments. In:
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems,
pp. 1461–1470. ACM (2011)
27. Kim, Y.-S., Reinecke, K., Hullman, J.: Data through others’ eyes: the impact of visualizing
others’ expectations on visualization interpretation. IEEE Trans. Visual Comput. Graph. 24,
760–769 (2018)
28. Mills, E.: ‘Leave No One Behind’: Gender, Sexuality and the Sustainable Development
Goals. IDS (2015)
29. Stuart, E., Samman, E.: Defining “leave no one behind”. ODI Briefing Note. London: ODI
(www.odi.org/sites/odi.org.uk/files/resource-documents/11809.pdf) (2017)
30. Abualghaib, O., Groce, N., Simeu, N., Carew, M.T., Mont, D.: Making visible the invisible:
why disability-disaggregated data is vital to “leave no-one behind”. Sustainability 11, 3091
(2019)
31. Rice, L., Barth, J.M.: Hiring decisions: the effect of evaluator gender and gender stereotype
characteristics on the evaluation of job applicants. Gend. Issues 33, 1–21 (2016)
32. Alford, H.L.: Gender bias in IT hiring practices: an ethical analysis (2016)
A Concrete Action Towards Inclusive
Education: An Implementation
of the Marrakesh Treaty

Virginia Rodés¹ and Regina Motz²

¹ Comisión Sectorial de Enseñanza and Núcleo de Recursos Educativos Abiertos
y Accesibles, Universidad de la República, Montevideo, Uruguay
[email protected]
² Facultad de Ingeniería and Núcleo de Recursos Educativos Abiertos
y Accesibles, Universidad de la República, Montevideo, Uruguay
[email protected]

Abstract. This paper presents the experience gained from the implementation
of an accessible digital library, containing collections of textbooks and open
educational resources, as the institutional repository of the university, carried
out in the framework of the Marrakesh Treaty during 2018–2019.

Keywords: Accessibility · Open educational resources · Repository

1 Introduction

As stated in its declaration, the goal of the Marrakesh Treaty (MT) is to "Facilitate
Access to Published Works for Persons Who Are Blind, Visually Impaired or Otherwise
Print Disabled". The treaty was adopted on June 27, 2013 in Marrakesh and forms part
of the body of international copyright treaties administered by WIPO [1, 3]. It allows
for copyright exceptions to facilitate the creation of accessible versions of books and
other copyrighted works for visually impaired persons, and it sets a norm for ratifying
countries to have a domestic copyright exception covering these activities and allowing
for the import and export of such materials.
Sixty-three countries had signed the treaty as of the close of the diplomatic
conference in Marrakesh. The ratification of 20 states was required for the treaty to
enter into force; the 20th ratification was received on 30 June 2016, and the treaty
entered into force on 30 September 2016. The European Union ratified the treaty for all
28 members on October 1, 2018. The MT is not a self-executing instrument but a norm
that obliges the ratifying states to adapt their laws, so each country must implement it
before it starts functioning within that jurisdiction. The International Federation of
Library Associations and Institutions (IFLA) periodically reviews whether governments
have passed the necessary national laws to make Marrakesh a reality; the monitoring
reports can be found at https://www.ifla.org/publications/node/81925.
One of the key aspects of the MT is that it depends on the creation of international
networks with agile processes for the production and exchange of accessible copies,
trying to avoid duplication of efforts.

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 418–425, 2020.
https://doi.org/10.1007/978-3-030-45697-9_41

The greatest innovation of Marrakesh lies in the establishment of an international
regime for the cross-border transfer of copies in accessible formats, facilitating
exchange and strengthening the efficiency of the entities authorized to produce and
distribute this type of work. These institutions must have a clear regulatory framework,
compatible with that of the entities in other countries, allowing, for example, the
federation of repositories of accessible digital resources.
Nowadays it is common for each university to have an institutional repository of
digital collections that safeguards the university's production, with the purpose of
preserving its memory, making it available to society as a whole, and giving it visibility
and dissemination in order to promote the emergence of new knowledge [8]. Moreover,
following the Open Education movement, most university digital repositories are
regulated by an open access policy in accordance with the Budapest Open Access
Initiative [12]: "All scientific knowledge must be freely available through the internet
to be used for any legal purpose by any person, provided that the author is guaranteed
the integrity of his/her work and the right to be duly recognized and cited". Our
university holds the author's economic rights for all the resources produced at the
university as a result of teaching, research, extension, and management activities [2].
The works deposited in this repository are released under one of the six Creative
Commons licenses, at the author's choice. The collections include theses
(undergraduate and graduate dissertations, monographs or projects, and dissertations
containing research submitted to obtain a degree), eBooks, presentations made at
conferences, research articles published in national and international journals, technical
reports, project reports, and audiovisuals (including radio, film, television, video, or
any other production that includes images and/or recorded sound).
As in most universities, the repository is implemented on DSpace software, and
Dublin Core standard metadata are used to retrieve accurate and effective information.
With the copyright barrier lifted by the Marrakesh Treaty, libraries and specialized
organizations are authorized to generate and provide interested parties with the
appropriate digital version for their use, enabling the development of accessible digital
content, as pointed out by Hilera and Campo [13].
This paper describes the decisions taken to expand our university repository with
open educational resources and a collection of accessible textbooks for people with
disabilities, in the framework of the Marrakesh Treaty.

2 Open Educational Resource Repository

The need for institutional OER repositories is mainly based on the objective of
achieving the sustainability of resources, allowing the necessary changes to be managed
so that a resource remains usable and can be located and referenced in an easy and
precise way. These are good reasons for institutions to invest in OER repositories, in
addition to the reputational benefit an institution gains from the number of times an
open resource generated by its teachers is reused and from where it is accessed, which
also measures the national, regional, or international reach of the institution.

On the other hand, it is necessary to emphasize that many valuable educational
resources are generated and used in virtual learning environments such as Moodle.
However, Moodle, like any other LMS, has a strongly centralized approach to material
management: the materials and resources used in a course become inaccessible once
the course is finished. Considerable effort is then required to fully export them with
adequate semantic annotations that allow their subsequent recovery.
The first thing to note is that all material contained in an institutional repository,
especially if the repository is governed by an open publication policy, can feasibly be
used as an OER. However, this is not enough to make it an OER repository. To be a
real OER repository, it is not enough to simply host an OER collection; the repository
must treat the material according to its specific OER characteristics. The key
characteristics are two: (i) the educational resources must be modifiable and reusable,
and (ii) they must have metadata or descriptions of the properties that identify the
educational use of the resource, for example, the topic, the degree of difficulty, the
previous knowledge required, etc.
In order to contain OERs that are modifiable and reusable, as a minimum
requirement a repository must adopt open-support policies. This allows not only access
to the resources but also their subsequent manipulation and modification. At the same
time, it is desirable that the repository put mechanisms in place to ensure that authors
use the appropriate licenses when reusing material.
The ability to manage educational meta-data requires modifying the meta-data of the repository's resources to address the OER domain. The two most commonly used standard proposals for OER meta-data are currently LOM and Dublin Core. LOM was designed by the IEEE to describe OERs, including pedagogical aspects such as teaching style or level of interactivity. Dublin Core is a more generic scheme, defined by 15 basic descriptive elements. Its simplicity, flexibility and adaptability make it one of the most internationally used schemes. It is a basic description mechanism that can be used in all domains, for all types of resources. It is simple but powerful, easily extended, and able to work together with other specific solutions, for example the Dublin Core meta-data standard for institutional publications. For these reasons it was decided to extend it with a mixed proposal, adding the descriptors corresponding to the pedagogical aspects of LOM to the existing Dublin Core descriptors.
The specific LOM descriptors are: the difficulty level of the resource; the context, which describes the educational environment in which the resource is appropriate to use; the version and production status of the material; a description of the material from the pedagogical point of view; the semantic density, a subjective measure of the educational usefulness of the material relative to its size and/or duration; and the aggregation level, which defines the granularity of the material. However, none of these added LOM meta-data elements is defined as required, to avoid overloading the deposit of OERs into the repository. Table 1 shows the meta-data used for OER collections. In the future, it is expected that this meta-data will be fully generated through the peer collaboration activity planned in the learning-communities space associated with the repository.
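As an illustration of the mixed proposal, the following Python sketch builds a record combining Dublin Core descriptors with the optional LOM pedagogical fields listed above. The field names and validation rules are illustrative assumptions, not the repository's actual schema:

```python
# Dublin Core fields kept as a small illustrative subset; the LOM fields
# mirror the pedagogical descriptors described in the text. All names here
# are assumptions for the sketch, not the repository's real schema.
DC_FIELDS = {"dc.title", "dc.creator", "dc.subject", "dc.language"}
LOM_FIELDS = {  # all optional, to avoid overloading OER deposit
    "lom.difficulty", "lom.context", "lom.version",
    "lom.status", "lom.semanticDensity", "lom.aggregationLevel",
}

def make_oer_record(dc, lom=None):
    """Validate and merge DC descriptors (core) with optional LOM ones."""
    unknown = set(dc) - DC_FIELDS
    if unknown:
        raise ValueError(f"unknown DC fields: {unknown}")
    record = dict(dc)
    for key, value in (lom or {}).items():
        if key not in LOM_FIELDS:
            raise ValueError(f"unknown LOM field: {key}")
        record[key] = value
    return record

record = make_oer_record(
    {"dc.title": "Algebra I", "dc.language": "es"},
    {"lom.difficulty": "medium", "lom.aggregationLevel": 2},
)
```

Because every LOM descriptor is optional, a record with only Dublin Core fields is also valid under this sketch.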
A Concrete Action Towards Inclusive Education 421

It is also important to have a navigation mechanism within the repository that allows users to easily identify the structure of the materials available to them. In response to the needs of the different university faculties and schools that interact with the repository, an exclusive collection was opened for OERs, where the different faculties can host their resources and users can access them freely. For each faculty and service it is then necessary to maintain one area housing exercise materials and another for bibliographies. OERs are linked in this way with the classic institutional materials stored in the repository.

3 The OER Repository as an Accessible Digital Library

A digital library is an information system that allows the access and transfer of digital information, structured around collections of digital documents on which services are offered to users [5–8]. Digital libraries are the product of a deliberate collection-development strategy by library professionals, and they often hold content beyond institutional ownership. Institutional repositories may offer limited services to users compared with digital libraries, which include important service aspects such as reference, assistance and interpretation of contents, that is, staff support in the search for additional information.
However, according to Xia and Opperman [8], institutional repositories and digital libraries currently offer similar services, and the use of each term depends on the scope in which it is applied and therefore on the resources with which they wish to work.
In the case of our institutional repository, the Digital and Accessible Library collection (BIDYA [4]) is differentiated from the rest of the collections in two ways: (i) access to the BIDYA collection requires user authentication, since access is only open to people covered by the Marrakesh Treaty, and (ii) the materials available in accessible versions in the BIDYA collection are selected texts, in this first phase primary and secondary school textbooks. Figure 1 describes this process.
Unlike most institutional repositories, where all users can access the material without a user name and password, due to the Marrakesh Treaty the material can only be accessed by registered users. A button in the accessible portal of BIDYA gives direct access to the repository's user login.
The limited availability of study material in Braille, audio, electronic formats or enlarged characters is one of the greatest difficulties encountered by students with visual disabilities in the education system. This proposal consists of the creation of a book digitization system, available online through a repository of books and other materials in accessible formats. The Accessible Digital Library will allow universal access without geographic distinction, physical barriers, or travel restrictions.

Fig. 1. Process to insert a new user in BIDYA.

In order to efficiently retrieve materials from the BIDYA library, it is essential to have a set of meta-data for cataloging and subsequent retrieval of the information. It was decided to use the extended Dublin Core meta-data set due to its compatibility with other platforms, its adaptability to the materials and its simplicity of handling. The meta-data schema was extended to address two aspects: one corresponding to the set of meta-data needed to document the digitization process performed on the original material, and a second corresponding to the set of meta-data describing the level of accessibility achieved in the available material. This process was carried out through focus meetings between representatives of the National Union of the Blind of Uruguay, librarians and computer scientists.

Table 1. Uses of DC metadata for accessible resources.

dc.contributor.digitalizador   Person/institution that performs the scanning process
dc.contributor.corrector       Person/institution that performs the correction process
dc.type                        Revised
dc.relation.publishversion     Publisher's version
dc.identifier.citation         ISBN of the scanned book, or of the digital version sent by the publisher

As mentioned above, the organization of the information in the Accessible Library includes two communities: elementary school and high school. To facilitate access and search, each community includes different collections that group the documents by type of material. The elementary school community includes four collections: textbooks, short stories, didactic sheets and languages. The high school community contains two collections: textbooks and literature.
Concerning the accessibility of the resources, content providers transform the original documents into accessible formats, applying the Universal Design Principles (equitable use, flexibility in use, simple and intuitive use, perceptible information, tolerance for error, low physical effort, and size and space for approach and use). There are two main ways to obtain the resources: 1) scan the books, or 2) work with the digital file if it already exists. In general, the process includes: changes in typography and contrast; modification of tables (using simple tables with headers, avoiding divided or combined cells and blank columns and rows); and adding alternative text to the images, with a description when they provide relevant information.
With regard to the format of the documents included in the Accessible Library, the docx format is used because it has an open specification based on the XML markup language, in compliance with the repository's ordinance that only allows the deposit of digital resources in open formats. It also suits the requirements of the screen readers used by the end users of the Accessible Library.
With respect to the workflow, the line established for the repository was followed, respecting the different roles associated with the users (adding, editing, publication, administration) and preserving the final review by the administrators of the repository. The latter ensures the quality of the descriptions.
The registration protocol for authorized users was also defined, taking into account the decree that regulates this exception. The decree (N° 295/017, https://www.impo.com.uy/bases/decretos/295-2017) defined the figure of organizations authorized to register users; these organizations are the only ones that send the data necessary for the creation and authentication of the users of these collections.
This Accessible Digital Library is operating and available to elementary and high school students and, in some cases, to parents, tutors and teachers. As of December 2019 it hosts about 1000 resources, a number that is expected to increase soon. New collections will continue to be incorporated according to the needs of the target population, particularly those that favor access to and permanence in higher education.
Finally, it must be ensured that web access to the repository is itself accessible. For accessibility, we apply an evaluation methodology that follows the steps of the WCAG-EM proposed by the W3C: 1) define the scope of the evaluation, 2) explore the website, 3) select a representative sample, 4) evaluate the selected sample and, finally, 5) report the findings of the evaluation.
In step 1 we define the scope at the AA level, taking into account that several international laws recommend this level as the minimum level of accessibility required. In step 2 we explore the repository's website, and in step 3 we select the initial page of the repository as the sample. This selection is based on the consideration that the page where the user searches for resources is the most relevant for the recovery of and encounter with OERs, the fundamental objective of the repository. While a single page may seem unrepresentative of the general state of accessibility of the repository, the selected page is the gateway to finding and using resources, so its level of accessibility is crucial for the user experience. Finally, we evaluated this page manually and also using the TAW tool1. This analysis revealed a series of accessibility errors, which were resolved.
One of the problems encountered concerned the conformance criterion Non-text content (numbered 1.1.1 in the WCAG Level A specification), which requires that all non-textual content presented to the user have a textual alternative that serves the same purpose. The benefit of this criterion is that the information can then be interpreted through any sensory modality, for example by a screen reader. Its absence creates access barriers for blind users of screen readers and for people who have difficulty understanding the meaning of an image, among others. Despite probably being the best-known example of accessibility, this criterion shows very low compliance in several studies, as indicated by Adepoju and Shehu [10] and Flores et al. [9]. A comparison of the accessibility levels of institutional OER repositories can be found in Da Rosa and Motz [11].

1
TAW: http://www.tawdis.net/.
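A first automated pass for criterion 1.1.1 can be approximated with Python's standard library, flagging `img` elements that lack a non-empty `alt` attribute. This is only a sketch of the idea; a real audit relies on tools such as TAW and on manual review:

```python
from html.parser import HTMLParser

class AltTextChecker(HTMLParser):
    """Collects img tags that have no non-empty alt attribute (WCAG 1.1.1 sketch)."""

    def __init__(self):
        super().__init__()
        self.missing = []  # src values of images lacking a textual alternative

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            alt = attr_map.get("alt")
            if not alt or not alt.strip():
                self.missing.append(attr_map.get("src", "<no src>"))

def images_missing_alt(html):
    """Return the src of every <img> in `html` with a missing or empty alt."""
    checker = AltTextChecker()
    checker.feed(html)
    return checker.missing

# Hypothetical page fragment: the first image passes, the second fails.
page = '<img src="logo.png" alt="University logo"><img src="banner.png">'
```

Running `images_missing_alt(page)` would report only `banner.png`, since its `alt` attribute is absent.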

4 Conclusions and Future Work

The BIDYA Project is one of the first initiatives to develop an institutional OER repository in the framework of the implementation of the Marrakesh Treaty, which facilitates access to published works for people who are blind, visually impaired or otherwise unable to access printed text in Latin America. As a result of the activities carried out for the implementation of the accessible digital library BIDYA, guides to the process of digitization and correction of the materials that make them accessible were written and published as open access, and a program is currently underway with librarians for a Digital Literacy campaign for blind people.

Acknowledgement. This work was supported by the Innovation Sector Fund of the National
Agency for Research and Innovation (ANII) through the project ININ_1_2017_1_137280.

References
1. WIPO-Administered Treaties. https://www.wipo.int/treaties/en/ip/marrakesh/. Accessed 08 Jan 2020
2. Seroubian, M., de León Colibri, M.: Conocimiento libre repositorio institucional. Jornada de Difusión sobre el Acceso Abierto, Montevideo (2014)
3. United Nations: Convención sobre los derechos de las personas con discapacidad (2006). http://www.un.org/esa/socdev/enable/documents/tccconvs.pdf. Accessed 08 Jan 2020
4. Biblioteca Digital y Accesible (BIDYA) (2017). http://www.bibliotecaaccesible.ei.udelar.edu.uy/biblioteca-digital-y-accesible-2-bidya/. Accessed 08 Jan 2020
5. Borgman, C.L.: What are digital libraries? Competing visions. Inf. Process. Manag. 35(3), 227–243 (1999)
6. Guo, L.: On construction of digital libraries in universities. In: 2010 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT), vol. 1, pp. 452–456 (2010). https://doi.org/10.1109/iccsit.2010.5564750
7. Nguyen, S., Chowdhury, G.: Digital library research (1990–2010): a knowledge map of core topics and subtopics. In: Digital Libraries: For Cultural Heritage, Knowledge Dissemination, and Future Creation. Springer (2011). http://www.springerlink.com/content/21h17m2nh10krl1w/abstract/
8. Xia, J., Opperman, D.B.: Current trends in institutional repositories for institutions offering master's and baccalaureate degrees. Ser. Rev. 36(1), 10–18 (2010). https://doi.org/10.1016/j.serrev.2009.10.003
9. Flores Ch., J., Ruiz C., K.J., Castaño, N., Tabares M., V., Duque, N.: Accesibilidad en Sitios Web que Apoyan Procesos Educativos. In: Anales de la Novena Conferencia Latinoamericana de Objetos y Tecnologías de Aprendizaje, LACLO 2014 (2014). http://www.laclo.org/papers/index.php/laclo/article/viewFile/225/20. Accessed 08 Jan 2020
10. Adepoju, S., Shehu, I.: Usability evaluation of academic websites using automated tools. In: International Conference on User Science and Engineering (i-USEr) (2014)
11. da Rosa, S., Motz, R.: Tenemos Repositorios de REA Accesibles. Ediciones Universidad de Salamanca, Salamanca (2016)
12. Budapest Open Access Initiative (2002). JLIS.it 3(2), October 2012. ISSN 2038-1026. https://doi.org/10.4403/jlis.it-8629. https://www.jlis.it/article/view/8629. Accessed 08 Jan 2020
13. Hilera-González, J.R., Campo-Montalvo, E. (eds.): Guía para crear contenidos digitales accesibles: Documentos, presentaciones, vídeos, audios y páginas web, 1st edn. Universidad de Alcalá, Alcalá de Henares (2015)
Intelligent Systems and Machines
Cloud Computing Customer
Communication Center

George Suciu1,2, Romulus Chevereșan1, Svetlana Segărceanu1, Ioana Petre1(&), Andrei Scheianu1, and Cristiana Istrate1

1 R&D Department, BEIA Consult International, Peroni 16, 041386 Bucharest, Romania
{george,romulus.cheveresan,svetlana.segarceanu,ioana.petre,andrei.scheianu,cristiana.istrate}@beia.ro
2 Telecommunications Department, University POLITEHNICA of Bucharest, Bd. Iuliu Maniu 1-3, 061071 Bucharest, Romania
[email protected]

Abstract. Call centers are a cornerstone of today's market: nowadays nearly every enterprise runs a call center whose main objective is to handle clients' issues. This paper describes a communication system designed for automatic client interaction, called the “Cloud Computing Customer Communication Center”. The communication center comprises: a Unified Communications System (UCS) providing communication channels such as telephony, SMS, email, video, chat, etc.; a Data Processing Server (DPS), which supports the operations required by customers; and a Voice Interaction Module (VIM) for automatic communication with clients, consisting of a speech recognition module, a speech synthesis module and a dialogue management module that communicates through the MRCP (Media Resource Control Protocol) with the UCS and the DPS. The “Cloud Computing Customer Communication Center” integrates support modules for platform functionality, such as a redundant power supply, a cloud-based LAN/Wi-Fi transmission environment, online platform management and various cloud applications.

Keywords: Cloud Computing · Unified Communication System · Data Processing · Voice Interaction Module · MRCP

1 Introduction

The scope of the communication system is interaction with customers in call centers. Usually, customer interaction takes place through call centers, where agents of different organizations provide customer service over the phone. Call centers require centralized headquarters for receiving and transmitting a large volume of requests. Through the “Cloud Computing Customer Communication Center” system the call center is automated, with organizations using the infrastructure provided by the system. A “Voice Interaction Module” has been developed: an intelligent voice communication system to be integrated into a Unified Communications Platform. Using this solution, the client communicates with the Communication Center system in
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 429–438, 2020.
https://doi.org/10.1007/978-3-030-45697-9_42
430 G. Suciu et al.

Romanian, through natural language that is recognized and analyzed semantically. The customer is directly connected to the requested entity so that transactions take place as quickly as possible. Nowadays, there is a tendency to move from “Call Center” solutions to “Cloud Client” solutions which, in addition to “Call Center” functionality, integrate unified channel management solutions and communications applications, such as text-to-speech (TTS) and speech-to-text (ASR) applications.
The simplified scenario of an inbound call center (one that processes incoming calls) is the following: a client calls the number assigned to that call center and, once connected, accesses the IVR (Interactive Voice Response) system and identifies themselves. Through the IVR the client can access, or even complete, certain simple transactions automatically; if an operator is available and the accessed service was not completed, the client is redirected to that operator.
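The routing decision at the end of this scenario can be sketched as a small function. The service names and the set of automatable services below are hypothetical, chosen only to illustrate the branching logic:

```python
def route_call(service, automatable_services, operator_available):
    """Decide how an inbound call is handled after IVR identification.

    If the requested service can be completed automatically, the IVR
    handles it; otherwise the call is redirected to an operator when
    one is available, or queued when none is.
    """
    if service in automatable_services:
        return "completed-by-ivr"
    if operator_available:
        return "redirected-to-operator"
    return "queued"

# Hypothetical services the IVR can complete without a human agent.
AUTOMATABLE = {"balance-inquiry", "office-hours"}
```

For example, a balance inquiry is completed entirely by the IVR, while a request outside the automatable set is passed to an operator or queued.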
The main trends and technologies involved in today's call center configuration are natural language processing and artificial intelligence (AI), such as deep learning (DL). Voice-based interfaces are included in order to facilitate communication with users and improve the quality of the supplied services. This trend is illustrated in [1], which points out that there are tens of millions of devices in the world whose main interface is vocal. Such devices are operated through platforms that include speech recognition (ASR, speaker recognition) and speech synthesis (TTS) software. An important role of devices that use voice as the main interaction method (voice first) is to act as an intelligent agent or assistant for their users. Among the most popular applications are those for voice banking or voice payments, where AI companies such as Nuance and Personetics are leaders in implementing voice support solutions.
Using state-of-the-art techniques can solve recognition problems, such as speech recognition, with high accuracy. For example, DNNs (Deep Neural Networks) are a mainstream technique for speech recognition, and various open-source toolkits for developing voice-based systems have already implemented them, including Kaldi [1], Nuance [2] and PocketSphinx [3]. Being an active research area, it is difficult to keep such systems up to date with constant technological updates, which may involve source-code modifications or even rethinking the architecture of the application. For example, Kaldi [4, 5] has three different source trees for DNNs. The adoption of “Cloud Computing” technology allows immediate, permanent, convenient and on-demand network access to a shared resource base (server networks, storage resources, applications and services) that can be provisioned and released with minimal managerial effort or service-provider interaction.
The rest of the paper is organized as follows: Sect. 2 analyzes related work; Sect. 3 presents the developed system, including requirements and key performance indicators; Sect. 4 presents the testing of the functionality of the Cloud Computing Customer Communication Center; and Sect. 5 draws conclusions and envisions future work.
Cloud Computing Customer Communication Center 431

2 Related Work

The advantages of “Cloud” call center technology, according to [6], are not just about costs. This technology brings many other benefits, such as scalability, flexibility, ease of use and development, and reliability. Among the advantages reported by enterprises that have adopted the “Cloud Computing” model as a solution for the call center are ease of use, reliability in case of disaster, and an increased security level. The uses of this technology are therefore very varied: for example, calibrating the system according to the customer's needs and emotions, thereby improving staff training, recording and processing complaints, and ultimately improving the performance of the organization. This domain, also called audio mining, includes elements of speech recognition, semantic speech analysis and identification of voice characteristics [7].
Among the elements to be adopted by a successful call center are [8]:
1. Real-time speech analysis (Speech Analytics), which can refer to several aspects: (a) identifying vocabulary elements, (b) detecting emotions or prosody, (c) recognition of the voice tag.
2. Text analysis (Text Analytics) of organizational interactions with customers through SMS, email, fax, chat and messengers. This element processes and identifies keywords, allowing the detection of words or phrases related to a particular topic. Automated alerts related to the content of the processed text may be generated.
3. Chatbots: 2017 was probably the year when more and more centers adopted chatbot technology to handle a large volume of simple calls or requests. Chatbots will not replace human operators, but they will relieve them of some of their activities, allowing them to focus on more complex issues.
4. Implementation of MRCP (Media Resource Control Protocol), a standard for developers of telephony and voice-technology-based applications.
5. The adoption of “Cloud Computing” technology, which allows immediate, permanent, convenient and on-demand network access to a shared resource base.
Although there is a high degree of interest in robotized voice communication with customers in different fields, such as banking (ING), GSM (Vodafone) and utility service providers (Engie), an analysis of the current global and local status suggests that these applications are still limited to a restricted set of operations. The estimated performance of these applications is below 50%. However, it is estimated that over the next ten years, 50% of banking operations will be performed through Voice First solutions.
The methods mentioned above have two main disadvantages: they do not offer a unified communications center that unifies all channels (fixed and mobile telephony, fax, e-mail, Internet access, SMS, etc.) using TTS and ASR technologies, and the processed data cannot be accessed anytime, anywhere, from any Internet-connected terminal. In addition, in the analyzed methods used for verbal interaction with customers, the information extracted from the user's dialogue is not stored in specific data structures for later processing (Speech Analysis) [8].

3 Cloud Computing Customer Communication Center Description

This section presents the main requirements, architecture, implemented applications and key performance indicators of the “Cloud Computing Customer Communication Center” system.
A. Requirements
The main functional and technical requirements, based on the analysis presented in the related work section, are:
– Reducing personnel costs as a result of automating the interaction with customers (direct voice);
– Integrating a voice interaction module with the clients of the beneficiary organization; the module can be configured according to the scheme and type of dialogue adapted to the type of application;
– Using “Cloud Computing” technology, which introduces several advantages: scalability, flexibility, simplicity, ease of use/development, reliability in case of disasters, and security;
– Increasing the Service Level value, or maintaining it with a smaller number of call center agents;
– Decreasing the number of abandoned calls;
– Enhancing the quality of customer service by reducing waiting times;
– Real-time visualization and adjustment of the KPIs (Key Performance Indicators);
– Optimization of agent monitoring reports;
– Increasing agent satisfaction by offering agents the opportunity to work in their available time slots;
– Increasing retention;
– Reducing the work schedule;
– Performance monitoring.
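Two of these indicators can be computed directly from call logs. The sketch below uses one common, simplified definition of Service Level (share of answered calls picked up within a threshold) and of abandon rate; organizations may define both differently, so the formulas are assumptions for illustration:

```python
def service_level(answer_times, threshold_s=20):
    """Fraction of answered calls picked up within `threshold_s` seconds.

    `answer_times` is a list of per-call waiting times in seconds.
    """
    if not answer_times:
        return 0.0
    return sum(t <= threshold_s for t in answer_times) / len(answer_times)

def abandon_rate(offered_calls, abandoned_calls):
    """Share of offered calls abandoned before being answered."""
    return abandoned_calls / offered_calls if offered_calls else 0.0
```

With these definitions, four calls answered after 5, 15, 30 and 40 seconds give a Service Level of 0.5 at a 20-second threshold, and 10 abandoned calls out of 200 offered give an abandon rate of 5%.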

B. General architecture
Figure 1 illustrates the general architecture of the “Cloud Computing Customer Communication Center” system and its functional components. This architecture allows:
a) implementation exclusively in the Cloud, for reduced costs;
b) hybrid implementation, where end customers purchase hardware such as a PBX (Private Branch Exchange), IP phones, DECT handsets, etc.

Fig. 1. The basic architecture of the 5C system

The figure above shows the case of implementation exclusively in the Cloud (a) in blue. The calling agent (within a company/call center) benefits from 5C facilities through a SaaS (Software as a Service) service and/or a VPN (Virtual Private Network), and customers interact with the agent via the Internet or 2G/3G/4G mobile phone services (via a GSM-to-VoIP trunk gateway). The beneficiary can install and configure several pieces of equipment in the “Cloud”, for example the Unified Communications System, the Data Processing Server, the Voice Interaction Module, IP phones, DECT (Digital Enhanced Cordless Telecommunications) handsets, etc.
In the case of the hybrid implementation (b), the components in the green frames are added. Customers can make phone calls directly through their local PBX, reducing the Internet traffic between the Cloud and the headquarters. In this way, various problems such as QoS (Quality of Service) degradation in large-scale call centers are avoided. The first category of implementation, exclusively in the “Cloud” (a), is suited to small companies that cannot afford to purchase additional hardware and only pay a monthly subscription. The second variant (b), the hybrid implementation, can be used in large-scale call centers, where conversation quality is of great importance for voice recognition (VIM mode).
An intelligent voice communication system has been developed to be integrated into a Unified Communications Platform (UCP). Using this solution, the client communicates with the Communication Center System in Romanian, using natural language that is recognized and analyzed semantically [9]. The customer is directly connected to the requested entity so that transactions take place as quickly as possible. Figure 2 illustrates the architecture of the patent-pending system. The patent was published in the “Buletinul de Proprietate Industrială”, section “Brevete de invenție”, no. 11/2018.

Fig. 2. System architecture

The figure presented above integrates additional modules and the “Voice Interaction System”; these modules are described in the next subsection.
C. Voice processing applications implemented
The realization of the Communication Center and 5C Platform consisted in the implementation of several voice processing applications (Speech-to-Text ASR, Text-to-Speech TTS) and a Dialogue Management Module (M-DIAG) in Romanian, connected to a Unified Communication Platform, dedicated to several types of applications (fixed and mobile telephony, fax, e-mail, messaging, Internet communications, etc.) in various fields of activity (banks, local or central public administration, media agencies, retail chains, etc.). Finally, the natural voice dialogue solution is an interactive system with a voice-based interface, facilitating communication with users and improving the quality of the provided services.
The “Voice Interaction System” module consists of the following elements, as presented in Fig. 2:

– The Speech Recognition Module translates what the user speaks; its input is the user's voice and its output is the transcription of the speech;
– The Voice Synthesis Module generates the voice signal corresponding to the system's response to the user; its inputs are the responses generated by the Dialogue Management Module;
– The Dialogue Management Module is designed to generate appropriate responses, extract the information needed to fulfil user requests, and connect the user with a human operator where applicable;
– The back-end adapter (which contains the database and the data processing resources of the beneficiary) and the MRCP interface form the module implementing the communication protocols for transmitting messages to the systems performing specific functions;
– The call center agent (human operator) intervenes when the system cannot fulfill the user request.
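The interplay of these modules can be sketched as a minimal pipeline. The ASR and TTS functions below are stand-in stubs, not the actual speech engines or the MRCP protocol, and the intent table is hypothetical:

```python
def asr_stub(audio):
    """Stand-in for the Speech Recognition Module: audio -> transcript."""
    return audio["transcript"]  # a real ASR would decode the voice signal

def dialogue_manager(transcript, known_intents):
    """Generate a response, or escalate to a human agent when no intent matches."""
    for intent, response in known_intents.items():
        if intent in transcript.lower():
            return response, False  # handled automatically
    return "Transferring you to an agent.", True  # escalate

def tts_stub(text):
    """Stand-in for the Voice Synthesis Module: text -> synthesized speech."""
    return {"speech": text}

def handle_turn(audio, known_intents):
    """One dialogue turn: recognize, decide, synthesize."""
    transcript = asr_stub(audio)
    response, escalate = dialogue_manager(transcript, known_intents)
    return tts_stub(response), escalate

# Hypothetical intent -> response table for a banking-style dialogue.
INTENTS = {"balance": "Your balance is being retrieved."}
```

A turn matching a known intent is answered automatically; anything else is flagged for the call center agent, mirroring the escalation role described above.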
Additional modules can be integrated into the system, such as:
– A power supply system based on renewable energy sources, such as photovoltaic panels and/or wind turbines, with endurance ensured by an electric generator;
– A transmission environment built from a LAN and Wi-Fi components; this equipment can be managed locally or through a Cloud management module, and the environment must accept external Internet connections (VPNs);
– Multimedia connectors accepting audio, video and text sources coming from the GSM environment (4G/5G), VoIP (Voice over IP) and multimedia channels on web pages;
– Online management (of parameters of the transmission environment);
– Reports and a decision-support management system (ML – Machine Learning);
– Agents/customers interacting through fixed or wireless phones and smart devices (smartphones, tablets, PCs, etc.).

D. Key Performance Indicators


The “Cloud Computing Customer Communication Center” has the following
advantages:
– Includes a Voice Interaction Module that allows interaction with the clients of the
beneficiary organization; it can be configured according to the scope and scheme of
the dialogue, adapted to the type of application of different beneficiary
organizations;
– The Voice Interaction Module is customized for dialogue in Romanian;
– Implements the Speech Recognition Module (ASR) using the best-performing
technologies available on the market for processing and modeling speech
recordings, such as Deep Neural Networks (DNNs), which guarantee high
recognition performance. Hybrid schemes can also be used, an example being the
Deep Neural Network - Hidden Markov Model (DNN-HMM) [10];
436 G. Suciu et al.

– Allows flexibility in choosing the Speech Recognition Module (ASR) depending on
the amount of data available for training and the vocabulary used; DNN technology
provides excellent performance for a high amount of training material;
– Allows the utilization of two alternative operating schemes: supervised dialogue
through a rigid dialogue scheme and conversation by using a less restrictive
vocabulary specific to the beneficiary’s domain;
– The usage of Cloud Computing technology brings other benefits: scalability, flex-
ibility, simplicity, ease of use/development, security, reliability in case of disasters
[11].

4 Testing the Functionality of the Cloud Computing Customer Communication Center

A. IVR Voice Recognition Testing
This experiment was performed for testing the speech synthesis and recognition
modules. The test scenario involved the following actions:
– Call initiation to the dedicated IVR extension number;
– Playback of the welcome message;
– Speak the word/words to be recognized;
– Voice recognition is performed, and a result is returned;
– The obtained result is transmitted to the voice synthesis server;
– The message obtained from speech synthesis is played;
– The obtained result is also confirmed by voice recognition (“Was the word
recognized correctly?” is answered with “YES” or “NO”; if the answer is “NO”, the
flow returns to the voice recognition loop for a new attempt).
Based on the diagram presented in Fig. 3, the following elements of the IVR project
for integrating the voice recognition service were defined:
– an MRCP session with a unique ID, initiated by calling the ASRChannelStart
function;
– there are three functional blocks for defining a voice recognition vocabulary. The
first block sets the path to the .SRGS file, the second block labels the vocabulary,
and the third block sends the .SRGS file for the vocabulary activation. The
vocabulary for the DTMF recognition, the vocabulary of words and numbers, as
well as the “YES or NO” vocabulary for the confirmation of the results are defined;
– an informational message is played using the PlayAsync function, which allows the
execution of instructions during playbacks, and when the “listening loop” is
reached, the message playback is stopped if a voice is detected (using the VAD
function) to perform the voice recognition;
– the “Media-Type” header of the MRCP messages to be transmitted is initialized
with “audio/x-alaw-basic”, which sets the compression law used;

Fig. 3. The implementation diagram of the voice recognition service

– the specific header of the MRCP messages is set as “Save-Waveform” = “false” so
that the audio stream transmitted to the ASR server is not saved;
– a series of timing parameters used in the “listening loop” and in the speech
recognition process are transmitted, such as: No Input Timeout – the maximum time
of voice inactivity after which it is decided that the recognition was performed
completely; Speech Incomplete Timeout – the voice idle time allowed before the
next utterance of a word when a partial result has not been matched in any
vocabulary; Speech Complete Timeout – the waiting time for a new utterance when
the previously spoken word has been matched in one of the vocabularies;
Recognition Timeout – the maximum time during which voice recognition is
performed; Sensitivity – sets the sensitivity level.
– the incoming audio stream within the “listening loop” is continuously processed
based on the previously set parameters; when the MRCP server sends a
recognition-complete acknowledgment message, the “listening loop” is exited;
– the result transmitted to the IVR is in XML format. At this stage, the actual result
with which a variable is initialized is extracted;
– the obtained result is concatenated within an SSML file (example: SSML code +
“the recognized word is” + ASRResult + SSML code);
– the created SSML file is transmitted using the SpeakSSML function to the TTS
server and the returned audio message is played, such as “The recognized word is
ASRResult”;
– the message “Was the word recognized correctly?” is played and the flow passes
into the “listening loop”;
– the result received from the ASR server is entered as an input parameter in a block
of type “if”. If the result is “YES” the call ends, if the result is “NO” the whole
procedure is repeated.
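The recognition-and-confirmation flow defined above can be approximated by the following stand-alone sketch. The MRCP and SSML calls (ASRChannelStart, SpeakSSML, etc.) are replaced here by injected lists of results, so only the control flow is shown; this is a simplified reconstruction, not the patented implementation.

```python
# Simplified reconstruction of the IVR confirmation loop described above.
# Recognition results and YES/NO confirmations are injected as lists so the
# flow runs without an MRCP server; real calls would replace the comments.

def confirmation_loop(recognitions, confirmations, max_attempts=3):
    """Repeat voice recognition until the caller confirms with YES."""
    for attempt in range(max_attempts):
        word = recognitions[attempt]                 # result of the "listening loop"
        ssml = f"The recognized word is {word}"      # concatenated into an SSML file
        # SpeakSSML(ssml) would be sent to the TTS server and played here,
        # followed by the prompt "Was the word recognized correctly?"
        if confirmations[attempt] == "YES":
            return word, attempt + 1                 # call ends on confirmation
    return None, max_attempts                        # give up after max_attempts

word, tries = confirmation_loop(["tree", "three"], ["NO", "YES"])
```

The `max_attempts` guard is an assumption; the paper does not state how many retries the deployed system allows.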

5 Conclusions

This paper provides an analysis of related work and presents the developed system,
including requirements and key performance indicators, taking into account global
market needs regarding new ways of ensuring the most efficient communication
channels. The paper was derived from the project POC-5C and the resulting
patent “Cloud Computing Customer Communication Center” for the improvement of
such communication technologies. As future work, we envision providing
measurement results from a practical implementation.

Acknowledgements. This work has been supported by a grant of the Ministry of Innovation and
Research, POC-5C project.

References
1. Ravanelli, M., Parcollet, T., Bengio, Y.: The PyTorch-Kaldi speech recognition toolkit. In:
ICASSP IEEE International Conference on Acoustics, Speech and Signal Processing
(ICASSP), pp. 6465–6469 (2019)
2. Këpuska, V., Bohouta, G.: Comparing speech recognition systems (Microsoft API,
Google API and CMU Sphinx). Int. J. Eng. Res. Appl. 7(03), 20–24 (2017)
3. Huggins-Daines, D., Kumar, M., Chan, A., Black, A.W., Ravishankar, M., Rudnicky, A.I.:
PocketSphinx: a free, real-time continuous speech recognition system for hand-held devices.
In: Proceedings of ICASSP, pp. 1–8 (2006)
4. Plátek, O.: Speech recognition using Kaldi. Master's thesis, Charles University (2014)
5. Kaldi documentation. http://kaldi-asr.org/doc/
6. Suciu, G., Toma, Ş.A., Cheveresan, R.: Towards a continuous speech corpus for banking
domain automatic speech recognition. In: 2017 International Conference on Speech
Technology and Human-Computer Dialogue (SpeD), Bucharest, pp. 1–6 (2017)
7. Gergely, T., Halmay, E., Szőts, M., Suciu, G., Cheveresan, R.: Semantics driven intelligent
front-end. In: 2017 International Conference on Speech Technology and Human-Computer
Dialogue (SpeD), Bucharest, pp. 1–6 (2017)
8. Jurafsky, D., Martin, J.H.: Speech and Language Processing, 2nd edn. Pearson Education
Inc., Prentice Hall, London (2009)
9. Toma, S.A., Stan, A., Pura, M.L., Bârsan, T.: MaRePhoR – an open access machine-
readable phonetic dictionary for Romanian. In: SpeD (2017)
10. Mustafa, M., Allen, T., Appiah, K.: A comparative review of dynamic neural networks and
hidden Markov model methods for mobile on-device speech recognition. Neural Comput.
Appl. 31(2), 891–899 (2019)
11. Shah, N.B., Thakkar, T.C., Raval, S.M., Trivedi, H.: Adaptive live task migration in cloud
environment for significant disaster prevention and cost reduction. In: Information and
Communication Technology for Intelligent Systems, pp. 639–654 (2019)
International Workshop on Healthcare
Information Systems Interoperability,
Security and Efficiency
A Study on CNN Architectures for Chest
X-Rays Multiclass Computer-Aided Diagnosis

Ana Ramos1(&) and Victor Alves2

1 Department of Informatics, School of Engineering, University of Minho,
Braga, Portugal
[email protected]
2 Algoritmi Centre, University of Minho, Braga, Portugal
[email protected]

Abstract. X-rays are the most commonly used medical images and are
involved in all areas of healthcare because they are relatively inexpensive
compared to other modalities and can provide sensitive results. The interpre-
tation by the radiologist, however, can be challenging because it depends on his
experience and a clear mind. There is also a lack of specialized physicians,
mainly in the least developed areas, which increases the need for alternatives to
X-ray analysis. Recent research shows that the development of Deep Learning
based methods for chest X-rays analysis has the potential to replace the
radiologists' analysis in the future. However, most of the published DL algorithms
were developed to classify a single disease. We propose an ensemble of Deep
Neural Networks that can classify several classes. In this work, the network was
used to classify five chest diseases: Atelectasis, Cardiomegaly, Consolidation,
Edema, and Pleural Effusion. An AUC of 0.96 was achieved with the training
data and 0.74 with the test data.

Keywords: Chest X-rays · Deep Learning · Medical imaging

1 Introduction

Nowadays, diagnostic imaging techniques such as Computed Tomography (CT),
X-rays, or Magnetic Resonance Imaging (MRI) are commonly used in healthcare to
diagnose, plan and control the treatment and monitoring of disease progression [1].
Chest X-rays are still among the most widely used clinical exams for diagnosing or
monitoring chest diseases such as pneumonia, pulmonary edema or cardiomegaly [2].
The availability of radiological equipment, a sensitive result and the low costs promote
their use [3, 4]. The radiologist’s interpretation can be quite challenging due to their
heavy workload burden and the subjectivity of X-rays reading. Interest in the combi-
nation of image processing technology with medical imaging and machine learning
methods has increased over the years [5–11].
Convolutional Neural Networks (CNNs) have been extremely successful in providing
Deep Learning (DL) solutions for healthcare [12–14]. These networks are able to detect
various diseases and use different medical images, such as the detection of diabetic
retinopathy in fundus photographs or detection of skin cancer from digital photographs

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 441–451, 2020.
https://doi.org/10.1007/978-3-030-45697-9_43

of skin lesions [15, 16]. Recent approaches use Deep Neural Networks (DNNs)
pre-trained on non-medical images for medical applications, a technique known as
Transfer Learning. Lakhani et al. studied the use of DNNs, namely AlexNet and
GoogLeNet, pre-trained with the ImageNet dataset, to detect pulmonary tuberculosis
in chest X-rays, and achieved satisfactory results, an AUC of 0.99 [6].
The analysis of X-rays images is a crucial task for radiology experts. Most DL
classification approaches perform a binary classification, i.e., detects a single label.
Stephen et al. [7] investigated the use of a DL approach to detect pneumonia in chest
X-rays and achieved an accuracy of 95.31%. Liu et al. [8] used chest X-rays images to
detect tuberculosis and achieved an accuracy of 85.68%. Also, Yates et al. [9], studied
an approach to detect chest anomalies in X-rays and achieved an accuracy of 94.60%.
However, multiclass classification can also be performed, e.g., Yaniv et al. [10] used a
CNN pre-trained with a non-medical dataset (ImageNet) to detect pleural effusion,
cardiomegaly, and mediastinal enlargement in chest X-rays, and achieved an AUC of
0.86.
According to a study published in JAMA Network Open in March 2019, an
Artificial Intelligence (AI) algorithm was able to analyze chest X-rays and classify the
diseases even better than radiologists. The algorithm used was the Lunit Insight for
Chest Radiography and was trained using 54,221 chest X-rays with normal findings
and 35,613 chest X-rays with abnormal findings. The validation dataset contained 486
normal chest X-rays and 529 chest radiographs with abnormal results. To compare the
performance of the AI algorithm with that of radiologists, five non-radiology
physicians, five board-certified radiologists, and five radiologists examined
a subset of the validation dataset [17].

Table 1. Performance of the AI algorithm and of radiologists in chest X-rays analysis.

                      Radiologists  AI algorithm  Radiologists with assistance of AI algorithm
Image classification  0.814–0.932   0.979         0.904–0.958
Lesion localization   0.781–0.907   0.972         0.873–0.938

Table 1 shows that the AI algorithm performed significantly better than the radi-
ologists. The emergence of publicly available annotated medical image datasets has
been a powerful mechanism for the development of DL based diagnostic models since
one of the main limitations has been the lack of labelled data. The public dataset used in
this work was the CheXpert dataset [18]. This dataset is also used for the CheXpert
Challenge, which aims to classify five chest disorders. Our proposed ensemble of Deep
Neural Networks also perform the classification of five labels, but the number of classes
can be easily extended.

2 Materials

The dataset used was the CheXpert (Chest eXpert), which contains 224,316 chest X-
ray images of 65,240 patients. The images were collected from October 2002 to July
2017 in Stanford Hospital, along with medical reports from radiologists. The dataset
contains the already separated train and validation data with 224,113 and 203 X-ray
images, respectively, each with frontal and lateral views [18] (Fig. 1).

Fig. 1. Example of images in the CheXpert dataset.

The label extraction from the radiological reports was done in three steps: “Mention
Extraction”, “Mention Classification” and “Mention Aggregation”. The first, “Mention
Extraction”, summarizes the main findings in the section Impression of the reports [18].
The Impression is one of the most important sections in the reports of the radiologists
[19]. There, the professionals summarize the clinical impression, relevant clinical
information and laboratory findings achieved by all image features and make conclu-
sions and suggestions about the patient’s health [18, 20]. The second step, “Mention
Classification” uses the synopses sentences of “Mention Extraction” and classifies the
mentions as Positive if the observation showed evidence of a pathology; Negative if
there are no pathological findings; and Uncertain, expressing uncertainty and
ambiguity of the report. Finally, “Mention Aggregation” groups all 14 labels,
consisting of “No Finding”, “Support Devices” and 12 pathologies. For each of the
14 labels, one of
the following values can be assigned:
• Positive, 1, if it had at least one positive mention.
• Negative, 0, if it had at least one negative mention.
• Uncertain, u, if it had no positive mentions and at least one uncertain mention.
• Blank, if it had no mention on the observation.
The label “No Finding” was positive when the report did not have a disorder classified
as Positive or Uncertain. Only 5 of the 14 labels were used in this work: Atelectasis,
Cardiomegaly, Consolidation, Edema, and Pleural Effusion. Each one was assigned as
Positive, Negative, or Uncertain. This was also the labelling used in the CheXpert
Challenge [18, 21].

3 Methodology

In this work, various experiments were performed to classify chest diseases. The tested
neural networks were trained using the CheXpert dataset to detect five different chest
disorders. Since the images are not all the same size, they were all resized to
320 × 320 pixels. The frontal and lateral X-ray views were processed separately, i.e.
each network inputs are only from a specific view. In the first experiments, only the
frontal view was considered, as it contains more information than the lateral view.
Besides, in the frontal view, both lungs are visible and there are more cases available
than in the lateral view.
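As a concrete illustration of the resizing step, a minimal nearest-neighbour implementation is shown below; the input dimensions are invented for illustration, and a production pipeline would normally use a library resampler with anti-aliasing rather than this sketch.

```python
# Minimal nearest-neighbour resize to 320 x 320 pixels, standing in for the
# preprocessing step described above. The input size is an arbitrary example.
import numpy as np

def resize_nearest(img: np.ndarray, size=(320, 320)) -> np.ndarray:
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]   # source row for each output row
    cols = np.arange(size[1]) * w // size[1]   # source column for each output column
    return img[rows][:, cols]

xray = np.zeros((2048, 1800), dtype=np.uint8)  # a raw chest X-ray (made-up size)
small = resize_nearest(xray)                   # shape (320, 320)
```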

3.1 Data Analysis


Figure 2 illustrates the distribution of Positive cases for each class in terms of training
and validation data for frontal and lateral views.

Fig. 2. Distribution of Positive cases of each class for frontal and lateral views.

Both the training and validation data are unbalanced. To mitigate this problem, an
asymmetric loss function was used which mapped the class indexes to a weighting
value, based on the class frequency, Eq. 1.

Weight_i = N_total / (N_c × N_i)                                  (1)

where i denotes the index of the class, N_c denotes the number of classes, N_i denotes
the number of cases of a specific class and N_total denotes the total number of cases.
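Assuming Eq. 1 denotes the balanced inverse-frequency weight N_total / (N_c × N_i) (the same heuristic scikit-learn popularized for "balanced" class weights), it can be computed as below; the case counts are invented for illustration, not the CheXpert figures.

```python
# Inverse-frequency class weights, Weight_i = N_total / (N_c * N_i):
# rare classes receive weights above 1, frequent classes below 1.

def class_weights(counts):
    n_total = sum(counts.values())   # total number of cases
    n_c = len(counts)                # number of classes
    return {cls: n_total / (n_c * n_i) for cls, n_i in counts.items()}

counts = {"Atelectasis": 300, "Cardiomegaly": 120, "Consolidation": 60,
          "Edema": 320, "Pleural Effusion": 450}   # made-up counts
weights = class_weights(counts)
```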

3.2 Deep Neural Networks


In this study, we used several CNN architectures to classify chest X-rays images into
five different classes. For the first experiment, we created a typical CNN architecture,
Model 1, which is illustrated by Fig. 3. Model 1 is based on convolution and pooling
layers, ending with three Fully Connected Layers (FCL) to classify the images based on
extracted features. The last FCL has five nodes with a sigmoid activation function
corresponding to the five chest X-rays classes to be classified.

Fig. 3. Model 1 and Model 3. conv stands for a convolutional layer and FC for a Fully
Connected layer.

Model 2 was achieved by replacing the typical convolutional layers in Model 1
with Depthwise Separable Convolutions. These types of layers were first introduced in
2014 in an International Conference on Learning Representations (ICLR) presentation
and achieved good results in image classification models [22–24].
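The parameter saving can be made concrete with a small counting sketch (biases omitted): a depthwise step with one k×k filter per input channel plus a 1×1 pointwise step replaces the full k×k×C_in×C_out filter bank of a standard convolution.

```python
# Learning-parameter counts (biases omitted) for a standard convolution
# versus a depthwise separable convolution with the same shapes.

def standard_conv_params(k, c_in, c_out):
    return k * k * c_in * c_out        # one k x k x c_in filter per output channel

def separable_conv_params(k, c_in, c_out):
    depthwise = k * k * c_in           # one k x k filter per input channel
    pointwise = c_in * c_out           # 1 x 1 convolution mixing the channels
    return depthwise + pointwise

print(standard_conv_params(3, 64, 128))   # 73728
print(separable_conv_params(3, 64, 128))  # 8768
```

For this 3×3, 64→128 layer the separable variant needs roughly 8× fewer parameters, which is the effect the paper reports between Model 1 and Model 2.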
The next architecture studied was a CNN similar to Model 1, whose first two
convolution layers were initialized with weights from a VGGNet model pre-trained on
ImageNet, transferring knowledge from a different domain and task [25]. Previous
studies in the literature applied transfer learning, in which a CNN pre-trained on
non-medical data was further trained with medical data [9, 10]. Because there is a lack
of labeled data in medical image analysis, transfer learning methods allow the model to
begin training with prior knowledge about important image features. Model 3 also ends
with a five-node FCL with a sigmoid activation function.
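The weight-seeding idea behind Model 3 can be sketched framework-agnostically: copy the pre-trained arrays for the first layers into the new model and leave the remaining layers to be trained from scratch. The layer names and shapes below are invented stand-ins, not the actual VGGNet tensors.

```python
# Hypothetical sketch of transfer learning as used for Model 3: the first two
# convolutional layers start from pre-trained weights, the rest from scratch.
import numpy as np

def transfer_layers(pretrained, target, layer_names):
    """Copy the named layers' weights from the pre-trained model."""
    for name in layer_names:
        target[name] = pretrained[name].copy()
    return target

rng = np.random.default_rng(0)
vgg_like = {"conv1": rng.normal(size=(3, 3, 1, 64)),   # stand-in for VGGNet weights
            "conv2": rng.normal(size=(3, 3, 64, 64))}
model3 = {"conv1": np.zeros((3, 3, 1, 64)),
          "conv2": np.zeros((3, 3, 64, 64)),
          "conv3": np.zeros((3, 3, 64, 128))}           # still trained from scratch
model3 = transfer_layers(vgg_like, model3, ["conv1", "conv2"])
```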
For the final experiment, Model 4, we used an Inception-v3 based network adapted
from Szegedy et al. [26]. It has lower computational costs and a smaller number of
learning parameters than Model 3, dropping from 217,884,997 to 21,813,029 learning
parameters. Inception-based networks are a good option in situations of scarce
computational resources [27].

3.3 Evaluation
The CheXpert paper [18] describes various approaches on using Uncertain instances
during the training phase. Our study used the binary mapping approach. With binary
mapping, the Uncertain values can be mapped to Negative (U-zeros) or Positive (U-
ones). It also introduces the Mixed approach, which uses the U-ones for Atelectasis,
Edema and Pleural Effusion, and U-zeros for Cardiomegaly and Consolidation.
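The three mappings can be expressed as a small lookup. The class partition below is the Mixed split quoted above, and the 0/1/"u" label encoding follows the dataset description in Sect. 2; the function name itself is illustrative.

```python
# Binary mapping of uncertain ("u") labels. Under the Mixed approach,
# U-ones classes map uncertainty to Positive (1) and U-zeros classes to
# Negative (0); uniform U-ones / U-zeros apply one rule to all classes.

U_ONES_CLASSES = {"Atelectasis", "Edema", "Pleural Effusion"}  # Mixed: u -> 1
U_ZEROS_CLASSES = {"Cardiomegaly", "Consolidation"}            # Mixed: u -> 0

def map_uncertain(cls, value, approach="Mixed"):
    if value != "u":
        return value                         # 0 or 1 passes through unchanged
    if approach == "U-ones":
        return 1
    if approach == "U-zeros":
        return 0
    return 1 if cls in U_ONES_CLASSES else 0  # Mixed
```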
Since the loss function used is asymmetric and the different approaches lead to a
different distribution of cases, Table 2 presents the weights of each class for frontal and
lateral views, calculated using Eq. 1.

Table 2. Weights for each class.

View     Binary mapping  Atelectasis  Cardiomegaly  Consolidation  Edema   Pleural effusion
Frontal  Mixed           0.8188       2.0861        3.7575         0.7933  0.5641
         U-zeros         1.2965       1.6477        2.9679         0.7757  0.5011
         U-ones          0.9231       1.8278        1.4721         0.8944  0.6360
Lateral  Mixed           0.7441       1.5503        3.1136         1.4997  0.4943
         U-zeros         1.1450       1.1580        2.3256         1.6282  0.4507
         U-ones          0.8700       1.3118        1.2696         1.7534  0.5779

Mixed was the first binary mapping approach used in this work. The results
obtained were compared to the best AUC values of the CheXpert paper [18]. Several
experiments were evaluated and analyzed, testing different network architectures, the
binary mapping approaches, and the effect of using the Leaky ReLU function instead of
the ReLU function.

4 Results

All performed experiments used the SGD optimizer and the Categorical Cross-entropy
loss function. A callback was used to trigger an early stop when three consecutive
epochs achieved the same loss result.
Table 3 presents the parameters used in the various tests. The results obtained by
training the network with frontal data and the Mixed approach are given in Table 4,
where ACC stands for accuracy and Diff stands for the difference between the mean of
the AUC scores obtained in this study and the mean of the AUC scores published in the
CheXpert paper.

Table 3. Parameters used in the tests.

Parameter            Value
Batch size           22
Learning rate        1 × 10⁻³
Learning rate decay  1 × 10⁻⁶
Number of epochs     120

Table 4. Results using frontal images and the Mixed approach.

Binary mapping  Model    Loss    AUC     ACC     Diff
Mixed           Model 1  0.8805  0.8573  0.7745  0.2180
                Model 2  1.2395  0.9093  0.6879  0.1860
                Model 3  1.0813  0.9640  0.6949  0.1750
                Model 4  1.0818  0.9164  0.7308  0.2000

When analyzing the AUC score, Model 3 performed best, followed by Model 4. For
this reason, these two models were also tested considering the U-zeros and the U-ones
approaches, Table 5.

Table 5. Results using frontal images and the U-zeros/U-ones approaches.

Binary mapping  Model    Loss    AUC     ACC     Diff
U-zeros         Model 3  0.8467  0.9590  0.7546  0.2110
                Model 4  0.9643  0.9112  0.7474  0.2050
U-ones          Model 3  1.4102  0.9371  0.6521  0.1940
                Model 4  1.6568  0.8946  0.6332  0.1820

It can be seen from Table 5 that a better AUC value was obtained for Model 3
using the U-zeros approach. The results of training with the Leaky ReLU, using
α = 0.3, and with frontal data are shown in Table 6. The Mixed approach was
considered, as it performed best in the previous tests.

Table 6. Results obtained using the Leaky ReLU activation function and frontal data.

Binary mapping  Model    Loss    AUC     ACC     Diff
Mixed           Model 3  1.1248  0.9558  0.6908  0.2170
                Model 4  1.0821  0.9561  0.6960  0.1810

The results obtained when testing the different binary mapping approaches using
Model 3 and Model 4 with lateral images are shown in Table 7. Since the best overall
performance in frontal images was achieved with ReLU activation, it was also used for
the lateral images.

Table 7. Results obtained using the ReLU function and lateral images.

Binary mapping  Model    Loss    AUC     ACC     Diff
Mixed           Model 3  0.7808  0.9684  0.7645  0.0656
                Model 4  0.9033  0.8737  0.7738  0.1100
U-zeros         Model 3  0.6080  0.9378  0.8157  0.1290
                Model 4  1.8714  0.7434  0.4125  0.2470
U-ones          Model 3  1.0657  0.9202  0.7229  0.0656
                Model 4  1.4285  0.7607  0.6963  0.0766

Table 8 presents our and the CheXpert paper AUC results for each class, when
predicted using the validation data while considering the lateral and frontal views. For
our results Model 3 was used.

Table 8. Predicted results with validation data.

Classes           AUC     CheXpert AUC  Diff
Atelectasis       0.5761  0.8580        0.2819
Cardiomegaly      0.7393  0.8540        0.1147
Consolidation     0.7636  0.9390        0.1754
Edema             0.7860  0.9410        0.1550
Pleural effusion  0.8495  0.9360        0.0865
Average           0.7429  0.9056        0.1627

5 Discussion

Depthwise Separable Convolutions, used in Model 2, are efficient compared to the
standard convolutional layers used in Model 1. These layers apply a single filter to each
input channel and then create a linear combination of the outputs to generate new
features for the next layer, and they achieved a better AUC. In addition, the
computational cost decreased as expected theoretically, since the number of learning
parameters dropped from 97,225,076 to 97,167,909. Chollet [22] and Chen et al. [28]
also found Depthwise Separable Convolutions effective, because these layers showed
more efficient use of the learning parameters and faster processing than typical
convolutional layers.
In Model 3, using the Leaky ReLU function instead of the ReLU function had
almost no effect on the results. Since the ReLU function inactivates neurons whose
input is negative, this suggests that the information blocked by the ReLU function was
irrelevant to the learning performance of Model 3. However, Model 4 performed better
with Leaky ReLU, suggesting that in this case the blocked information was relevant.
When using the Leaky ReLU, this information was passed to the next layers and the
learning capacity of the network improved. Wang et al. [29] also achieved better
performance in detecting Alzheimer's disease when using a network with the Leaky
ReLU instead of the ReLU activation function. However, the Leaky ReLU has a higher
computational cost than the ReLU function.
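The behavioural difference discussed here is easy to see numerically: ReLU zeroes every negative activation, while Leaky ReLU with α = 0.3 (the value used in these experiments) only attenuates them, so the "blocked" information survives.

```python
# ReLU versus Leaky ReLU (alpha = 0.3): negative inputs are zeroed by ReLU
# but only scaled down by Leaky ReLU.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.3):
    return np.where(x >= 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
relu(x)        # values: 0.0, 0.0, 0.0, 1.5
leaky_relu(x)  # values: -0.6, -0.15, 0.0, 1.5
```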
Overall, Model 3 performed better than Model 4. Comparing the various binary
mapping approaches, U-ones was the worst in both models and Mixed was the best.
The approaches consider different instances as Positive or Negative, which affects the
number of samples assigned to the different classes. However, all binary mapping
approaches lead to unbalanced data. U-ones achieved the worst AUC scores, indicating
that a bigger dataset does not always perform better than a smaller one. Assuming the
Uncertain instances are Positive can result in providing false positive information to the
network and consequently decrease model performance. In a medical context, it may be
more serious to use U-zeros, since

this will deliver false negative information to the network. Considering a sick person as
healthy is more serious than the opposite.
For the lateral view images, just as for the frontal view ones, Model 3 and the
Mixed approach provided the best results. The lateral view images achieved a better
AUC than the frontal view ones, which was not expected since there were fewer
samples. Also, the lateral view of X-rays does not allow as good a visualization as the
frontal view, since it shows a smaller area of the chest and the lungs overlap. A very
similar study on the detection of chest diseases, by Rubin et al. [30], also found a better
AUC in lateral chest X-rays than in frontal chest X-rays.

Table 9. Average of the best overall results for training and validation. The same
CheXpert AUC was used for train and validation because the training values were not specified.

Phase       Average AUC  Average CheXpert AUC  Diff
Train       0.9662       0.9056                +0.0610
Validation  0.7429       0.9056                −0.1627

Comparing our training results with those of the CheXpert paper yielded nearly the
same values, but the validation results are quite divergent (Table 9). On October 16,
2019, the final position in the leaderboard of the CheXpert Challenge achieved an AUC
of 72.70%, while our best result achieved 74.29%. The results of the CheXpert paper
were used as a benchmark. It was not the goal of this work to make an absolute
comparison of our results with theirs, but to compare how different DL techniques
perform in classifying chest X-rays.

6 Conclusions

Several CNN architectures, hyperparameters and labelling metrics were tested. The
best performing architecture was achieved using a transfer learning technique. The first
two convolution layers of the CNN were initialized with weights from a VGGNet
model pre-trained on ImageNet. Transfer learning has proven to be a good choice, as
the CheXpert dataset has a small dimension. It allowed the network to start the training
process with prior knowledge of important features, even coming from another domain
(i.e. ImageNet dataset).
Artificial intelligence has the potential to facilitate or even replace the diagnosis of
patients by performing tasks such as detection, qualification and quantification. The use
of artificial intelligence in medical imaging will allow professionals to spend more
time communicating with patients or deliberating with colleagues, as they will no
longer be overloaded by the number of medical exams to be analysed. This work
intends to be a contribution to the use of DL techniques for multilabel classification in
medical imaging analysis, as it is possible to identify findings and detect patterns
efficiently. Medical professionals should be able to use their time to treat patients
and not spend it treating medical images.

Acknowledgements. This work has been supported by FCT – Foundation for Science and
Technology within the Project Scope: UIDB/00319/2020. We gratefully acknowledge the sup-
port of the NVIDIA Corporation with their donation of a Quadro P6000 board used in this
research.

References
1. Hill, D.L., Batchelor, P.G., Holden, M., Hawkes, D.J.: Medical image registration. Phys.
Med. Biol. 46, 1–45 (2001)
2. NHS England. Diagnostic Imaging Dataset. https://www.england.nhs.uk/statistics/statistical-
work-areas/diagnostic-imaging-dataset/. Accessed 17 Apr 2019
3. Benseler, J.: A pocket guide to medical imaging. In: The Radiology Handbook. Ohio
University Press, Ohio (2006)
4. Waseda, Y., Matsubara, E., Shinoda, K.: X-ray Diffraction Crystallography: Introduction,
Examples and Solved Problems. Springer, Heidelberg (2011)
5. Ballard, D., Sklansky, J.: Tumor detection in radiographs. Comput. Biomed. Res. 6, 299–
321 (1973)
6. Lakhani, P., Sundaram, B.: Deep learning at chest radiography: automated classification of
pulmonary tuberculosis by using convolutional neural networks. Radiology 284, 574–582
(2017)
7. Stephen, O., et al.: An efficient deep learning approach to pneumonia classification in
healthcare. J. Healthcare Eng. 2019, 7 (2019)
8. Liu, C., et al.: TX-CNN: detecting tuberculosis in chest X-ray images using convolutional
neural network. In: 2017 IEEE International Conference on Image Processing (ICIP). IEEE
(2017)
9. Yates, E.J., Yates, L.C., Harvey, H.: Machine learning “red dot”: open-source, cloud, deep
convolutional neural networks in chest radiograph binary normality classification. Clin.
Radiol. 73(9), 827–831 (2018)
10. Yaniv, B., Diamant, I., Wolf, L., Lieberman, S., Konen, E., Greenspan, H.: Chest pathology
detection using deep learning with non-medical training. In: IEEE 12th International
Symposium on Biomedical Imaging (ISBI), New York (2015)
11. Pan, I., Agarwal, S., Merck, D.: Generalizable inter-institutional classification of abnormal
chest radiographs using efficient convolutional neural networks. J. Digit. Imaging 32, 888–
896 (2019)
12. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with
region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39, 1137–1149 (2017)
13. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmen-
tation. IEEE Trans. Pattern Anal. Mach. Intell. 39, 640–651 (2017)
14. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image
recognition. In: ICLR, vol. 6 (2015)
15. Ting, D., Cheung, C., et al.: Development and validation of a deep learning system for
diabetic retinopathy and related eye diseases using retinal images from multiethnic
populations with diabetes. JAMA 318, 2211–2223 (2017)
16. Esteva, A., Kuprel, B., Novoa, R., Ko, J., Swetter, S., Blau, H., Thrun, S.: Dermatologist-
level classification of skin cancer with deep neural networks. Nature 542, 115–118 (2017)
17. Ridley, E.: AI outperforms physicians for interpreting chest x-rays. Aunt Minnie (2019)

18. Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo,
B., Ball, R., Shpanskaya, K., Seekins, J., Mong, D., Halabi, S., Sandberg, J., Jones, R.,
Larson, D., Langlotz, C., Patel, B., Lungren, M., Ng, A.: CheXpert: a large chest radiograph
dataset with uncertainty labels and expert comparison. Association for the Advancement of
Artificial Intelligence (2019)
19. Clinger, N., Hunter, T., Hillman, B.: Radiology reporting: attitudes of referring physicians.
In: RSNA 1988 Annual Meeting (1988)
20. European Society of Radiology (ESR): Good practice for radiological reporting. Guidelines
from the European Society of Radiology (ESR). Insights Imaging 2, 93–96 (2011)
21. Stanford ML Group: CheXpert: A Large Chest X-Ray Dataset and Competition,
Stanford ML Group. https://stanfordmlgroup.github.io/competitions/chexpert/. Accessed
16 Oct 2019
22. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: Proceedings
of the IEEE Conference on Computer Vision and Pattern Recognition (2017)
23. Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M.,
Adam, H.: MobileNets: efficient convolutional neural networks for mobile vision
applications, arXiv preprint arXiv:1704.04861 (2017)
24. Nain, A.: Beating everything with Depthwise Convolution. https://www.kaggle.com/
aakashnain/beating-everything-with-depthwise-convolution. Accessed 02 July 2019
25. ImageNet, Large Scale Visual Recognition Challenge (ILSVRC). http://www.image-net.org/
challenges/LSVRC/. Accessed 18 May 2019
26. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception
architecture for computer vision. In: The IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), pp. 2818–2826 (2016)
27. Schroff, F., Kalenichenko, D., Philbin, J.: FaceNet: a unified embedding for face recognition
and clustering. In: The IEEE Conference on Computer Vision and Pattern Recognition
(2015)
28. Chen, L.-C., et al.: Encoder-decoder with atrous separable convolution for semantic image
segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV)
(2018)
29. Wang, S.-H., et al.: Classification of Alzheimer’s disease based on eight-layer convolutional
neural network with leaky rectified linear unit and max pooling. J. Med. Syst. 42(5), 85
(2018)
30. Rubin, J., et al.: Large scale automated reading of frontal and lateral chest x-rays using dual
convolutional neural networks. arXiv preprint arXiv:1804.07839 (2018)
A Thermodynamic Assessment of the Cyber
Security Risk in Healthcare Facilities

Filipe Fernandes1, Victor Alves2, Joana Machado3, Filipe Miranda4, Dinis Vicente5, Jorge Ribeiro2, Henrique Vicente6,7, and José Neves1,7(✉)

1 CESPU – Escola Superior de Saúde do Vale do Ave, Vila Nova de Famalicão, Portugal
[email protected]
2 Escola Superior de Tecnologia e Gestão, ARC4DigiT – Applied Research Center for Digital Transformation, Instituto Politécnico de Viana do Castelo, Viana do Castelo, Portugal
[email protected], [email protected]
3 Farmácia de Lamaçães, Braga, Portugal
[email protected]
4 Administração Tributária e Aduaneira, Viana do Castelo, Portugal
[email protected]
5 Escola Superior de Tecnologia e Gestão de Leiria, Instituto Politécnico de Leiria, Leiria, Portugal
[email protected]
6 Departamento de Química, Escola de Ciências e Tecnologia, REQUIMTE/LAQV, Universidade de Évora, Évora, Portugal
[email protected]
7 Centro Algoritmi, Universidade do Minho, Braga, Portugal
[email protected]

Abstract. Over the last decades, a number of guidelines have been proposed for best practices, frameworks, and cyber risk assessment in present computational environments. In order to reduce cyber security vulnerability, this work proposes and characterizes a feasible problem-solving methodology that allows cyber security to be evaluated in terms of an estimate of its entropic state, i.e., a predictive evaluation of its risks and vulnerabilities or, in other words, the cyber security level of such an ecosystem. The analysis and development of such a model is based on a line of logical formalisms for Knowledge Representation and Reasoning, consistent with an Artificial Neural Networks approach to computing, a model that considers the cause behind the action.

Keywords: Entropy · Cyber security · Logic Programming · Knowledge Representation and Reasoning · Artificial Neural Networks

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 452–465, 2020.
https://doi.org/10.1007/978-3-030-45697-9_44
1 Introduction

Security and privacy are becoming more and more challenging issues, despite the
enormous efforts to combat cybercrime and cyber terrorism. Cybersecurity is concerned
with the security of data and the applications and infrastructure used to store, process
and transfer data. It is understood to be a process of protecting data and information by
preventing, detecting, and responding to cybersecurity events. Such events, which
include intentional attacks and accidents, are changes that may affect organizational
processes [1–3]. Since many of these technologies are wireless and therefore depend on custom protocols and encryption platforms, it is all the more pressing that action plans be developed to address responses to potential cyber-attacks on Organizational Services, Infrastructure, and Information and Communication Technology (ICT), i.e., structures that evolve as a set-up in terms of hardware, software, data and the people who use them [1, 4].
Several guidelines have been issued for the design, management and monitoring of Technological Security Infrastructures, including ISO 27001 [5], the COBIT Information and Technology Control Goals [6], and directives such as ITIL (Information Technology Infrastructure Library) [7, 8]. In terms of best practices, frameworks, and cyber risk assessment, one may consider the account of the Financial Industry Regulatory Authority [9], the Cybersecurity Framework of the USA National Institute of Standards and Technology (NIST) [10], the SANS Critical Security Controls for Effective Cyber Defense [11], ISO 27032 (Security Techniques – Cybersecurity Guidelines) [12], or the Cyber Security Risk Assessment (CSRA) [13–18]. From another perspective, a risk assessment framework defines the rules for selecting the persons to be included, the terminology for discussing the risk, the criteria for quantifying, qualifying, and comparing the risk levels, and the documentation required for assessments and follow-up activities.
A framework is designed to establish an objective measure of risk that enables a
business to understand the business risk for critical information and assets both qual-
itatively and quantitatively. Finally, the risk assessment framework provides the nec-
essary tools to make business decisions regarding investment in people, processes and
technologies to bring the risk to an acceptable level.
Two of the most prevalent risk frameworks in use today are OCTAVE (Operationally Critical Threat, Asset, and Vulnerability Evaluation) [19] and the NIST risk assessment [10]. Other frameworks with a substantial following are ISACA’s RISK IT (part of COBIT) [6] and ISO 27005:2008 (part of the ISO 27000 series that includes ISO 27001 and 27002) [5, 12]. All these frameworks take similar approaches but differ in their high-level goals. OCTAVE, NIST, and ISO 270xx focus on security risk assessments, while RISK IT applies to the broader IT risk management space.
The case study considers the use of Logic Programming (LP) for Knowledge Rep-
resentation and Reasoning (KRR) [20], and Artificial Neural Networks (ANNs) as a
natural way of computing [21, 22]. On the other hand, data is embedded and transformed
according to the Laws of Thermodynamics [23, 24] to capture either the key components
affecting cyber security environments or human behavior such as attitudes, motivations

and habits, i.e., the human factors that characterize each CSRA. Finally, in the last
section, the main conclusions are drawn and the future work is outlined.

2 Fundamentals

This paper uses the references to the Open Web Application Security Project (OWASP)
on improving the security of software [25, 26], which evolves according to the steps,
viz.
• Step 1 – Identifying the risk. The tester should identify a security risk that needs to be evaluated, collecting information about the threat agent involved, the attack used, the vulnerability, and the impact of a successful exploit on the business;
• Step 2 – Factors for estimating likelihood. A number of factors can determine the likelihood. The first set is related to the threat agent involved; the goal is to estimate the likelihood of a successful attack from a group of potential attackers. The second set stands for the vulnerability factors and is related to the weakness; the aim is to estimate the likelihood with which the respective vulnerability is discovered and exploited;
• Step 3 – Factors for estimating impact. When considering the effects of a successful
attack, it is important to realize that there are two types of impact. The former stands
for the “technical implications” for the application, the data used and the functions
provided. The latter refers to the “impact” on the company and how it executes the
application;
• Step 4 – Determining the severity of the risk. There are typically two methods, the informal and the repeatable one. With the former, in many environments there is nothing wrong with checking the factors and simply estimating the answers. With the latter, the tester should think through the factors and identify the key “driving” factors that control the outcome; and
• Step 5 – Studying the changes in temperature, pressure, and volume on physical
systems on the macroscopic scale by analyzing the collective motion of their par-
ticles through observation and statistics.
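Steps 2–4 above can be sketched computationally. The snippet below follows the banded 0–9 factor scale and the likelihood × impact severity matrix of the OWASP Risk Rating Methodology [26]; the function names and the sample factor scores are ours, for illustration only.

```python
def level(score):
    """Map an average factor score (0-9) to an OWASP likelihood/impact band."""
    if score < 3:
        return "LOW"
    if score < 6:
        return "MEDIUM"
    return "HIGH"


def severity(likelihood_factors, impact_factors):
    """Combine averaged likelihood and impact bands via the OWASP severity matrix."""
    likelihood = level(sum(likelihood_factors) / len(likelihood_factors))
    impact = level(sum(impact_factors) / len(impact_factors))
    matrix = {
        ("LOW", "LOW"): "Note", ("LOW", "MEDIUM"): "Low", ("LOW", "HIGH"): "Medium",
        ("MEDIUM", "LOW"): "Low", ("MEDIUM", "MEDIUM"): "Medium", ("MEDIUM", "HIGH"): "High",
        ("HIGH", "LOW"): "Medium", ("HIGH", "MEDIUM"): "High", ("HIGH", "HIGH"): "Critical",
    }
    return matrix[(likelihood, impact)]


# Four likelihood factors (threat agent + vulnerability, Step 2) and
# four impact factors (technical + business, Step 3).
print(severity([5, 6, 7, 4], [8, 7, 6, 9]))  # High
```

Averaging the two likelihood factor sets and the two impact factor sets mirrors Steps 2 and 3; the lookup in the matrix is the repeatable variant of Step 4.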
The next lines describe the designed approach, which focuses on Thermodynamics
to describe Knowledge Representation and Reasoning (KRR) practices as a process of
energy devaluation [24, 25]. To understand the basics of the proposed approach, consider the first two laws of Thermodynamics. The first describes energy conservation: for an isolated system, the total amount of energy is constant; energy can be converted, but neither created nor destroyed. The second describes entropy, a property that quantifies the state of order of a system and its development. These characteristics fit our vision when Knowledge Representation and Reasoning (KRR) practices are understood as a process of energy devaluation. Indeed, a data element is understood to be in an entropic state, whose energy can be broken down and used in the sense of devaluation, but never in the sense of destruction, viz.
• exergy, sometimes referred to as available energy or, more specifically, as available work, is the part of the energy that can be arbitrarily used after a transfer process, or, in other words, its entropy. In Fig. 1, this is given by the dark colored areas;
• vagueness, i.e., the corresponding energy values that may or may not have been transferred and consumed. In Fig. 1 these are given by the gray colored areas; and
• anergy, which stands for an energetic potential that has not yet been transferred and consumed, being therefore still available, i.e., all the energy that is not exergy. In Fig. 1 it is given by the white colored areas;
which together designate all possible energy transfer operations as pure energy. Aiming at
quantification of the qualitative information and in order to make the process com-
prehensible, it will be presented graphically (Fig. 2). Taking as an example a group of
four questions that correspond to the Threat Agent Factors Questionnaire-Four–Item
(TAFQ – 4), which was designed to assess the Open Web Application Security Project
(OWASP) [25, 26], viz.
Q1 – How technically skilled is this group of threat agents?
Q2 – How are IT staff qualified to work in security/cybersecurity to gain knowledge
about procedures and experience?
Q3 – What skills do workers have who enter and leave the company? and
Q4 – Which skills qualify IT staff to assess the OWASP in improving the security of
software?

This questionnaire is designed to evaluate IT security based on the work found in [11, 12, 18] and on OWASP's general security considerations, on the assumption that high cyber security scale indexes will lead to positive outcomes and benefits [22]. The scale used is built upon the terms, viz.
Security penetration skills (5), Network and programming skills (4), Advanced computer users
(3), Some technical skills (2), No technical skills at all (1), Some technical skills (2), Advanced
computer users (3), Network and programming skills (4), Security penetration skills (5)

which allows one to capture the entropic variations that occur in the system. Moreover, a neutral term, neither agree nor disagree with the IT security appraisal, is included, which stands for uncertain or vague. The individual's answers are given in relation to the query, viz.
As an individual, how much would you agree with the valuation of each individual answer to the TAFQ-4 referred to above?
In order to create a comprehensible process, the related energy properties are
graphically displayed. For purposes of illustration and simplicity, full calculation
details are provided for TAFQ – 4’s answers. Therefore, consider Table 1 as the result
of an individual answer to TAFQ – 4. For example, the answer to Q1 was Advanced
computer users and Some technical skills, in that order, i.e., it is stated that the person’s

answer was Advanced computer users, but he/she does not reject the possibility that the
answer may be Some technical skills in certain situations. It shows a trend in the
development of the system with an added entropy, i.e., there is a degradation of system
performance. On the other hand, the answer to Q2, Advanced computer users and
Network and programming skills, shows a trend in the development of the system with
a decrease in entropy, i.e., there is an increase in system performance. Figure 1
describes such answers regarding the different forms of energy, i.e., exergy, vagueness
and anergy. Considering that the markers on the axis correspond to one of the possible
scale options, each system behaves better as entropy decreases, which is the case with
respect to Q2, whose entropic states are evaluated as untainted energy (in the form of
Best/Worst case scenarios) as shown in Table 2.
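The trend reading described above (a second scale value lower than the first adds entropy; a higher one removes it) can be captured in a few lines. The helper name and messages are ours, assuming, as in the text, that higher scale values denote greater skill and hence lower entropy:

```python
def entropy_trend(first, second):
    """Compare the two scale values of an answer (e.g. 3 then 2 for Q1)."""
    if second > first:
        return "entropy decreases (system performance improves)"
    if second < first:
        return "entropy increases (system performance degrades)"
    return "no entropic change"


# Q1: Advanced computer users (3) -> Some technical skills (2)
print(entropy_trend(3, 2))  # entropy increases (system performance degrades)
# Q2: Advanced computer users (3) -> Network and programming skills (4)
print(entropy_trend(3, 4))  # entropy decreases (system performance improves)
```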

Table 1. One-person answers to TAFQ – 4 questions.

Scale
Questions
(5) (4) (3) (2) (1) (2) (3) (4) (5) vagueness
Q1 × ×
Q2 × ×
Q3 ×
Q4 ×

Fig. 1. Estimation of the energy consumed in relation to a person’s answers to TAFQ – 4 questions.
Fig. 2. A graphical representation of the energy consumed in terms of a single answer to the TAFQ – 4 questions.

The data collected above may now be structured in terms of the extent of predicate
threat agent factors questionnaire (tafq – 4) in the form, viz.

tafq4: EXergy, VAgueness, Cyber-Security-Risk-Assessment, Quality-of-Information → {True, False}

a construct that speaks for itself, whose extent and formal description follows (Table 3
and Program 1).
{

}
Program 1. The extent of the tafq – 4 predicate for the best case scenario.

The evaluation of CSRA and QoI for the different items that make the TAFQ – 4
are now given in the form, viz.
• CSRA is computed as CSRA = √(1 − ES²) (Fig. 3), where ES stands for the exergy that may have been consumed in the Best-case scenario (i.e., ES = exergy + vagueness), a value that ranges in the interval 0…1.

Table 2. Evaluation of the Best and Worst-case scenarios for the TAFQ – 4 questions regarding
their entropic states.

Table 3. The extent of the tafq – 4’s predicate from a person’s answers to TAFQ – 4.

Questionnaire  EX(BCS)  VA(BCS)  CSRA(BCS)  QoI(BCS)  EX(WCS)  VA(WCS)  CSRA(WCS)  QoI(WCS)
TAFQ – 4       0.29     0.36     0.76       0.35      0.66     0        0.75       0.33

CSRA = √(1 − (0.29 + 0.36)²) = 0.76

• QoI is evaluated in the form, viz.

Fig. 3. CSRA evaluation.

QoI = 1 − (exergy + vagueness) / interval length (= 1) = 1 − (0.29 + 0.36) = 0.35
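Both measures can be reproduced directly from the exergy and vagueness values. A minimal sketch (the function names are ours) using the TAFQ – 4 best-case values above:

```python
import math


def csra(exergy, vagueness):
    """CSRA = sqrt(1 - ES^2), with ES the energy consumed in the best case."""
    es = exergy + vagueness
    return math.sqrt(1 - es ** 2)


def qoi(exergy, vagueness, interval_length=1.0):
    """Quality of Information: the share of the scale left unconsumed."""
    return 1 - (exergy + vagueness) / interval_length


print(round(csra(0.29, 0.36), 2))  # 0.76
print(round(qoi(0.29, 0.36), 2))   # 0.35
```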

3 Case Study

To complement the process of data collection, and to add the possibility of assessing
CSRA factors on a more comprehensive enlargement, it was considered three more
questionnaires, namely the one entitled as the Cyber Security Questionnaire-Four-Item
(CSQ – 4), which was designed to assess individual differences in the proneness to take
risks when using the information technology infrastructure. It is given in the form, viz.
Q1 – How may an exploit be detected?
Q2 – How does the organization take countermeasures to block these attempts?
Q3 – How aware are you of the difference between symmetric and asymmetric encryption? and
Q4 – How aware are you of how to protect your home Wireless Access Point?

For this questionnaire the answer scale was confined to the following options, viz.
Very effectively (4), Effectively (3), Ineffectively (2), Not effectively at all (1), Ineffectively (2), Effectively (3), Very effectively (4)

Moreover, a neutral term, neither agree nor disagree with the IT security appraisal, is included, which stands for uncertain or vague. The individual's answers are given in relation to the query, viz.
As an individual, how much would you agree with the valuation of each individual answer to the CSQ-4 referred to above?
Another questionnaire is entitled as Self-Assessment and Identity Questionnaire-
Three-Item (SAIQ – 3), and is set as follows, viz.
Q1 – How much data could be disclosed and how sensitive is it?
Q2 – How conscientious are you about preventing unauthorized access and use? and
Q3 – How many encrypted backups have you made to store the information?

Finally, the Business Impact Factors Questionnaire-Six-Item (BIFQ – 6) is set as follows, viz.
Q1 – How much financial damage will result from an exploit?
Q2 – Would an exploit result in reputation damage that would harm the business?
Q3 – How much exposure does non-compliance introduce?
Q4 – How much personally identifiable information could be disclosed?
Q5 – How often do you measure annual losses from fraudulent business transactions? and
Q6 – How many solutions do you provide to your customers to help them avoid being victimized by fraud?

For these two questionnaires the answer scale was confined to the following
options, viz.
Very much indeed (4), much (3), not too much (2), not much at all (1), not too much (2), much
(3), Very much indeed (4)

Moreover, a neutral term, neither agree nor disagree with the IT security appraisal, is included, which stands for uncertain or vague. The individual's answers are given in relation to the query, viz.
As an individual, how much would you agree with the valuation of each individual answer to the SAIQ-3 and BIFQ-6 referred to above?
Table 4. Individual answers to the CSQ – 4, SAIQ – 3 and TAFQ – 4 questionnaires.

Scale
Questionnaire Questions
(4) (3) (2) (1) (2) (3) (4) vagueness
Q1 ×
CSQ – 4
Q2 × ×
Q3 ×
Q4 ×
Q1 ×
SAIQ – 3 Q2 × ×
Q3 ×
Q1 × ×
Q2 × ×
BIFQ – 6 Q3 ×
Q4 ×
Q5 × ×
Q6 ×


Table 5. The threat agent factors questionnaire (tafq – 4), cyber security questionnaire (csq –
4), self-assessment and identity questionnaire (saiq – 3) and business impact factors
questionnaire (bifq – 6) predicates’ scopes obtained according to the individual answers to the
TAFQ – 4, CSQ – 4, SAIQ – 3, and BIFQ – 6 questionnaires.
Questionnaire Exergy Vague CSRA QoI Exergy Vague CSRA QoI
BCS BCS BCS BCS WCS WCS WCS WCS
TAFQ – 4 0.29 0.36 0.76 0.35 0.66 0 0.75 0.33
CSQ – 4 0.28 0.42 0.71 0.30 0.75 0 0.66 0.25
SAIQ – 3 0.22 0.37 0.81 0.41 0.65 0 0.47 0.12
BIFQ – 6 0.34 0.12 0.93 0.54 0.78 0 0.68 0.27

To complement Table 1, Table 4 shows the answers of a single individual to the CSQ – 4, SAIQ – 3 and BIFQ – 6 questionnaires. The computational process for each questionnaire is the same as the one used for TAFQ – 4. Table 5 presents the predicate extents obtained from the answers to each questionnaire.

4 Computational Make-Up

The following describes a mathematical logic program that, through insights subject to formal proof, allows one to understand and even adapt the actions and attitudes of individuals or groups, and through them those of the organization as a whole, i.e., to assess, by logical inference, the impact on the functioning and performance of the organization. It is a system that is not programmed for specific tasks; rather, it is told what it needs to know and is expected to infer the rest. It is now possible to use this data to train an Artificial Neural Network (Fig. 4) in order to obtain on-the-fly assessments of cyber security [22, 23].
{

Program 2. The make-up of the logic program or knowledge base for a user answer.

where ¬ denotes strong negation and not stands for negation-by-failure. It is now possible to use this data to train an Artificial Neural Network (ANN) [22, 23] (Fig. 4) in order to obtain on the fly an evaluation of the Cyber Security Risk Assessment (CSRA), plus a measure of its sustainability; indeed, the ANN approach to data processing enables one to process the data in relation to a system context. For example, for an enterprise with 30 (thirty) users, the training set may be obtained by proving the theorem, viz.
∀(EX1, VA1, CSRA1, QoI1, …, EX4, VA4, CSRA4, QoI4),
(tafq4(EX1, VA1, CSRA1, QoI1), …, bifq6(EX4, VA4, CSRA4, QoI4))

in every way possible, i.e., generating all the different possible sequences that combine the extents of the predicates tafq – 4, csq – 4, saiq – 3 and bifq – 6, viz.

{{tafq4(EX1, VA1, CSRA1, QoI1), csq4(EX2, VA2, CSRA2, QoI2), saiq3(EX3, VA3, CSRA3, QoI3), bifq6(EX4, VA4, CSRA4, QoI4)}, …} ≈
{{tafq4(0.29, 0.36, 0.76, 0.35), csq4(0.28, 0.42, 0.71, 0.30), saiq3(0.22, 0.37, 0.81, 0.41), bifq6(0.34, 0.12, 0.93, 0.54)}, …}
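The generation of the training set can be sketched as a Cartesian product over the per-user extents of the four predicates, with the averaged CSRA and QoI (computed as in the output expressions below) as targets. The first tuples are the single-user extents from Table 5; the second user's values, and the function names, are ours for illustration only.

```python
from itertools import product

# One (EX, VA, CSRA, QoI) extent per user, per questionnaire.
extents = {
    "tafq4": [(0.29, 0.36, 0.76, 0.35), (0.31, 0.30, 0.79, 0.39)],
    "csq4":  [(0.28, 0.42, 0.71, 0.30), (0.25, 0.40, 0.76, 0.35)],
    "saiq3": [(0.22, 0.37, 0.81, 0.41), (0.20, 0.35, 0.84, 0.45)],
    "bifq6": [(0.34, 0.12, 0.93, 0.54), (0.30, 0.15, 0.89, 0.55)],
}


def training_rows(extents):
    names = ["tafq4", "csq4", "saiq3", "bifq6"]
    for combo in product(*(extents[n] for n in names)):
        features = [v for tup in combo for v in tup]        # 16 ANN inputs
        csra = round(sum(tup[2] for tup in combo) / 4, 2)   # target: global CSRA
        qoi = round(sum(tup[3] for tup in combo) / 4, 2)    # target: sustainability
        yield features, (csra, qoi)


rows = list(training_rows(extents))
print(len(rows))    # 16 combinations (2 users ** 4 questionnaires)
print(rows[0][1])   # (0.8, 0.4), matching the single-user values in the text
```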

In terms of the ANN output, the evaluation of the CSRA is considered, i.e., its implication for system performance, which may be weighed up in the form, viz.

Fig. 4. An abstract view of the topology of the ANN for assessing Cyber Security Risk (pre-processing, input, hidden and output layers; inputs fed by the tafq – 4, csq – 4, saiq – 3 and bifq – 6 extents; outputs: CSRA = 0.80 and CSRA sustainability ⇔ QoI = 0.40).

   
{((CSRAtafq4 + CSRAcsq4 + CSRAsaiq3 + CSRAbifq6)/4), …} ≈ {{(0.76 + 0.71 + 0.81 + 0.93)/4 = 0.80}, …}

and, viz.

   
{((QoItafq4 + QoIcsq4 + QoIsaiq3 + QoIbifq6)/4), …} ≈ {{(0.35 + 0.30 + 0.41 + 0.54)/4 = 0.40}, …}

5 Conclusions and Future Work

This study focused on the human factors that characterize each CSRA and how their
perception of such factors contributes to CSRA vulnerability. As future work, and
considering how social factors may shape the cyber security perception of the envi-
ronment, we intend to look at ways to figure out the environment according to the cyber
security modes. We will also consider the implementation of new questionnaires and
the process of collecting data from a wider audience.

Acknowledgments. This work has been supported by FCT – Fundação para a Ciência e Tec-
nologia within the R&D Units Project Scope: UIDB/00319/2020.

References
1. Zhang, K., Ni, J., Yang, K., Liang, X., Ren, J., Shen, X.: Security and privacy in smart city
applications: challenges and solutions. IEEE Commun. Mag. 55(1), 122–129 (2017)
2. Khatoun, R., Zeadally, S.: Cybersecurity and privacy solutions in smart cities. IEEE
Commun. Mag. 55(3), 51–59 (2017)
3. Gaur, A., Scotney, B., Parr, G., McClean, S.: Smart city architecture and its applications
based on IoT. Procedia Comput. Sci. 52, 1089–1094 (2015)
4. Ijaz, S., Shah, M., Khan, A., Mansoor, A.: Smart cities: a survey on security concerns. Int.
J. Adv. Comput. Sci. Appl. 7(2), 612–625 (2016)
5. ISO/IEC 27001 Information security management. https://www.iso.org/isoiec-27001-
information-security.html. Accessed 19 Nov 2019
6. COBIT: Information Systems Audit and Control Association, Control Objectives for
Information and Related Technology, 5th edn. IT Governance Institute (2019)
7. OGC: Official Introduction to the ITIL Service Lifecycle, Stationery Office, Office of
Government Commerce. https://www.itgovernance.co.uk. Accessed 23 Nov 2019
8. Armin, A., Junaibi, R., Aung, Z., Woon, W., Omar, M.: Cybersecurity for smart cities: a
brief review. Lecture Notes in Computer Science, vol. 10097, pp. 22–30 (2017)
9. Financial Industry Regulatory Authority: Financial Industry Regulatory Practices. https://
www.finra.org/file/report-cybersecurity-practices. Accessed 22 Nov 2019
10. National Institute of Standards and Technology: Cybersecurity Framework. https://www.
nist.gov/sites/default/files/documents/cyberframework/cybersecurity-framework-021214.pdf
. Accessed 22 Nov 2019
11. SANS Institute: Critical Security Controls for Effective Cyber Defense. https://www.sans.
org/critical-security-controls. Accessed 22 Nov 2019
12. ISO 27032 - Information technology – Security techniques – Guidelines for cybersecurity.
https://www.iso.org/standard/44375.html. Accessed 22 Nov 2019
13. Liu, C., Tan, C.-K., Fang, Y.-S., Lok, T.-S.: The security risk assessment methodology.
Procedia Eng. 43, 600–609 (2012)
14. Lanz, J.: Conducting information technology risk assessments. CPA J. 85(5), 6–9 (2015)
15. Tymchuk, O., Iepik, M., Sivyakov, A.: Information security risk assessment model based on
computing with words. MENDEL Soft Comput. J. 23, 119–124 (2017)
16. Amini, A., Norziana, J.: A comprehensive review of existing risk assessment models in
cloud computing. J. Phys: Conf. Ser. 1018, 012004 (2018)
17. European Union Agency for Network and Information Security (ENISA). https://www.
smesec.eu. Accessed 22 Nov 2019
18. Ribeiro, J., Alves, V., Vicente, H., Neves, J.: Planning, managing and monitoring
technological security infrastructures. In: Machado, J., Soares, F., Veiga, G. (eds.)
Innovation, Engineering and Entrepreneurship. Lecture Notes in Electrical Engineering,
vol. 505, pp. 10–16. Springer, Cham (2019)
19. Caralli, R.A., Stevens, J.F., Young, L.R., Wilson, W.R.: Introducing OCTAVE Allegro:
improving the information security risk assessment process. Technical report CMU.
Software Engineering Institute (2007)
20. Neves, J.: A logic interpreter to handle time and negation in logic databases. In: Muller, R.,
Pottmyer, J. (eds.) Proceedings of the 1984 Annual Conference of the ACM on the 5th
Generation Challenge, pp. 50–54. Association for Computing Machinery, New York (1984)
21. Cortez, P., Rocha, M., Neves, J.: Evolving time series forecasting ARMA models.
J. Heuristics 10, 415–429 (2004)
22. Fernández-Delgado, M., Cernadas, E., Barro, S., Ribeiro, J., Neves, J.: Direct Kernel
Perceptron (DKP): ultra-fast kernel ELM-based classification with non-iterative closed-form
weight calculation. J. Neural Netw. 50, 60–71 (2014)
23. Wenterodt, T., Herwig, H.: The entropic potential concept: a new way to look at energy
transfer operations. Entropy 16, 2071–2084 (2014)
24. Neves, J., Maia, N., Marreiros, G., Neves, M., Fernandes, A., Ribeiro, J., Araújo, I., Araújo,
N., Ávidos, L., Ferraz, F., Capita, A., Lori, N., Alves, V., Vicente, N.: Entropy and
organizational performance. In: Pérez García, H., Sánchez González, L., Castejón Limas,
M., Quintián Pardo, H., Corchado Rodríguez, E. (eds.) Hybrid Artificial Intelligent Systems.
Lecture Notes in Computer Science, vol. 11734, pp. 206–217. Springer, Cham (2019)
25. OWASP Open Cyber Security Framework Project. https://www.owasp.org/index.php/
OWASP_Open_Cyber_Security_Framework_Project. Accessed 21 Nov 2019
26. OWASP Risk Rating Methodology. https://www.owasp.org/index.php/OWASP_Risk_
Rating_Methodology. Accessed 21 Nov 2019
How to Assess the Acceptance of an Electronic
Health Record System?

Catarina Fernandes, Filipe Portela(✉), Manuel Filipe Santos, José Machado, and António Abelha

Algoritmi Research Center, University of Minho, Guimarães, Portugal
{cfp,mfs}@dsi.uminho.pt, {jmac,abelha}@di.uminho.pt

Abstract. Being able to access a patient’s clinical data in due time is critical to
any medical setting. Clinical data is very diverse both in content and in terms of
which system produces it. The Electronic Health Record (EHR) aggregates a
patient’s clinical data and makes it available across different systems. Consid-
ering that user’s resistance is a critical factor in system implementation failure,
the understanding of user behavior remains a relevant object of investigation.
The purpose of this paper is to outline how we can assess the technology
acceptance of an EHR using the Technology Acceptance Model 3 (TAM3) and
the Delphi methodology. An assessment model is proposed in which findings
are based on the results of a questionnaire answered by health professionals
whose activities are supported by the EHR technology. In the case study sim-
ulated in this paper, the results obtained showed an average of 3 points and
modes of 4 and 5, which translates to a good level of acceptance.

Keywords: Technology Acceptance Model · Technology assessment · Electronic Health Record · Intensive Medicine

1 Introduction

Health information technologies, such as the Electronic Health Record (EHR), and
information management are fundamental in transforming the health care industry [4].
The flow of information in any hospital environment can be characterized as highly
complex and heterogeneous. Its availability across systems in due time is critical to the
success of clinical processes. Thus, the implementation and use of information systems
that aggregate patient data can facilitate the work of health professionals and maximize
their productivity. However, this is only possible if the system is fully accepted by its
users.
This paper aims to outline how the level of acceptance of an EHR can be assessed
through the combination of the Technology Acceptance Model (TAM) and the Delphi
methodology. A simulation was performed through the application of these method-
ologies in a case study that evaluates the level of acceptance of the EHR used in the
Intensive Care Unit (ICU) of Centro Hospitalar do Porto (CHP). The assessment is
based on the application of a questionnaire and subsequent statistical analysis of the
results. The results were produced by an algorithm that generated responses to the
questionnaire according to the characteristics of the questions. The simulation was
designed to represent various possible results and to outline how its analysis can be
performed. The use of a simulated environment also ensured data integrity and
anonymity. The analysis process was optimized to facilitate its replication in a realistic
scenario, where the questionnaire should be answered by health professionals whose
activities are supported by the EHR technology. The replication of the proposed
assessment model will make it possible to evaluate the level of user acceptance, to
identify the factors that influence health professionals’ resistance to the EHR and to put
forward a set of improvements which will increase user acceptance.

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
C. Fernandes et al., How to Assess the Acceptance of an Electronic Health Record System?
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 466–475, 2020.
https://doi.org/10.1007/978-3-030-45697-9_45
This paper is composed of six sections. The first section introduces the study. The
second section defines relevant concepts. The third section presents the proposed
assessment model. The fourth section describes the application of the model in a case
study, and the fifth presents its results. Finally, the conclusions are drawn in the sixth section.

2 Background
2.1 Intensive Medicine
Intensive Medicine is a multidisciplinary field in health care with a focus on the
prevention, diagnosis and treatment of patients with dysfunction or failure of one or more
organs, particularly respiratory and cardiovascular systems [1, 9]. These patients are
admitted to Intensive Care Units (ICU), which are specially prepared to continuously
monitor vital functions and offer mechanical or pharmacological support [1, 3]. Due to
the high complexity and severity of the cases handled in the ICU, it is essential that
health professionals make the right decisions in a timely manner. However, the
decision-making process can be hindered by the extensive amount of data generated
across different systems and hospital services.

2.2 Electronic Health Record


The documentation of clinical information regarding a patient is one of the major
day-to-day activities performed by a health professional. This information can include
biometrical data, prescriptions, imaging and lab test results, among others [7]. The
health record of a patient includes records of all their encounters with all caregivers
across all health providers linked to the health records system. When this data is
gathered electronically, it is designated as Electronic Health Record (EHR) [14].
The EHR is commonly used in health care to aggregate all clinical information and
make it available across different services and units [7]. Furthermore, the EHR directly
impacts the work performance of health professionals as it is the main tool used by
them in the decision-making process.

2.3 Technology Acceptance Model and Delphi


The Technology Acceptance Model (TAM) is frequently used to assess the user’s
acceptance of a specific technology. The model aims to explain the impact of external
factors in the user’s behaviors and intentions by demonstrating the relationship between
two constructs: Perceived Usefulness (PU) and Perceived Ease of Use (PEOU) [5]. PU
is “the degree to which a person believes that using a particular system would enhance
his or her job performance”, while PEOU can be defined as “the degree to which a
person believes that using a particular system would be free of effort” [5]. A second
version of this model was proposed to further specify the external variables that
determine the PU. These can be categorized in terms of social influence (subjective
norms) and cognitive instrumental processes (image, job relevance, output quality and
results demonstrability) [17]. Another version was proposed in the same year that
defines the variables that determine the PEOU. These can be divided in anchors
(computer self-efficacy, perceptions of external control, computer anxiety and com-
puter playfulness) and adjustments (perceived enjoyment and objective usability) [15].
More recently, the Technology Acceptance Model 3 (TAM 3) combines all the vari-
ables that determine both constructs (PU and PEOU) and presents new relationships
regarding user experience [16]. TAM 3 is comprised of four constructs: PU, PEOU,
Behavioral Intention (BI) and Use Behavior (UB).
The Delphi methodology is an iterative process of application of questionnaires
used to obtain a consensus regarding a specific matter [11]. This method consists of
collecting and analyzing the results of each questionnaire and, subsequently, creating a
new round of questionnaires based on those results. The process ends once all parties
come to a satisfactory agreement. The participant pool should include field experts with
similar cultural and cognitive levels while also representing different points of view
within the study area [18]. By using this methodology, we can determine and predict a
group’s behaviors, needs and priorities [11].
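The iterative round structure described above can be sketched in a few lines of Python. This is only an illustration: the consensus criterion (a standard-deviation threshold) and the `collect_round` callback are assumptions made for the sketch, not part of the Delphi literature or of this study's instrument.

```python
from statistics import stdev

def consensus_reached(answers, max_std=0.75):
    """Treat a round as consensual when the spread of the panel's Likert
    answers falls below a chosen threshold (an illustrative assumption)."""
    return stdev(answers) <= max_std

def run_delphi(collect_round, max_rounds=4):
    """Apply questionnaires iteratively: each round's results feed the
    next round until the panel converges or the rounds run out."""
    feedback = None
    for round_no in range(1, max_rounds + 1):
        answers = collect_round(round_no, feedback)
        if consensus_reached(answers):
            return round_no, answers
        # Feed the group's central tendency back to the experts.
        feedback = sum(answers) / len(answers)
    return max_rounds, answers

# Example: a hypothetical panel that converges on its second round.
rounds = {1: [1, 3, 5, 2, 4], 2: [4, 4, 5, 4, 4]}
stop_round, final = run_delphi(lambda n, fb: rounds[n])
```

With the dispersed first-round answers the loop continues, and the tighter second round stops it, mirroring how rounds end "once all parties come to a satisfactory agreement".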
The assessment of TAM 3 constructs can be achieved through the application of
questionnaires. The combination of this model (quantitative method) with the Delphi
methodology (qualitative method) makes it possible to evaluate the acceptance of a certain
technology while reducing the level of uncertainty and ensuring the presence of
complementary views, which will increase the quality of the results [11].

2.4 Related Work


TAM has been widely used to assess the acceptance of information systems and to
understand and explain user behavior [13]. Some of the most relevant works in the
health care field are presented next.
An assessment of the INTCare system, used in the ICU of CHP, was performed
using the constructs proposed by TAM and a questionnaire-based approach guided by
the Delphi methodology [10]. Through the best (PEOU) and worst (UB) acceptance
results, the study showed that health professionals were satisfied with the technology
implemented in terms of innovation and functionality but complained about the real-
time performance and responsiveness of equipment. Thus, the successful combination
of TAM constructs and the Delphi methodology made it possible to identify positive and negative
aspects of the system and suggest future improvements.
TAM was also applied on the assessment of the AIDA system in the Pathologic
Anatomy Service of Centro Hospitalar Alto Ave [8]. The results showed that the lower
level of user satisfaction regarding some aspects of the system also lowered their
intention to use it. The study contributed to a better understanding of user perception
about using a specific system and suggests that this type of analysis should be
performed in other health care systems.

3 Assessment Model

To assess the level of acceptance of an EHR through the combination of TAM and
Delphi, a questionnaire must be designed based on both methodologies. The first step is
to structure the questionnaire in sections. Table 1 shows how sections should be
structured, the motivation behind each group of items and how these should be
evaluated.
Table 1. Questionnaire structure.

Section | Goal | Evaluation
Level of technological experience | Understand system user types and assess their level of experience regarding computer use in day-to-day activities | Answer options are dependent on the type of question
Overall system functioning | Provide an overall view of the system by assessing global characteristics and functionalities | Likert scale
Technical and functional characteristics | Evaluate technical and functional characteristics of specific system panels/sections | Likert scale
Additional comments | Promote further comments from the participants | Free text field

A 5-point Likert scale [6] is applied for items designed to evaluate the TAM
constructs PU, PEOU, BI and UB. This scale allows the participant to specify their
level of agreement with a certain statement [12]. The use of a short 5-point scale, with
two negative values (1, 2), two positive values (4, 5) and a neutral value (3), narrows
the results, avoiding their dispersion and reducing inaccuracy [10].
Considering the structure proposed, a sample of items for each questionnaire sec-
tion is offered in Table 2.

Table 2. Example of items per section.

Section | Item | Answer options
Level of technological experience | How often do you require technical support while using a computer? | 1 – Always; 2 – Often; 3 – Sometimes; 4 – Rarely; 5 – Never
Overall system functioning | Meets your needs with speed and quality? | 1 – Strongly disagree; 2 – Disagree; 3 – Neither agree nor disagree; 4 – Agree; 5 – Strongly agree
Technical and functional characteristics | Does the image enhance the registration/consultation of procedures? | 1 – Strongly disagree; 2 – Disagree; 3 – Neither agree nor disagree; 4 – Agree; 5 – Strongly agree
Additional comments | In your opinion, what are the major issues in the system? | Free text field

To ensure that each TAM construct is evaluated by at least one item, it is necessary
to show the relationship between questions and constructs. Table 3 shows an example
of how these relations can be represented through a matrix. Each table row should be
read as “Item A evaluates constructs PU and PEOU”.

Table 3. Example of matrix between items and TAM constructs.


Item PU PEOU BI UB
Item A X X – –
Item B X X X –
Item C – X – X
Number of items 2 3 1 1
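A matrix like this maps naturally onto a dictionary from items to construct sets, which also makes the later per-construct aggregation of Likert answers mechanical. The sketch below is illustrative only: the item names come from Table 3, the answers are invented, and pooling each item's answers into every construct it evaluates is one reasonable reading of the matrix, not a rule stated by the authors.

```python
from statistics import mean, mode

# Item-to-construct matrix from Table 3 (each "X" becomes a membership).
ITEM_CONSTRUCTS = {
    "Item A": {"PU", "PEOU"},
    "Item B": {"PU", "PEOU", "BI"},
    "Item C": {"PEOU", "UB"},
}

def scores_by_construct(answers):
    """answers: {item: [Likert values 1-5]}. Each item's answers count
    toward every construct that the item evaluates."""
    pooled = {}
    for item, values in answers.items():
        for construct in ITEM_CONSTRUCTS[item]:
            pooled.setdefault(construct, []).extend(values)
    return {c: (round(mean(v), 2), mode(v)) for c, v in pooled.items()}

# Hypothetical answers from three participants.
answers = {"Item A": [4, 5, 3], "Item B": [2, 4, 4], "Item C": [5, 5, 1]}
stats = scores_by_construct(answers)  # {construct: (mean, mode)}
```

The same table's bottom row (items per construct) falls out as `len(ITEM_CONSTRUCTS)` filtered by membership, so the coverage check ("each construct evaluated by at least one item") is a one-line assertion.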

After obtaining answers to the questionnaire, the results must be analyzed. The
analysis process is divided into two phases: technological experience analysis and
univariate statistical analysis. The first aims to better understand system user types
regarding experience in technology. The second phase consists of several statistical
analyses by participant, item, TAM construct and questionnaire section. Table 4 shows
examples of indicators and metrics that can be used in the analysis.

Table 4. Indicators and metrics by analysis phase.

Phase | Indicators | Metrics
Technological experience analysis | Percentage of autonomous users (participants who never, rarely or only sometimes require technical support while using a computer) | Percentage
Univariate statistical analysis | Level of acceptance by participant, item, section and construct | Mean, mode
Univariate statistical analysis | Level of answer dispersion by participant and item | Standard deviation
Univariate statistical analysis | Level of agreement by participant and item | Correlation coefficient

The coefficient selected to analyze the level of agreement between answers was
Kendall’s tau [2]. This is a non-parametric correlation coefficient which evaluates the
correlation between two ordinal variables. Negative values (closer to −1) represent a
greater divergence between answers while positive values (closer to 1) mean a greater
level of agreement.
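The coefficient can be computed directly from concordant and discordant pairs. The minimal sketch below implements the simplest variant (tau-a, with no tie correction), unlike library implementations such as scipy.stats.kendalltau, which apply tie handling; it is offered only to make the −1 to 1 interpretation above concrete.

```python
def kendall_tau(x, y):
    """Kendall's tau-a over two equally long answer vectors:
    (concordant - discordant) / total pairs. Tied pairs count as
    neither concordant nor discordant."""
    assert len(x) == len(y)
    concordant = discordant = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Two participants whose answers rank the items identically agree fully.
assert kendall_tau([1, 2, 3, 4], [2, 3, 4, 5]) == 1.0
# Fully reversed answers give the maximal divergence.
assert kendall_tau([1, 2, 3, 4], [4, 3, 2, 1]) == -1.0
```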
The application of TAM to assess the EHR can also result in a SWOT analysis.
This technique can be used to help identify strengths and weaknesses of the EHR
system, factors/threats that influence user resistance and, subsequently, to put forward a
set of improvements/opportunities which will increase acceptance.

4 Case Study

The evaluation model presented in the previous section was applied to a case study.
The goal was to assess the level of acceptance of the EHR used in the ICU of CHP.
The questionnaire created is composed of 41 items divided into 12 sections. The
first section assesses the level of technological experience of the participants. Section 2
evaluates global characteristics and functionalities of the system. Sections 3 through 11
assess functional and technical characteristics of different panels within the EHR
system, such as: Header, Explorer, Discharge Notes, Problems, Daily Round Checklist,
Procedures, Requests, Appointments and Clinical Research. These sections are eval-
uated by a 5-point Likert scale. Finally, a free text field was provided in the last section
to accommodate additional comments. The relationships between items and TAM
constructs are presented in Table 5.

Table 5. Relationship between items and TAM constructs.

Item | PU | PEOU | BI | UB
1. Level of Technological Experience
1.1. What percentage of your daily work entails using a computer? | – | – | – | –
1.2. How often do you require technical support while using a computer? | – | – | – | –
1.3. Which activities require you to use a computer most often? | – | – | – | –
2. Overall System Functioning
2.1. Allows to efficiently consult information? | X | X | – | X
2.2. Allows to efficiently register information? | X | X | – | X
2.3. Meets your needs with speed and quality? | X | – | – | X
2.4. Allows easy and fast access to other platforms (e.g. ALERT)? | X | X | – | X
2.5. Allows secure authentication in the system? | X | – | – | –
2.6. Is the interface appealing? | – | X | X | –
2.7. Is the information presented enough for decision-making? | X | – | – | –
2.8. Is the information adequately placed in the screen? | – | X | – | X
2.9. Easy to use? | – | X | X | X
2.10. Increases productivity? | X | X | X | X
2.11. Facilitates decision-making? | X | X | X | X
2.12. Are section/panel titles correct? | X | X | – | –
3. Header
3.1. Is MCDT information (upper left corner) relevant? | X | – | – | –
3.2. Is patient data enough? | X | – | – | –
3.3. Are the hospitalization details (upper right corner) enough? | X | – | – | –
3.4. Does the information layout facilitate system use? | – | X | – | X
3.5. Is the position of the “Sair” and “Actualizar” buttons adequate? | – | X | – | –
3.6. Are all tabs (Alertas, Mensagens, etc.) necessary and relevant? | X | – | – | –
4. Explorer
4.1. Allows to efficiently consult information? | X | X | – | X
4.2. Is all information necessary and relevant? | X | – | – | –
5. Discharge Notes
5.1. Allows to efficiently register information? | X | X | – | X
5.2. Allows to efficiently consult information? | X | X | – | X
5.3. Is the number of fields adequate for decision-making? | X | – | – | –
5.4. Are all fields necessary and relevant? | X | – | – | –
6. Problems
6.1. Allows to efficiently register information? | X | X | – | X
6.2. Is the number of fields adequate for decision-making? | X | – | – | –
6.3. Are all fields necessary and relevant? | X | – | – | –
7. Daily Round Checklist
7.1. Allows to efficiently register information? | X | X | – | X
7.2. Is the number of fields adequate for decision-making? | X | – | – | –
7.3. Are all fields necessary and relevant? | X | – | – | –
8. Procedures
8.1. Allows to efficiently consult information? | X | X | – | X
8.2. Allows to efficiently register information? | X | X | – | X
8.3. Does the image enhance the registration/consultation of procedures? | X | X | X | –
8.4. Does the information layout facilitate decision-making? | – | X | – | X
9. Requests
9.1. Allows to efficiently register information? | X | X | – | X
10. Appointments
10.1. Allows to efficiently register information? | X | X | – | X
11. Clinical Research
11.1. Allows to efficiently register information? | X | X | – | X
12. Closing Remarks
12.1. What are your main issues with the system? What improvements would you like to see implemented? | – | – | – | –
Number of items | 31 | 23 | 5 | 20
Percentage of total (%) | 75.6 | 56.1 | 12.2 | 48.8

5 Results

After generating 100 answers to the questionnaire through an algorithm that produced
responses according to the characteristics of each question, the results were analyzed
in two phases: technological experience analysis and univariate statistical analysis.
The first aims to understand the level of experience of the participants regarding the
use of a computer in daily activities. An example is presented in Table 6. The
percentage of autonomous users in this case is 68%, which means the participants had
an acceptable level of experience with the use of a computer. Thus, any issues with the
system would not be the result of technological inexperience on the part of its users.

Table 6. Example of level of technological experience results.

Item: 1.2. How often do you require technical support while using a computer?
Answer | Percentage
Always | 16%
Often | 16%
Sometimes | 23%
Rarely | 22%
Never | 23%
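A response generator of this kind can be reproduced in a few lines of Python. The uniform distributions, seed and Likert item count below are illustrative assumptions, not the algorithm the authors used; the autonomous-user indicator then follows directly from the generated answers.

```python
import random

SUPPORT_SCALE = ["Always", "Often", "Sometimes", "Rarely", "Never"]

def simulate_responses(n=100, seed=42):
    """Generate n questionnaire answers: one categorical technical-
    support item plus a block of 5-point Likert items (38 here,
    an illustrative count)."""
    rng = random.Random(seed)
    return [
        {
            "support": rng.choice(SUPPORT_SCALE),
            "likert": [rng.randint(1, 5) for _ in range(38)],
        }
        for _ in range(n)
    ]

def autonomous_share(responses):
    """A participant counts as 'autonomous' when they never, rarely or
    only sometimes require technical support (the Table 4 indicator)."""
    autonomous = {"Sometimes", "Rarely", "Never"}
    return sum(r["support"] in autonomous for r in responses) / len(responses)

responses = simulate_responses()
share = autonomous_share(responses)
```

On real data the same `autonomous_share` computation would yield the 68% reported above; here it simply reflects whatever the seeded generator produced.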

In the second phase of the analysis, different statistical measures were used: mean,
mode, standard deviation and correlation coefficient. A global analysis was performed
by participant and by item. Both analyses showed similar results, with an overall
average of 3 points and standard deviation values close to 0. The correlation values in
this analysis were mostly positive, which indicates a good level of agreement among
the participants. Results were also analyzed by construct and section. The global results
from both analyses are aggregated in Table 7. It can be observed that:
• Mean values are close to 3 points;
• Mode values are mostly 4 and 5 points;
• All TAM constructs have similar results, but the best evaluated was BI, with a mean
of 3.08 and a mode of 5;
• Section 4 obtained the best results, with a mean of 3.18 and a mode of 5;
• Section 5 had the lowest level of acceptance, with a mean of 2.93 and a mode of 2.

Table 7. Overview of global analysis.

Construct/Section | Mean | Mode
PU | 3.03 | 5
PEOU | 3.03 | 5
BI | 3.08 | 5
UB | 3.03 | 5
2. Overall System Functioning | 3.01 | 5
3. Header | 3.01 | 4
4. Explorer | 3.18 | 5
5. Discharge Notes | 2.93 | 2
6. Problems | 2.97 | 4
7. Daily Round Checklist | 3.09 | 4
8. Procedures | 3.13 | 5
9. Requests | 3.02 | 4
10. Appointments | 3.05 | 5
11. Clinical Research | 2.98 | 1

6 Conclusion

The assessment model presented in this paper successfully combines the constructs of
TAM3 and the Delphi methodology to evaluate the acceptance of an EHR system.
A structure for the questionnaires is proposed along with examples of possible items
per section and the evaluation scale to be used.
This paper also suggests the type of results analysis that should be performed, with its
indicators and metrics. The model is then applied to a case study to assess the EHR in
the ICU of CHP. The results obtained by this simulation showed an average of 3 points
and modes of 4 and 5, which translates to a good level of acceptance. The application
of the model in a real-life scenario will help in identifying the factors that influence the
user’s resistance to the system and, then, to put forward a set of improvements which
will increase acceptance.
In the future, the proposed model can be improved and extended as more acceptance
assessments are performed.

Acknowledgements. This work has been supported by FCT – Fundação para a Ciência e
Tecnologia within the Project Scopes UID/CEC/00319/2019 and DSAIPA/DS/0084/2018.

References
1. Bennett, D., Bion, J.: ABC of intensive care: organisation of intensive care. BMJ 318(7196),
1468–1470 (1999). https://doi.org/10.1136/bmj.318.7196.1468
2. Bolboaca, S.-D., Jäntschi, L.: Pearson versus Spearman, Kendall’s tau correlation analysis
on structure-activity relationships of biologic active compounds. Leonardo J. Sci. 5(9),
179–200 (2006)
3. Braga, A., Portela, F., Santos, M.F., Machado, J., Abelha, A., Silva, Á., Rua, F.: Step
towards a patient timeline in intensive care units. Procedia Comput. Sci. 64, 618–625 (2015).
https://doi.org/10.1016/j.procs.2015.08.575
4. Chaudhry, B., Wang, J., Wu, S., Maglione, M., Mojica, W., Roth, E., Shekelle, P.G.:
Systematic review: impact of health information technology on quality, efficiency, and costs
of medical care. Ann. Intern. Med. 144(10), 742–752 (2006)
5. Davis, F.D.: A technology acceptance model for empirically testing new end-user
information systems: theory and results. Ph.D. thesis (1986)
6. Johns, R.: Survey question bank: methods fact sheet 1 – Likert items and scales. Univ.
Strathclyde 1(March), 1–11 (2010). https://doi.org/10.1108/eb027216
7. Marinho, R., Machado, J., Abelha, A.: Processo Clínico Electrónico Visual. In: INForum
2010: Actas do II Simpósio de Informática, pp. 767–778 (2010)
8. Novo, A., Duarte, J., Portela, F., Abelha, A., Santos, M.F., Machado, J.: Information systems
assessment in pathologic anatomy service. Adv. Intell. Syst. Comput. 354, 199–209 (2015).
https://doi.org/10.1007/978-3-319-16528-8_19
9. Paiva, J., Fernandes, A., Granja, C., Esteves, F., Ribeiro, J., Nóbrega, J., Coutinho, P.: Rede
de referenciação de medicina intensiva. Redes de Referenciação Hospitalar, 1–87 (2016).
https://bit.ly/2UqG7SY
10. Portela, F., Aguiar, J., Santos, M.F., Abelha, A., Machado, J., Rua, F.: Assessment of
technology acceptance in intensive care units. Adv. Intell. Syst. Comput., 279–292 (2013).
https://doi.org/10.4018/ijssoe.2014070102
11. Santos, L.D.D., Amaral, L.: Estudos Delphi com Q-Sort sobre a web – a sua utilização em
Sistemas de Informação. In: Associação Portuguesa de Sistemas de Informação, vol. 13 (2004)
12. Silva, P.M.D.: Modelo de Aceitação de Tecnologia (TAM) aplicado ao Sistema de
Informação da Biblioteca Virtual em Saúde (BVS) nas Escolas de Medicina da Região
Metropolitana do Recife (2008)
13. Surendran, P.: Technology acceptance model: a survey of literature. Int. J. Bus. Soc. Res.
2(4), 175–178 (2012)
14. Tan, J.: E-health Care Information Systems: An Introduction for Students and Professionals.
Wiley, Hoboken (2005)
15. Venkatesh, V.: Determinants of perceived ease of use: integrating control, intrinsic
motivation, and emotion into the technology acceptance model. Inf. Syst. Res. 11(4),
342–365 (2000). https://doi.org/10.1287/isre.11.4.342.11872
16. Venkatesh, V., Bala, H.: Technology acceptance model 3 and a research agenda on
interventions. Decis. Sci. 39(2), 273–315 (2008)
17. Venkatesh, V., Davis, F.D.: A theoretical extension of the technology acceptance model:
four longitudinal field studies. Manag. Sci. 46, 186–204 (2000)
18. Zackiewicz, M., Salles-Filho, S.: Technological foresight – um instrumento para política
científica e tecnológica. Parcerias Estratégicas 6(10), 144–161 (2010)
An Exploratory Study of a NoSQL Database
for a Clinical Data Repository

Francini Hak, Tiago Guimarães, António Abelha,


and Manuel Santos(&)

Algoritmi Research Center, Universidade do Minho, Braga, Portugal


[email protected], {tsg,mfs}@dsi.uminho.pt,
[email protected]

Abstract. The need to implement a distributed Clinical Data Repository
(CDR) at a healthcare facility arose in large part due to the high volume of data
and the discrepancy of its sources. Over the years, Relational Database
Management Systems (RDBMS) began to present difficulties in responding to
the needs of various organizations when it comes to manipulating a large
amount of data and to scalability. Therefore, it was necessary to explore other
techniques in order to choose the appropriate technology to build the CDR. In
this context, NoSQL emerged as a new type of database that is quite useful for
working with multiple and different types of data. In addition, NoSQL
introduces a number of user-friendly features, offering a distributed, scalable,
elastic and fault-tolerant system. Oracle NoSQL Database was the NoSQL
solution chosen to develop this case study, using key-value storage. This
article proposes a CDR architecture based on Oracle NoSQL Database
functionalities. A single-node database was deployed for better comprehension,
in order to explore its features for future implementation.

Keywords: Clinical Data Repository · Key-value · NoSQL · Oracle NoSQL

1 Introduction

Since the 1970s, Relational Database Management Systems (RDBMS) have been the
dominant model for database management, used in most applications to store and
retrieve data. However, new applications have been requiring fast storage of a large
amount of data due to the advancement of the Internet and the emergence of
distributed computing [1]. Thus, a new type of database called NoSQL has emerged
to try to meet these new challenges.
NoSQL appeared when organizations realized that RDBMS systems had a
shortcoming in terms of scalability, i.e., in the adaptability of the system to the growth
of resources and users. RDBMS adopt “scaling up” techniques (vertical scalability),
focusing only on increasing capabilities on a single machine, such as memory or CPU.
Instead, NoSQL databases adopt “scaling out” methods (horizontal scalability),
focusing on increasing the number of machines for better performance.
This new data storage technique has caused many properties and processes of a
traditional database system to undergo change. For example, the transactional
guarantees required of an RDBMS, known as the ACID properties (Atomicity,
Consistency, Isolation and Durability), are relaxed in NoSQL systems, which favor
availability and eventual consistency through a model called BASE (Basically
Available, Soft state, Eventual consistency).
Therefore, the main factors that led to the emergence of NoSQL databases were
the strictness of relational databases and, consequently, their inadequacy for storing a
large amount of data [2].

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 476–483, 2020.
https://doi.org/10.1007/978-3-030-45697-9_46
Based on previous studies, a need was identified to create a new Clinical Data
Repository that should be able to manage a large amount of data from heterogeneous
sources, derived from existing hospital models of information and clinical knowledge.
In this sense, this article was motivated to propose an architecture for the new CDR
based on Oracle’s NoSQL solution, using key-value storage.
This paper is structured in six sections. The first and current section exposes the
purpose and objectives to be achieved with this document. The second section
describes the main concepts involved in this article. The third highlights the
methodology and tools used. The proposed architecture is presented in the fourth
section. The fifth section discusses the work developed. Finally, in the last section,
final considerations are presented on the study developed so far.

2 Background

2.1 NoSQL Database


NoSQL stands for “Not only SQL”, since it more accurately represents an approach
that combines non-relational databases with the use of relational ones [3]. NoSQL
belongs to a group of non-relational data management systems which become very
useful when it is necessary to work with a large amount of data, or when that data
does not need a relational model and does not have to follow a fixed structure [4].
Sharding and horizontal scaling are two characteristics that can be found in
NoSQL. The sharding technique consists of storing data on multiple machines, which
becomes crucial because, as data grows, one machine may not be enough for storage
or may not deliver the expected performance [5]. Horizontal scaling corresponds to
the addition of more machines, or the setting up of a cluster for the software system,
which allows the splitting of data through the sharding technique [6].
Elasticity is also a core feature of NoSQL databases: it characterizes how the system
adapts in times of overload. It is developed in a horizontally scalable environment and
aims to manage and allocate available system resources in order to balance them when
the access distribution is exceeded. It is important to note that NoSQL offers a number
of user-friendly features, ranging from being distributed, scalable and elastic to being
fault tolerant, which can be concluded to be of great benefit to the user [2].
In addition, NoSQL databases are divided into four data model types [7]:
• Document Databases – store data in a document structure and encode the
information in formats such as JSON. Ex.: MongoDB, CouchDB.
• Graph Databases – emphasize connections between data, storing related nodes as
objects and relationships as edges in graphs to speed up querying. Ex.: Neo4j,
GraphDB.
• Key-value Databases – use a simple data model that matches a single key to a value
in data storage. Ex.: Redis, Dynamo, Oracle NoSQL DB.
• Wide Column Stores – column-family oriented, also called Table-Style Databases;
store data in tables that can have a large number of columns. Ex.: Cassandra, HBase.
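To make the key-value model concrete, a toy in-memory store can be reduced to three operations over an associative map. This sketch makes no claim to match any vendor's API; real systems add persistence, replication and sharding on top of the same contract.

```python
class KeyValueStore:
    """Toy key-value store: a single key maps to an opaque value,
    with no schema and no relations between entries."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Insert or overwrite the value stored under key."""
        self._data[key] = value

    def get(self, key, default=None):
        """Return the value for key, or a default when absent."""
        return self._data.get(key, default)

    def delete(self, key):
        """Remove key if present; absent keys are ignored."""
        self._data.pop(key, None)

# Hypothetical clinical records keyed by a path-like string.
store = KeyValueStore()
store.put("patient/123/heart_rate", 72)
store.put("patient/123/temperature", 36.8)
```

The value is opaque to the store, which is exactly why key-value systems scale so easily: lookups never require joins or cross-record scans.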

2.2 Oracle NoSQL Database


Oracle’s NoSQL open-source solution was chosen due to its key-value approach and
for partnership reasons. Oracle NoSQL Database provides data manipulation with
some traditional particularities of the NoSQL definition, such as scalability, a non-
relational model and elastic storage. It is also characterized by flexible schemas, fast
loading, and sharding and replication techniques, using key-value storage [8].
The first release of Oracle NoSQL Database came out in 2011, with its most recent
release in 2019. The database requires a Linux operating system and can be accessed
through REST API methods. Oracle NoSQL Database is provided in Community and
Enterprise Editions, each with a set of characteristics that sets them apart. Another
feature is the KVLite version, which allows a simplified database deployment on just
one node.
In this way, Oracle’s solution offers a three-tier architecture (Fig. 1) of presentation,
logic and database. The first tier represents the client interface, that is, the output of a
request made by the user. The second is composed of the Oracle NoSQL Driver, which
consumes the Oracle Berkeley DB library [9] and is responsible for controlling the
functionalities required by the executed process and for the data distribution. The last
tier is the database itself, which stores data using the key-value schema.

Fig. 1. Three-tier architecture

Following the key-value store model, data is hashed into partitions through the primary
key, building replication groups, also called shards. The distributed data is then stored
in Storage Nodes (SN), each representing a physical machine with its own memory,
storage and IP address. Each SN contains Replica Nodes (R) that perform writing and
reading functions [10]. Thus, the system performs better as the number of SNs
increases.
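The key-to-shard mapping described above can be sketched as a stable hash of the primary key into a fixed set of partitions, which are in turn assigned to shards. The partition and shard counts below are illustrative assumptions, not Oracle NoSQL defaults, and the code is a generic sketch rather than the product's actual placement algorithm.

```python
import hashlib

N_PARTITIONS = 12   # fixed number of partitions (illustrative)
N_SHARDS = 3        # replication groups hosted on Storage Nodes

def partition_of(primary_key: str) -> int:
    """Hash the primary key into a stable partition id, so every
    record with the same key always lands in the same partition."""
    digest = hashlib.md5(primary_key.encode()).hexdigest()
    return int(digest, 16) % N_PARTITIONS

def shard_of(primary_key: str) -> int:
    """Partitions are spread evenly across shards; rebalancing after
    adding shards moves whole partitions, never individual keys."""
    return partition_of(primary_key) % N_SHARDS

# The mapping is deterministic: reads and writes for one key
# are always routed to the same replication group.
assert shard_of("patient/123") == shard_of("patient/123")
assert 0 <= shard_of("patient/456") < N_SHARDS
```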

2.3 Clinical Data Repository


Health care delivery encompasses a complex procedure that involves many different
professionals, resources and functions. This results in a large amount of diverse and
scattered data. Thus, the concept of the Clinical Data Repository (CDR) grows as the
need for common shared storage does. It was first described, in 1995 at the University
of Virginia, as a “clinical research database that provides a detailed, flexible and quick
view of clinical data” [11].
More recently, Gartner [12] described the CDR as an “aggregation of patient-
centered granular health data, generally collected from heterogeneous systems and
intended to support multiple functions”. When the data is organized specifically for
analysis, the CDR can, in this sense, be understood as a Clinical Data Warehouse
(CDW).
However, unlike a CDW, a CDR aims to be distributed and to integrate clinical
support rules in order to acquire clinical knowledge [13]. In this case, decision support
mechanisms are represented by a controlled medical vocabulary and by previously
applied clinical guidelines. The CDR also promotes internal interoperability and
incorporates in its architecture the same reasoning and mechanisms as a CDW, but
focuses mostly on gaining knowledge from clinical data.

3 Methods and Tools

The learning process in each domain is applied through a methodological approach.


Therefore, the implementation of Design Science Research (DSR) frames the scope of scientific research in the field of information systems. Collins, Joseph, and Bielaczyc [14] state that embedding scientific research in the information systems area is crucial, given the need to address theoretical issues and to study practical cases already carried out in the field.
At the technological level, the study covered both the practice and the theory of using Oracle's NoSQL tool, its exploration being one of the proposed objectives. The use of Oracle's NoSQL solution was central to proposing a Clinical Data Repository architecture. The tool exhibits the characteristics of a NoSQL database, such as horizontal scalability, elasticity and distributed storage, focusing on the key-value schema [8].

4 Proposed Architecture

Nowadays, data production and the need to turn it into results have been growing exponentially worldwide, especially in healthcare. This advancement has been particularly remarkable in recent years, resulting in improved patient care and more robust decision-making support.
After clinical data is registered by a health professional, each record is stored in a different source, depending on its context and type. According to the approaches adopted in previous studies, an open data model makes it possible to combine knowledge with clinical information, following guidelines and coded clinical terms in a structured way.

This required in-depth research on how to manipulate the data produced by the healthcare organization and turn it into intelligent, optimized solutions. As shown in Fig. 2, the new CDR requires a solution that can support large amounts of data with flexibility, while providing decision support.

Fig. 2. Process overview

Therefore, the proposed architecture of the Clinical Data Repository is based on the Oracle NoSQL Database (Fig. 3), which aims to store and manage data in different structures, combining relational and non-relational databases. This approach aims to overcome the inefficiencies that relational databases brought to these services [15].

Fig. 3. Clinical Data Repository architecture

Through the exploration of the Oracle NoSQL Database, some concepts important to its correct performance were identified. Data storage, represented by the last tier of the architecture, is divided into Zones that correspond to physical locations sized according to the capacity of the system; a Zone can be of primary or secondary type. Storage Nodes (SN) are represented within a Zone, corresponding to machines that perform both data writing and reading functions [9].
Increasing the number of SNs in the system enables better performance and decreases storage latency, as formalized by horizontal scalability. In addition, for effective communication between the SNs it is necessary to start the agent responsible for this function, the Storage Node Agent (SNA), and to verify its correct operation [8].
Although the proposed architecture allows for N Storage Nodes across X Zones, for this case study only one SN in a single Zone was deployed, represented in grey as SN1 in Fig. 3. For the correct operation of SN1, some configuration parameters were established, such as the IP address of the respective machine and the communication ports, as well as the system capacity and the administrative security settings.
Data distribution in the cluster is done by sharding, which spreads the data uniformly across the Shards as a set of Partitions. This is a fundamental NoSQL method that organizes and distributes data between machines by hashing each record's primary key, following the key-value model, so as not to overload the system.
Each Shard comprises a group of Replication Nodes (R) that perform read functions, with a Master Node (M) responsible for writing. The master node always has the most up-to-date value for a given key, whereas the read replicas can hold slightly older versions [9]. The number of Replication Nodes in a shard is called the Replication Factor (RF). For the single-node deployment applied here, the formula used to calculate the number of partitions required was as follows [8]:

Partitions = 10 × (Capacity / Replication Factor) = 10 × (1 / 1) = 10

According to the topology implemented, the RF is equal to 1, as is the system's capacity. Hence, Shard1 has one Master Node and 10 Partitions enabled. Studying these Oracle NoSQL concepts was crucial to deploying the SN in the proposed architecture, with the goal of increasing the number of nodes in the future to meet the requirements of the Clinical Data Repository.
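The sizing rule above can be written as a small helper, under the assumption (stated in the text) of ten partitions per replica-hosting slot; the function name is illustrative, not part of the Oracle tooling.

```python
def required_partitions(capacity: int, replication_factor: int) -> int:
    """Partitions = 10 x (Capacity / Replication Factor), per the text."""
    return 10 * capacity // replication_factor

# Single-node deployment from the case study: capacity 1, RF 1 -> 10 partitions.
assert required_partitions(1, 1) == 10
# A three-machine topology with RF 3 would also yield 10 partitions per shard.
assert required_partitions(3, 3) == 10
```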

5 Discussion

This article aimed to explore a solution on which to base an architecture for the new Clinical Data Repository. The repository must handle volume, velocity, scalability and elasticity, which matches the NoSQL concepts. Thus, the Oracle NoSQL Database was the technology chosen for the proposed architecture, with a single-node deployment.
Furthermore, one of the main features that sparked interest in Oracle's NoSQL database was the key-value store. The simplest of the NoSQL data models, the key-value model resembles an associative array, dictionary or hash table, mapping a key to a value. A key is a unique identifier, and the value is the data it identifies: a string of bytes of arbitrary length.
The data model is characterized as schema-free: each record can have its own structure, as opposed to relational models, giving the database flexibility. Key-value pairs are located in a Distributed Hash Table (DHT), allowing a node to access a value efficiently through its key while scaling resources [6].
As mentioned before, data distribution is performed by Shards containing a hashed set of records, or partitions, stored based on the primary key. Both the key and the value are application-defined, subject to some loose restrictions imposed by the NoSQL Driver [9]. In this way, records inserted into the store are organized uniformly as key-value pairs across partitions.
In Oracle NoSQL, data is stored in a particular shard depending on the hashed value of the table's primary key. The primary key is a combination of a major and a minor component; the major component determines the partition that contains a record, and hence the shard in which it is stored, so all records with the same major key are co-located on the same server [10].
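The co-location property follows directly from hashing only the major key component. A minimal sketch, assuming an md5-based hash as a stand-in for the product's internal function:

```python
import hashlib

def partition_for_major_key(major: str, num_partitions: int) -> int:
    """Only the major key component is hashed, fixing the partition/shard."""
    return int(hashlib.md5(major.encode("utf-8")).hexdigest(), 16) % num_partitions

# Two records for the same patient: same major key, different minor keys.
records = [("patient:42", "episode:1"), ("patient:42", "labs:2024-01")]
partitions = {partition_for_major_key(major, 10) for major, _minor in records}
assert len(partitions) == 1  # co-located in one partition, hence one shard
```

This is why all of a patient's records can be read from a single server, and also why skewed major keys (next paragraph) overload one shard.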
Given this storage and management mechanism, it is important that the database is configured to deliver the desired performance. In particular, the records must not concentrate on the same major key; otherwise the system will suffer performance issues as data is inserted.

6 Conclusions and Future Work

The need to explore new solutions capable of supporting large amounts of heterogeneous data led to the characterization of the NoSQL concept. NoSQL and Big Data are directly linked when it comes to large amounts of data: NoSQL meets the requirements posed by the 3Vs that characterize Big Data, namely Volume, Variety and Velocity.
These represent the capacity to handle large amounts of data of various types and structures, while generating and querying data quickly in the store. The article was thus developed to address the lack of scalability and speed of relational database systems, leading to the exploration of the NoSQL concept as one of the requirements imposed on the work developed.
As a result, the Oracle NoSQL database was the technology chosen for in-depth study of its functions and of data manipulation with the key-value store. The proposed architecture for the Clinical Data Repository (CDR) is built around the structure of that technology.
The study concluded that Oracle's NoSQL tool has adequate functionality for the required implementation, particularly in resource allocation and ease of troubleshooting. The key-value data schema is also attractive for the future implementation, as it offers simple and efficient data manipulation. Although its installation imposes some restrictions, Oracle NoSQL raises high expectations for the implementation of the new Clinical Data Repository.

Future work focuses on building an Oracle NoSQL Database application for the CDR in a multi-node deployment, for better system performance. This aims to deepen clinical knowledge, improve the care service and support the decision-making processes. Business Intelligence techniques for NoSQL databases will also be explored as focal points of future work.

Acknowledgments. The work has been supported by FCT – Fundação para a Ciência e Tec-
nologia within the Project Scope UID/CEC/00319/2019 and DSAIPA/DS/0084/2018.

References
1. Shertil, M., Jowan, S., Swese, R., Aldabrzi, A.: Traditional RDBMS to NoSQL database:
new era of databases for big data. J. Humanit. Appl. Sci. 29, 83–102 (2016)
2. Costa, C., Santos, M.Y.: Big Data: state-of-the-art concepts, techniques, technologies,
modeling approaches and research challenges. IAENG Int. J. Comput. Sci. 43(3), 285–301
(2017)
3. Madison, M., Barnhill, M., Napier, C., Godin, J.: NoSQL database technologies. J. Int.
Technol. Inf. Manag. 24(1), 1–14 (2015)
4. Moniruzzaman, A.B.M., Hossain, S.A.: NoSQL database: new era of databases for big data
analytics - classification, characteristics and comparison. Int. J. Database Theor. Appl. 216
(2895), 43–45 (2013)
5. Anand, V., Rao, C.M.: MongoDB and Oracle NoSQL: a technical critique for design
decisions. In: Proceedings of the International Conference on Emerging Trends in
Engineering, Technology and Science (ICETETS 2016) (2016)
6. Abramova, V., Bernardino, J., Furtado, P.: Experimental evaluation of NoSQL databases.
Int. J. Database Manag. Syst. 6(3), 01–16 (2014)
7. Han, J., Haihong, E., Le, G., Du, J.: Survey on NoSQL database. In: 2011 6th International
Conference on Pervasive Computing and Applications, pp. 363–366. IEEE (2011)
8. Oracle: Oracle NoSQL Database: Fast, Reliable, Predictable, pp. 1–38, November 2018
9. Oracle: Oracle® NoSQL Database: Concepts Manual, April 2018
10. Oracle: Oracle® NoSQL Database: Getting Started with Oracle NoSQL Database Key/Value
API, August 2019
11. Einbinder, J.S., Scully, K.W., Pates, R.D., Schubart, J.R., Reynolds, R.E.: Case study: a data
warehouse for an academic medical center. J. Heal. Inf. Manag. 15(2), 165–175 (2001)
12. Gartner: Information Technology: Clinical Data Repository (2018)
13. Hamoud, A.K., Hashim, A.S., Awadh, W.A.: Clinical data warehouse: a review.
Iraqi J. Comput. Inform. 44(2), 1–11 (2018)
14. Collins, A., Joseph, D., Bielaczyc, K.: Design research: theoretical and methodological issues. J. Learn. Sci. 13(1), 15–42 (2004)
15. Kunda, D., Phiri, H.: A comparative study of NoSQL and relational database. Zambia ICT
J. 1(1), 1 (2017)
Clinical Decision Support Using Open Data

Francini Hak, Tiago Guimarães, António Abelha, and Manuel Santos(&)

Algoritmi Research Center, Universidade do Minho, Braga, Portugal
[email protected], {tsg,mfs}@dsi.uminho.pt, [email protected]

Abstract. The growth of Electronic Health Records (EHR) in healthcare has been gradual. However, a simple EHR system has become insufficient to support health professionals in decision making. Hence the need to acquire knowledge from stored data using open models and techniques, both to improve the quality of the service provided and to support the decision-making process. The use of open models promotes interoperability between systems, enabling more efficient communication. In this sense, the OpenEHR open data approach is applied, modelling data on two levels to distinguish knowledge from information. The application of clinical terminologies was also fundamental in this study, in order to control data semantics based on coded clinical terms. This article presents the conceptualization of the knowledge acquisition process used to represent Clinical Decision Support with open data models.

Keywords: Clinical Decision Support · Clinical knowledge · OpenEHR · Terminology

1 Introduction

When a patient goes to a healthcare unit, a data set about their health is stored in an Electronic Health Record (EHR) system. This practice aims primarily to eliminate the use of paper and is currently increasing on a large scale.
Commonly, a health facility integrates several heterogeneous systems that, in some way, speak different languages [1]. These systems must be interoperable, i.e., able to communicate in an understandable and effective manner. The focus is on building communication without loss of data or of the meaning of its content, which is specifically referred to as semantic interoperability.
Thereby, the OpenEHR approach and clinical terminologies aim to achieve universal interoperability between EHR systems [2]. The first structures archetypes to represent clinical concepts, while the second is based on the use of structured vocabularies correlated with clinical terms. Thus, data exchange between systems does not compromise the quality of the received information.
Nevertheless, a simple EHR system has become incapable of supporting decision making on a daily basis, because its primary role was merely to store and consult clinical records [3]. Hence, this inefficiency propelled the need to explore new techniques for gaining

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 484–492, 2020.
https://doi.org/10.1007/978-3-030-45697-9_47

clinical knowledge and semantic interoperability in order to improve the decision process in healthcare.
In this sense, this case study aimed to explore OpenEHR and clinical terminologies in order to improve semantic interoperability between systems and to formalize Clinical Decision Support practices, providing open clinical knowledge.
This paper is divided into five sections. The first introduces the groundwork of the present case study. The second describes key concepts in the applied field, such as open data, OpenEHR, clinical terminologies and clinical decision support. The third section presents the methodology applied to develop the case study. Following the Knowledge Acquisition method, the project development is presented in the fourth section. Finally, section five draws conclusions from the work done and describes future work.

2 Background
2.1 Open Data
Open Knowledge International [4], the leading international reference for open data, first defined open data as data that can be freely used, shared and built upon by anyone, anywhere, for any purpose. Consequently, open knowledge is defined as "any content, information or data that society is free to use, reuse and redistribute, without any legal, social or technological restriction".
To summarize, the term "open data" has gained popularity within the transparency and open government movements around the world, as it treats access to public information as the rule: data that can be freely used, modified and shared by anyone, without financial or other restrictions. The concept also applies to the health domain [5].

2.2 OpenEHR
The OpenEHR approach follows open data and free-access standards for health information specifications and is used in the management, storage and querying of electronic clinical data. The model provides an interoperable framework that organizes clinical content with patient information, thus enabling integration with different health information systems [6].
The OpenEHR Foundation states that OpenEHR offers "multilevel single-source modelling within a service-oriented software architecture where models built by domain experts are in their own layer". Accordingly, the OpenEHR architecture consists of two levels, in which information and knowledge are separated.
At the first level, the information model groups and defines the information processed in the system for each patient, with information components such as quantity, text or date concepts. The second model holds the clinical knowledge, applied in a structured and archetype-oriented manner according to the Archetype Definition Language (ADL), promoting semantic interoperability [7].

2.3 Clinical Terminologies


Semantic interoperability represents the mutual understanding, by all the actors in a system, of the vocabulary used in services or data [8]. To facilitate and solve semantic problems in healthcare systems, the concept of clinical terminology is applied.
The use of clinical terminologies aims to promote semantic interoperability in EHR systems, in which clinical terms are coded and represented according to a given standard [9]. These terminologies express the meaning of clinical concepts and rest on terminological reasoning based on the classification, relationships and comparison of the individual concepts imposed on the system [10].
Nowadays, there are many health terminology and classification systems. However, this case study uses only the International Classification of Diseases version 10 (ICD-10) and the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), although they will not be examined in depth.
ICD provides a common classification language across countries, usually referring to the patient's overall situation, and can describe symptoms, illness, injury and even death [11]. It is based on translating the sentences describing diagnoses or other health problems into alphanumeric codes for universal epidemiological purposes.
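The translation of diagnosis sentences into alphanumeric codes can be illustrated with a tiny lookup table. The mapping below is a hypothetical excerpt (the real ICD-10 contains thousands of codes) and the function name is illustrative:

```python
# Hypothetical, abbreviated slice of the ICD-10 code set.
ICD10 = {
    "essential (primary) hypertension": "I10",
    "type 2 diabetes mellitus": "E11",
    "acute myocardial infarction": "I21",
}

def encode_diagnosis(sentence: str) -> str:
    """Translate a diagnosis sentence into its alphanumeric ICD-10 code."""
    return ICD10[sentence.strip().lower()]

assert encode_diagnosis("Type 2 diabetes mellitus") == "E11"
```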
On the other hand, SNOMED CT is a health-focused terminology that provides a medical nomenclature for the numerical coding of clinical terms. The core of SNOMED CT's operation is a hierarchical structure built from three components: concepts, descriptions and relationships [12].
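These three components can be modelled as a minimal sketch. The data structures are simplifications of the SNOMED CT release format; the concept identifiers shown are given as examples of real SNOMED CT codes, with 116680003 being the "is a" relationship type that builds the hierarchy.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:
    sctid: str  # numeric SNOMED CT concept identifier
    fsn: str    # fully specified name (one of the concept's descriptions)

mi = Concept("22298006", "Myocardial infarction (disorder)")
heart_disease = Concept("56265001", "Heart disease (disorder)")

# A relationship is a (source, type, target) triple; the "is a" type
# links a concept to its parent in the hierarchy.
relationships = [(mi.sctid, "116680003", heart_disease.sctid)]
assert relationships[0] == ("22298006", "116680003", "56265001")
```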

2.4 Clinical Decision Support


Friedman [13] presented a theorem stating that an individual or group working with a technological resource performs better than one working without assistance. For this theorem to hold, the information system must be valid and reliable, and the user must know how to use it properly.
The activity of supporting decision makers in healthcare is designated Clinical Decision Support (CDS). CDS is an activity or service involving a range of mechanisms that represent clinical knowledge in a structured way, with the aim of supporting all those involved in the healthcare domain [14].
The need to deliver this practice as a computable application or tool is met by a Clinical Decision Support System (CDSS), which combines engines and predefined rules to provide CDS. Built over an efficient and reliable information system infrastructure, the CDSS gives the user patient-related information at any decision-making moment in healthcare.
Most EHR systems are encouraged to or already include CDS practices to build a
CDSS. For such purpose, knowledge acquisition techniques are required to deploy and
organize rules in a knowledge base, in order to support decision making [15].

3 Knowledge Acquisition

To complement this study, the Knowledge Acquisition methodology was applied in order to develop a Clinical Decision Support framework. This process is characterized by the extraction, structuring and organization of human knowledge into a machine-readable format [17]. The model has several phases, but this project is restricted to four steps.

3.1 Identification
The need to acquire knowledge through daily production of electronical health records
(EHR) has spread. However, it is essential to distinguish three concepts for a better
understanding of what is approached as clinical knowledge.
According to Zins [18], data are unstructured facts, open to qualified or quantified interpretation, without context. Information is data that has been contextualized and structured for a purpose. Knowledge, in turn, is information combined with theoretical and practical understanding based on experience, involving informed individuals, practices and organizational norms in a given domain.
As knowledge encompasses several areas of learning, the clinical scope is also covered. Clinical knowledge is defined by Winters-Miner et al. [19] as the "cognitive understanding of a set of known clinical rules and principles based on the medical literature, that guide decision-making processes".

3.2 Conceptualization
To complement EHR systems in decision support, the acquisition of clinical knowledge from clinical guidelines and practices is crucial for validating this new process. For that purpose, it is necessary to define the essential components for acquiring the desired clinical knowledge.
Thereupon, the new process is based on the implementation of the OpenEHR approach and of clinical terminologies for the construction of the new open knowledge model. Following the essence of the OpenEHR approach, the clinical records inserted in a given system are modelled on two levels [20].
The OpenEHR architecture models data on two levels, sorting out information and knowledge (Fig. 1). Information is defined as quantified and qualified patient-oriented data, together with the respective demographic data, grouped in the OpenEHR reference model.

Fig. 1. OpenEHR architecture

On the other hand, knowledge aggregates the clinical content based on medical guidelines and represented by archetypes, which are combined into templates that document such knowledge. For proper implementation, archetypes and templates must follow a set of rules encoded in the Archetype Definition Language (ADL).
The ADL defines the structure of the document that embodies the medical knowledge represented by the templates. To code clinical terms, the terminologies are also implemented on the structured template, being applied directly during template modelling.
To summarize, this mechanism separates the two concepts so that data can be manipulated in an organized manner, relating them when the template components need to be filled with patient information.
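The relation between the two levels can be sketched as follows. The dicts are a simplified stand-in for an archetype-based template and a patient record (real openEHR templates are defined in ADL, not Python); the names and structure are hypothetical.

```python
# Knowledge level: a template defines the components and their constraints.
blood_pressure_template = {
    "systolic": {"type": "quantity", "units": "mm[Hg]"},
    "diastolic": {"type": "quantity", "units": "mm[Hg]"},
}

# Information level: patient-oriented data recorded in the EHR.
patient_record = {"systolic": 128, "diastolic": 82}

def fill_template(template: dict, record: dict) -> dict:
    """Relate the two levels: attach patient values to template components."""
    return {name: {**spec, "value": record[name]} for name, spec in template.items()}

form = fill_template(blood_pressure_template, patient_record)
assert form["diastolic"] == {"type": "quantity", "units": "mm[Hg]", "value": 82}
```

Keeping the template and the record separate is what lets the knowledge model evolve (new archetypes, new constraints) without rewriting the stored information.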

3.3 Formalization
The two-level modelling approach defines an archetype-based architecture that provides the desired information about a given patient, combining clinical knowledge without losing the clinical meaning of the content and preserving the confidentiality of the data as intended by the patient [21].
OpenEHR methodology focuses on interoperability between systems, with the
adoption of open specifications and clinical content. Thus, the main purpose of such
adoption is to transform clinical records into a structured and interpretable model.
The formalization of this new process (Fig. 2) starts with the health professional, who represents all the entities in a health facility with the authority to record both administrative and patient data in an EHR system.

Fig. 2. Knowledge modelling

OpenEHR methods fork the data in two directions. The information model captures the data semantics in a structured form, constituting the OpenEHR reference model. In contrast, the knowledge model focuses on the clinical content, represented by archetypes and templates through a set of rules.
The knowledge model is then developed by an expert professional from a given domain, who selects and aggregates a set of archetypes according to the intended clinical purpose in order to build a template, using the appropriate platform, the OpenEHR Clinical Knowledge Manager.
Clinical terminologies and classification systems are implemented and controlled during template modelling. The use of coded clinical terms is required to represent the diagnoses or symptoms of a given patient, providing semantic interoperability in the template components. In addition, ADL rules are required to ensure proper functioning and modelling consistency for future information management and storage.
After the modelling process is completed, the resulting templates are transformed into a structured view format, labelled a form. This form displays to the health professional the representation of clinical knowledge along with previously registered patient information. The reverse process, recording data via the form, is also supported.

3.4 Knowledge Representation


The formalization of clinical knowledge and information was applied and articulated in parallel with a set of rules. The large amount of data resulting from that distinction generated a substantial need to exploit a new storage technology.
Thereupon, the need for a new clinical data repository (CDR) system and the
representation of clinical decision support (CDS) activity were crucial aspects to be
engaged to increase quality in health care.

Thus, a CDR is defined as a distributed real-time database that stores data originally entered in other clinical data sources [22]. Its function is to allow data to be queried easily and arbitrarily, enabling the analysis of reports and results.
In order to represent clinical decision support practices, an architecture was proposed that realizes this activity using open clinical knowledge techniques (Fig. 3). It is crucial to emphasize the need to distinguish the activity from the technology. Thus, the CDR was also highlighted, having been developed in parallel within another case study.

Fig. 3. Clinical Decision Support framework

After data registration and collection in a specific system, the data is stored in heterogeneous data sources and represented in a database management system (DBMS). In parallel, the modelling practice of OpenEHR and clinical terminologies characterizes the open knowledge model, resulting in the creation of a knowledge base.
CDS practices focus mostly on the system's knowledge base, fuelled by clinical knowledge. The open knowledge model loads its knowledge into this base in a machine-processable and interoperable format. Migration rules are then applied between the DBMS and the knowledge base in order to associate patient information with clinical knowledge in a structured form.
In this way, the CDS module has key elements that characterize it, such as the organized representation of knowledge, the structured and controlled vocabulary for clinical concepts and the knowledge base. Extract, Transform and Load (ETL) processes are also applied, allowing data to be queried and cleaned for future interaction with decision support components.
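An ETL pass that associates operational records with coded knowledge could be sketched like this. The row layout, field names and the toy knowledge base are all hypothetical, standing in for the paper's DBMS and knowledge base:

```python
# Toy rows as they might sit in the operational DBMS.
dbms_rows = [
    {"patient_id": 1, "diagnosis": " Essential (primary) hypertension "},
    {"patient_id": 2, "diagnosis": None},  # incomplete record, dropped in Transform
]
knowledge_base = {"essential (primary) hypertension": "I10"}  # coded clinical terms

def etl(rows, kb):
    extracted = list(rows)                               # Extract from the source
    cleaned = [r for r in extracted if r["diagnosis"]]   # Transform: drop incompletes
    for r in cleaned:
        r["diagnosis"] = r["diagnosis"].strip().lower()  # Transform: normalize text
        r["code"] = kb.get(r["diagnosis"])               # Load: attach coded knowledge
    return cleaned

loaded = etl(dbms_rows, knowledge_base)
assert loaded == [{"patient_id": 1,
                   "diagnosis": "essential (primary) hypertension",
                   "code": "I10"}]
```

The point of the sketch is the separation of concerns: the DBMS holds raw patient information, the knowledge base holds coded terms, and the migration rules join the two into a structured form that the CDS components can consume.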

The final layer of the decision support activity integrates the computer system that will represent the acquired knowledge. A CDSS is a technology or tool that integrates patient knowledge and clinical information in a structured way to support decisions and actions in healthcare delivery. For that purpose, other analytical tools are also applied, representing data analysis.

4 Conclusions and Future Work

This case study aimed to explore components that provide clinical knowledge in order to build a Clinical Decision Support (CDS) module, improving EHR systems and the quality of healthcare. The study integrates knowledge acquisition methods such as OpenEHR and clinical terminologies, providing two-level modelling.
As a result of separating clinical records into information and knowledge, a set of
archetypes and a common controlled vocabulary are modelled to represent clinical
concepts and terms in a structured form. This set of activities characterizes the open
knowledge model, that aggregates templates and visual forms following a clinical
purpose.
Thereby, a set of rules for the decision support activity was applied, capable of ensuring the consistency and coherence of the information produced by these techniques, providing clinical knowledge in an organized and standardized way.
Overall, the clinical decision support activity is characterized by a knowledge base built through the open knowledge model. These interactions make the system faster, interoperable, organized and easier to use. In addition, the new process allows professionals without clinical expertise to intervene and contribute to the conceptualization and structuring of health information systems.
To sum up, the CDS activity complements a simple EHR system with clinical guidelines and documented models, identifying suitable standards for representing clinical data in order to achieve decision support benefits in healthcare.
Future work follows two approaches. First, the use of open knowledge models will be continued in order to frame all the templates needed for the structured visualization of clinical information through forms. Second, the Clinical Data Repository will be explored and implemented, in real time, incorporating the information and knowledge models.

Acknowledgments. The work has been supported by FCT – Fundação para a Ciência e Tec-
nologia within the Project Scope UID/CEC/00319/2019 and DSAIPA/DS/0084/2018.

References
1. Peixoto, H., Machado, J., Abelha, A.: Interoperabilidade e o Processo Clínico Semântico, no.
513, p. 8846 (2010)
2. Min, L., Tian, Q., Lu, X., Duan, H.: Modelling EHR with the openEHR approach: an
exploratory study in China. BMC Med. Inform. Decis. Mak. 18(1), 1–15 (2018)

3. Ribeiro, T., Oliveira, S., Portela, C., Santos, M.: Clinical workflows based on OpenEHR
using BPM (2019)
4. Open Knowledge International’s Foundation: Open Knowledge International Foundation
(2005)
5. Pires, M.T.: Guia de dados abertos. J. Chem. Inf. Model. 53(9), 1689–1699 (2015)
6. Filho, C.H.P., de Freitas Dias, T.F., Alves, D.: Arquétipos OpenEHR nas fichas do fluxo do
controle da tuberculose. Rev. da Fac. Med. Ribeirão Preto e do Hosp. das Clínicas da FMRP,
January 2014
7. César, H., Bacelar-Silva, G.M., Braga, P., Guimaraes, R.: OpenEHR-based pervasive health
information system for primary care: first Brazilian experience for public care. In:
Proceedings of the CBMS 2013 - 26th IEEE International Symposium on Computer-Based
Medical Systems, pp. 572–573 (2013)
8. Heiler, S.: Semantic interoperability. Encycl. Libr. Inf. Sci. Third Ed. 27(2), 4645–4662
(1995)
9. Park, H.-A., Hardiker, N.: Clinical terminologies: a solution for semantic interoperability.
J. Korean Soc. Med. Inform. 1515(11), 1–111 (2009)
10. Rector, A.L.: Clinical terminology: why is it so hard? Methods Inf. Med. 38(4–5), 239–252
(2000)
11. Breant, C., Borst, F., Campi, D., Griesser, V., Momjian, S.: A hospital-wide clinical findings
dictionary based on an extension of the International Classification of Diseases (ICD). In:
Proceedings of the AMIA Symposium on ICD, pp. 706–710 (1999)
12. Cornet, R., Schulz, S.: Relationship groups in SNOMED CT. J. Sci. Islam. Repub. Iran
26(3), 265–272 (2009)
13. Friedman, C.P.: A ‘fundamental theorem’ of biomedical informatics. J. Am. Med. Inform.
Assoc. 16(2), 169–170 (2009)
14. International Health Terminology Standards Organisation (IHTSDO): Decision Support with
SNOMED CT. SNOMED CT Document Library (2018)
15. HIMSS: What is Clinical Decision Support System? (2016)
16. Collins, A., Joseph, D., Bielaczyc, K.: Design research: theoretical and methodological issues. J. Learn. Sci. 13(1), 15–42 (2004)
17. Liou, Y.I.: Knowledge acquisition: issues, techniques and methodology, pp. 59–64 (1985)
18. Zins, C.: Conceptual approaches for defining data, information and knowledge. J. Am. Soc.
Inf. Sci. Technol. 58, 479–493 (2007)
19. Winters-Miner, L.A., et al.: Biomedical informatics. Pract. Predict. Anal. Decis. Syst. Med.
42–59 (2015)
20. Pereira, V.A.A.: Governance of an OpenEHR based local repository compliant with
OpenEHR International Clinical Knowledge Manager. J. Chem. Inf. Model. 53(9), 1689–
1699 (2018)
21. Santos, M.R., Bax, M.P., Kalra, D.: Building a logical EHR architecture based on ISO 13606
standard and semantic web technologies. Stud. Health Technol. Inform. 160(Part 1), 161–
165 (2010)
22. Nadkarni, P.: Clinical data repositories: warehouses, registries, and the use of standards.
Clin. Res. Comput. 173–185 (2016)
Spatial Normalization of MRI Brain Studies
Using a U-Net Based Neural Network

Tiago Jesus (1), Ricardo Magalhães (2,3), and Victor Alves (4)

(1) Department of Informatics, School of Engineering, University of Minho, Braga, Portugal ([email protected])
(2) Neurospin, Joliot, CEA, Gif-Sur-Yvette, France ([email protected])
(3) Université Paris-Saclay, Gif-Sur-Yvette, France
(4) Algoritmi Centre, University of Minho, Braga, Portugal ([email protected])

Abstract. Over recent years, Deep Learning has proven to be an excellent technology for solving problems that would otherwise be too complex. Furthermore, it has seen great success in the area of medical imaging, especially when applied to the segmentation of brain tissues. As such, this work explores a possible new approach, using Deep Learning to perform spatial normalization on Magnetic Resonance Imaging brain studies. Spatial normalization of Magnetic Resonance images by tools like FSL or SPM can be inefficient for researchers, as it requires too many resources to achieve good results. These resources include, for example, human and computer time wasted executing the normalization commands and waiting for the process to finish, which can take up to several hours for a single study. Therefore, to enable a faster and easier way to normalize the data, a U-Net based Deep Neural Network was developed using Keras and TensorFlow. This approach should free researchers' time for other, more relevant tasks and help them reach conclusions faster when looking for patterns in the analyzed brains. The results obtained show potential, predicting the correct brain shape in less than 10 s per exam instead of hours, even though the model does not yet produce a fully usable spatially normalized brain.

Keywords: Deep Learning · Neuroimaging · Spatial normalization

1 Introduction

Medical imaging, as the name implies, is an area that deals with the process of visualizing the interior of the human body for medical purposes. There are several types of medical imaging modalities, e.g. Magnetic Resonance Imaging (MRI), Computed Tomography (CT) and Positron Emission Tomography (PET), whose images may be structural, functional or molecular depending on the study objective. Some of these modalities can also be combined by using medical image fusion to take advantage of the best parts of each modality [1].

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 493–502, 2020.
https://doi.org/10.1007/978-3-030-45697-9_48

One of the possible applications of the above-mentioned modalities is the acquisition of brain images (neuroimaging), which can be very helpful for different types of
studies. Neuroimaging data can help understand how the brain works, how its different
areas respond to different stimuli, or help diagnose multiple psychiatric disorders and
diseases, such as tumors and multiple sclerosis.
In MRI examinations, especially in brain studies, spatial normalization of the brain
is typically required [2]. This normalization process is essential because brain shapes
are different for each person. It allows researchers to draw conclusions about brain
regions by transforming the brain shape into a common space using a template as a
reference. Currently, the FSL library is the most commonly used software library to
achieve spatial normalization of the brain. It is a very complex process involving several linear and non-linear transformations, using essentially three FSL tools: Bet, Flirt and Fnirt [3–7]. In short, the brain is first extracted from the skull with the Bet tool, and the image is then linearly approximated to the MNI-152 template (the reference image [2]) with the Flirt tool. This transformation is then refined by a non-linear transformation with the Fnirt tool. However, these tools can take a long time to produce the output, which
may not always live up to expectations. A better solution would therefore improve the
day-to-day work of researchers.
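As a concrete illustration, the pipeline just described can be assembled as a command sequence. The sketch below only builds the command lines (it does not execute them); the file names are hypothetical, and the exact option set should be checked against the FSL documentation.

```python
# Sketch of the FSL normalization steps described above. File names are
# illustrative assumptions; commands are assembled but not executed here.
def fsl_normalization_commands(study="subject01", template="MNI152_T1_2mm"):
    brain = f"{study}_brain"
    return [
        # 1. Bet: extract the brain from the skull
        ["bet", study, brain],
        # 2. Flirt: linear (affine) approximation to the MNI-152 template
        ["flirt", "-in", brain, "-ref", f"{template}_brain",
         "-omat", "affine.mat"],
        # 3. Fnirt: non-linear refinement using the affine result
        ["fnirt", f"--in={study}", "--aff=affine.mat",
         f"--ref={template}", "--cout=warp"],
    ]
```

On a machine with FSL installed, each entry could be passed to `subprocess.run`; this is exactly the per-study work, potentially taking hours, that the approach presented in this paper aims to replace.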
Deep Learning (DL), a subset of Artificial Intelligence [8–10], has already proven
to be a fundamental tool in medical imaging [11–14]. Many studies mention its benefits
in this area. Deep Learning has been shown to produce faster and more accurate results than non-DL algorithms/methods when a suitable model is used. Park et al. (2019) [15], for example, investigated a deep learning-based recognition system that showed significantly higher image-wise classification performance and demonstrated the potential for lesion detection and pattern classification in chest X-ray images, thereby achieving high diagnostic performance. In another publication, Nam et al. (2019) [16] developed a deep learning-based automatic detection algorithm that outperformed physicians in classifying X-ray images and detecting malignant pulmonary nodules on chest X-rays.
After considering all the disadvantages of existing tools and the positive aspects of
deep learning in medical imaging, the aim of this work was to develop a Deep Artificial Neural Network that performs brain normalization faster and in a more automated way than already existing solutions such as the FSL tools. In this way, researchers save time to focus on other, more challenging tasks when they need to normalize a series of brain images for their studies.

2 Materials

The dataset used, stored on an XNAT server, contained a total of 213 MPRAGE [17] images that were properly anonymized to ensure the subjects' privacy. They were divided into three groups of 71 images in NIfTI format, with each image in a group having a match in the other two. The groups were as follows: a group of original studies (the raw data obtained from the MRI scanner without spatial normalization), studies with the skull extracted from the brain (obtained from the corresponding original study) and, finally, normalizations of the first group (the result of normalizing the first group with an already existing tool), which are used as the ground truth. From these groups, the 'original' group was used as input for the deep learning model and the 'normalized' group as output. The brain-extracted group was only used to preprocess the input for the model.
In order to build and train this model, a programming language, as well as a suitable working environment, was needed. The language chosen was Python, which has very comprehensive imaging and DL libraries. Jupyter Notebooks was used as the environment in which the tools described in this work were created. To create the model architecture itself, Keras and TensorFlow were used. Keras is a user-friendly library that provides high-level building blocks for developing Deep Artificial Neural Networks [8]. TensorFlow offers multiple levels of abstraction and extremely powerful structural bases for the network, especially when used with Keras, which makes it easy to use and allows fast prototyping. One main advantage of these tools is the possibility of using the GPU as well as the CPU to train the model. The work was performed on a computer running Ubuntu 16 with a 12-core Intel Xeon processor, 64 GB of RAM and a NVIDIA Quadro P6000 GPU (with 24 GB of dedicated GDDR5X memory).

3 Methods

3.1 Data Preprocessing


To achieve a perfect match between the data and the model, several steps are required. First, a supported data format is needed, since the Deep Neural Network model cannot simply consume the MRI images as they are. To address this issue, the python library NiBabel is used, which can read NIfTI files and extract the data of each scan into NumPy arrays. The next problem is the data itself: the model must have meaningful data at its input, as not all data will produce good results. Therefore, the data from the scans, now held in NumPy arrays, must be normalized to obtain good and reproducible results. This is accomplished by performing several simple operations on the data arrays. The first step crops the edges of the non-normalized images while leaving the brain intact, which saves memory and thus helps when training the model. The next step is to make sure all images have the same dimensions. To achieve this, the images are padded with zeros until they reach the final size. Padding was chosen over cropping in order to preserve as much valuable information as possible and not tamper with potentially crucial values. Finally, the values of all the arrays must lie between 0 and 1, so that the model can draw conclusions from all images regardless of their maximum or minimum value. Since the minimum value of all arrays was already 0, the maximum value was the target for normalization: each non-normalized image was divided by its maximum, and the corresponding normalized image was divided by the same value. This ensures that the maximum value in each image is 1.
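The padding and intensity steps can be sketched as follows; the target shape and function names are illustrative assumptions (the paper does not state the final image size), and the cropping step is omitted:

```python
import numpy as np

TARGET_SHAPE = (256, 256, 256)  # hypothetical final size, not stated in the paper

def pad_to_shape(volume, target_shape=TARGET_SHAPE):
    """Zero-pad a 3D scan so every image reaches the same dimensions."""
    pads = []
    for dim, target in zip(volume.shape, target_shape):
        total = max(target - dim, 0)
        pads.append((total // 2, total - total // 2))  # split padding evenly
    return np.pad(volume, pads, mode="constant", constant_values=0)

def intensity_normalize(raw, ground_truth):
    """Divide both images by the raw image's maximum so the raw maximum is 1.

    Assumes a non-empty scan whose maximum is greater than zero.
    """
    peak = raw.max()
    return raw / peak, ground_truth / peak
```

Note that, as described above, the ground-truth image is divided by the raw image's maximum (the same value), not by its own.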
The preprocessed dataset is divided into train and test NumPy arrays, which are stored in four NumPy files so that the data can be loaded faster when needed. In this way, the data is preprocessed only once and loads much faster if the model needs to be re-trained. Of all the scans, 50 were used for training (about 70% of the dataset) and the remaining 21 (about 30%) were used for testing. The train set is used to fit the model and the test set evaluates the model's skill during training.
After successfully loading the data from the files, the next step was to create and compile the models and then train them.

3.2 Spatial Normalization Model


The Deep Artificial Neural Network was created in a separate python module for simplicity. To create and compile the model, we simply call the function "create_compile_model()" contained in this module. This function adds layers to create the U-Net based structure of the model (shown in Fig. 1).

Fig. 1. 2D U-Net based architecture used.

The model is then compiled using Dice Similarity Coefficient (DSC) as the loss function and Adam (Adaptive Moment Estimation) with a learning rate of 1 × 10⁻⁶ as the optimizer. As activation functions, ReLU and Leaky ReLU were strategically used. Several learning rates were tried, but the one given above yielded the best results.
This module also contains helpful parameters, such as the input size, and callbacks that can be used when training the model. The callbacks used were: ModelCheckpoint, which stores the weights of the model at defined points of training (in this case, the best weights seen so far); EarlyStopping, which monitors the training and automatically terminates it if the model does not learn as desired; ReduceLROnPlateau, which changes the learning rate (lr) of the model to adapt its learning and hopefully get better results; and, finally, PlotLossesKeras, which draws a plot, updated every epoch, displaying the model history, i.e. the accuracy and loss at each epoch. All of these callbacks are very helpful and improve the chances of getting good results with the model.
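Assuming the tensorflow.keras implementations of these callbacks, the list might be assembled as below; the monitored quantity, patience values and file name are illustrative guesses, not values taken from the paper (PlotLossesKeras, from the livelossplot package, is left out for brevity):

```python
# Hypothetical sketch of the callback configuration described above,
# assuming tensorflow.keras; thresholds and file names are illustrative.
from tensorflow.keras.callbacks import (ModelCheckpoint, EarlyStopping,
                                        ReduceLROnPlateau)

def make_callbacks(weights_path="best.weights.h5"):
    return [
        # keep only the best weights seen during training
        ModelCheckpoint(weights_path, monitor="val_loss",
                        save_best_only=True, save_weights_only=True),
        # stop if the validation loss stops improving
        EarlyStopping(monitor="val_loss", patience=10),
        # shrink the learning rate when training plateaus
        ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5),
    ]
```

The resulting list would be passed to `model.fit(..., callbacks=make_callbacks())`.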
First, the model is created and compiled with the previously explained function. Next, some important settings are defined, such as the number of epochs for training the model, the batch size and the callbacks to use. As mentioned earlier, the functions present in the callback part of the settings are specified in the model module.
Both the number of epochs and the batch size were varied to influence the training and improve model performance. The number of epochs represents the number of passes through the training samples during training, while the batch size indicates the number of samples submitted to the model at each step. As expected, a larger batch size means a higher number of images presented to the model at once and, therefore, more memory is needed to hold all the data.
After training on the GPU, the model is evaluated to see if it performs as expected. Its structure (in ".json" format) and weights (in ".h5" format, using the h5py package) are then saved to disk for further reference; the weights saved are those of the model at its best stage. At this point, the charts of the model's loss and accuracy are displayed. Thanks to the PlotLossesKeras callback (Fig. 2), they can also be displayed in constantly updated charts throughout the training process. These charts are then used to measure the performance of the model.

Fig. 2. Evolution of the loss when training the model.

3.3 Evaluation
After training with the given dataset, the model was evaluated to understand its performance. This was done using the module for evaluation and prediction: the evaluation part computes the model's scores, loss and accuracy after training, while the prediction part uses the model to predict an output for a given input.

Various parameters can be used to evaluate the properties and performance of a model in predicting its output. This evaluation can be as simple as comparing the class predicted by the model with the actual class (if both are equal, the model performed as expected), or it can be a little more complicated, such as comparing the areas of two features to see if they are as close as possible. Different models require different approaches, so different metrics were used to evaluate the performance of the models. Normally, the main metric for evaluating a model is accuracy, which indicates the percentage of correctly predicted outputs.
Although accuracy seems like the obvious metric, it is not always the best choice. In the case of spatial normalization, for example, it is not a valid way of evaluating the model, because it ignores the context of the values it compares. Here, each pixel of the output image is analyzed: if a pixel's value is not exactly equal to the expected value, accuracy simply counts it as wrong, even though close values could be accepted as a good prediction. As an example, Fig. 3 shows two images side by side, where one has had all its pixels tampered with by adding the value 1 to each of them. The image on the left is the original and the one on the right is the tampered version. Although the computer considers the images different, and the model would therefore consider this a poor prediction, they are in fact the same for a human observer, who is the intended interpreter. Therefore, the results should be properly evaluated by obtaining a visual representation of the model's prediction.

Fig. 3. Comparison of an original and tampered image.

If the values differ greatly from the target values, the image is mispredicted, but a range of values close to the expected ones could be accepted as a good prediction. An accuracy of, for example, 90% or more would probably mean that the model is working as expected, since such high accuracy normally requires predicting the output correctly. However, as shown above, low accuracy does not necessarily mean that the model does not predict correctly.
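The point can be illustrated with a tiny numeric example using hypothetical pixel values: a pixel-wise exact-match "accuracy" drops to zero when every pixel is shifted by 1, even though the image content is effectively unchanged.

```python
import numpy as np

def pixel_accuracy(a, b):
    """Fraction of pixels that match exactly between two arrays."""
    return float(np.mean(a == b))

original = np.arange(16, dtype=float).reshape(4, 4)
tampered = original + 1.0  # the "add 1 to every pixel" case from Fig. 3
print(pixel_accuracy(original, tampered))  # 0.0: no pixel matches exactly
```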
To overcome this issue, an alternative method of evaluating the results was needed. To correctly evaluate the model's performance, the Dice Loss was used in the U-Net based model. This loss function is based on the Dice Similarity Coefficient (DSC) [18]. The Sørensen–Dice Coefficient, or Dice Similarity Coefficient, measures the overlap of two areas: if the overlap is perfect, the DSC is 1, meaning a 100% overlap, while a DSC of 0 means the areas are completely disjoint, i.e. an overlap of 0%. Equation 1 describes the DSC, where X and Y denote the two regions being compared; in the case of the model, X represents the expected image and Y the predicted output.

DSC = 2 |X ∩ Y| / (|X| + |Y|)    (1)

The Dice Loss (Eq. 2), in contrast to the DSC on which it is based, tends to 0 as the overlap improves. A Dice Loss of 0 would mean a perfect overlap and, consequently, a model that predicts the output correctly.

Dice Loss = 1 − DSC = 1 − 2 |X ∩ Y| / (|X| + |Y|)    (2)
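Assuming the images are NumPy arrays, Eqs. (1) and (2) can be sketched as below; the `eps` term is an assumption added to avoid division by zero, and the element-wise product is the usual continuous relaxation of the set intersection for non-binary images:

```python
import numpy as np

def dsc(x, y, eps=1e-7):
    """Dice Similarity Coefficient, Eq. (1): 2|X intersect Y| / (|X| + |Y|)."""
    intersection = np.sum(x * y)  # soft intersection for [0, 1] intensities
    return 2.0 * intersection / (np.sum(x) + np.sum(y) + eps)

def dice_loss(x, y):
    """Dice Loss, Eq. (2): tends to 0 as the overlap improves."""
    return 1.0 - dsc(x, y)
```

For identical images the loss is (up to the smoothing term) 0, and for fully disjoint images it is 1, matching the behavior described above.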

4 Results

The output predicted by the model for a random input case is shown in Fig. 4. In the figure, the input is the column 'FSL in', the output of FSL (i.e. the considered ground truth) is 'FSL out', and the output predicted by the model is 'DL out'. A visual representation of the data is required because the metrics used to evaluate the model are not always able to properly assess it in every situation.
The resulting Dice Loss obtained with the U-Net based architecture was 0.00313, which means the overlap was near perfect, as 0 would mean a perfect result. The accuracy metric did not evaluate the model as it should, remaining constant at 16.69% throughout the training, which took about 6 h on the high-performance GPU. Although training takes quite a long time to complete, the trained model outperforms the FSL tools in the time taken to normalize MRI studies: it performed the process in an average of 8 s instead of more than an hour, as FSL requires.

Fig. 4. Visual comparison between the original image, the image obtained by FSL and the
prediction of the trained model.

For a perfect result, the images of the third column should match the images of the second, since the second is the ground truth obtained with FSL and the third is the output of the trained DL model. As can be seen, the results are not perfect, but they are close to the target image, as the shape of the output coincides almost perfectly with what was expected. In particular, when examining the top row, shown in more detail in Fig. 5, we can see that the model has correctly predicted the features. Hence, good results were achieved, but not yet good enough to compete with existing tools.

Fig. 5. Comparison between the expected result (left) and the result obtained by the U-Net based model (right).

Analyzing the model's Dice Loss, which measures the overlap between the images, its value gets very close to 0, which means that the model performed well in deforming the original brain to obtain the final shape, even though the image as a whole does not look as expected. The accuracy graph, however, is not shown because it is not a good measure of how well the model learned in this particular case: even when two corresponding pixels have very similar values, they are counted as a misprediction if they are not identical, although such a prediction might be acceptable.

5 Conclusions

To conclude, the results achieved in this work open the path to a yet unexplored possibility for the spatial normalization of brain MRI studies. Although the model is not yet able to compete with the already existing tools in performing the full normalization, the shape was predicted correctly, with a Dice Loss of 0.00313 at the final stage of training. This means that 99.687% of the predicted output overlapped with the expected image, i.e. an almost perfect shape was predicted. The model was also able to outperform existing tools in the time spent normalizing: although the training process took about 6 h on the high-performance GPU, the model performed the prediction in an average of 8 s, instead of the more than an hour that FSL took with the same MRI study. This is an advantage of the Deep Learning approach, as the slow training needs to be done only once, after which predictions can be made quickly as many times as needed, whereas the existing tools always spend a lot of time performing the process. With some further modifications, the model could achieve even better results. For example, it could be adapted to predict the warp matrix (like the one generated by FSL) instead of a fully normalized image; the matrix could then be used in FSL to quickly perform the normalization, as it contains all the information needed to distort the original image. Another possible strategy would be to use a 3-dimensional model instead of the 2-dimensional one used here, by adapting the convolutional layers of the model to perform 3-dimensional convolutions. Although this is a computationally harder approach, it would probably achieve better results.

Acknowledgements. This work has been supported by FCT – Fundação para a Ciência e
Tecnologia within the R&D Units Project Scope: UIDB/00319/2020. We gratefully acknowledge
the support of the NVIDIA Corporation with their donation of a Quadro P6000 board used in this
research.

References
1. James, A.P., Dasarathy, B.V.: Medical image fusion: a survey of the state of the art. Inf.
Fusion 19(1), 4–19 (2014)
2. Poldrack, R., Mumford, J., Nichols, T.: Spatial normalization. In: Handbook of
Functional MRI Data Analysis, pp. 53–69. Cambridge University Press (2011)
3. FSL Wiki page. https://fsl.fmrib.ox.ac.uk/fsl/fslwiki. Accessed 18 Nov 2019
4. BET. https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/BET. Accessed 18 Nov 2019

5. Smith, S.M.: Fast robust automated brain extraction. Hum. Brain Mapp. 17(3), 143–155
(2002)
6. FLIRT. https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FLIRT. Accessed 18 Nov 2019
7. FNIRT. https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FNIRT. Accessed 18 Nov 2019
8. Chollet, F.: Deep Learning with Python. Manning Publications Co. (2018)
9. Buduma, N.: Fundamentals of Deep Learning: Designing Next-Generation Machine
Intelligence Algorithms, vol. 44, no. 5 (2017)
10. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)
11. Altaf, F., Islam, S.M.S., Akhtar, N., Janjua, N.K.: Going deep in medical image analysis:
concepts, methods, challenges and future directions (2019)
12. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with
region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39, 1137–1149 (2017)
13. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmen-
tation. IEEE Trans. Pattern Anal. Mach. Intell. 39, 640–651 (2017)
14. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image
recognition. In: ICLR, vol. 6 (2015)
15. Park, S., Lee, S.M., Lee, K.H., et al.: Deep learning-based detection system for multiclass
lesions on chest radiographs: comparison with observer readings. In: European Radiology,
pp. 1–10 (2019)
16. Nam, J., Park, S., Hwang, E., Lee, J., Jin, K., Lim, K., Vu, T., Sohn, J., Hwang, S., Goo, J.,
Park, C.: Development and validation of deep learning-based automatic detection algorithm
for malignant pulmonary nodules on chest radiographs. Radiology 290(1), 218–228 (2019)
17. Brant-Zawadzki, M., Gillan, G., Nitz, W.: MP RAGE: a three-dimensional, T1-weighted,
gradient-echo sequence–initial experience in the brain. Radiology 182(3), 769–775 (1992)
18. Liu, Q., Fu, M.: Dice loss function for image segmentation using dense dilation spatial
pooling network (2018)
Business Analytics for Social Healthcare
Institution

Miguel Quintal, Tiago Guimarães, Antonio Abelha, and Manuel Filipe Santos

University of Minho, Braga, Portugal
[email protected], [email protected], [email protected], [email protected]

Abstract. A Business Analytics (BA) component aims, in an early phase, to transform data into information in order to later generate knowledge. Nowadays, organizations increasingly tend to deal with large amounts of data. In the absence of a BA component responsible for the data integration and consolidation process, organizations end up missing the data's potential to become information and, therefore, useful knowledge. BA components also incorporate data visualization tools, namely dashboards, which provide a quick and intuitive data analysis that helps the process of transforming data into information. Easy access to useful information means having full support for the organization's decision-making process. This leads the organization to improve its business performance and, consequently, to gain a competitive advantage over the organizations operating in the same business sector. Therefore, it is easy to understand that the benefits of this technology extend to all business sectors, including healthcare. The goals of this article are to understand what BA is and in which ways a BA component can be explored in the healthcare business to increase the efficiency of care provision in a Social Healthcare Institution. Finally, this article also presents a BA component developed to monitor the business processes of a Social Healthcare Institution in support of the goals mentioned above.

Keywords: Business Analytics component · Social Healthcare Institution · Data visualization · Information · Decision-making process

1 Introduction

Nowadays, organizations all over the world are increasingly subject to handling large
amounts of data. Their understanding reflects directly on the organizations’ success, as
a good understanding of data turns it into useful information that allows the organi-
zations to achieve improvements in their business process such as reduced waiting
times and increased service efficiency. Since they are dealing with large amounts of
data, it is hard to complete the task of understanding it in an efficient way without using
technology. This is where Business Analytics (BA) components come into action.
The preference for a BA component for this job has to do with the efficient process of achieving knowledge through data understanding that this kind of component
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 503–509, 2020.
https://doi.org/10.1007/978-3-030-45697-9_49

performs, transforming raw data into useful information. Subsequently, this kind of
component also provides an interactive and intuitive visualization of the most relevant
information, making it much easier to analyze and understand it. This leads to better
decision-making processes, more thoughtful decisions, and efficiency on health care
provision.
In general, the literature about BA components focuses more on their usefulness in
industrial organizations. However, healthcare institutions are not far behind when it
comes to reaping the benefits of this technology to improve their business process.
Therefore, this article aims to clarify and justify the usefulness of a BA component to a
Social Healthcare Institution.

2 Background

2.1 Business Analytics


Initially, access to information was made directly through the operational systems. Data from different sources were not submitted to information integration and consolidation processes, consequently promoting the formation of information silos.
Information silos were isolated information systems unable to operate reciprocally with other information systems that were, or should have been, related. Therefore, with the existence of these "silos", it was difficult to have an integrated view of the business.
Subsequently, the concepts of Data Warehousing, Business Intelligence, Performance Management and Business Analytics (BA) emerged, putting an end to the damaging existence of information silos, or at least fostering their absence, which led to a more integrated view of the data and, consequently, of the business.
Today, distinguishing the definitions of the concepts mentioned above has proved to be an arduous task, one that has generated discussion around what really differentiates these four concepts from each other.
Eckerson (2008) defines BA as the concept that refers to the processes, technologies, and techniques that transform data into information and knowledge that drive business decisions and actions.
“The cool thing about such industry definitions is that you can reuse them every five years or
so. For example, I used this same definition to describe “Data Warehousing” in 1995,
“Business Intelligence” in 2000, and “Performance Management” in 2005” (Eckerson 2011).

Through the presented quotation, we can perceive the difficulty of distinguishing these four concepts, which revolve around the same "industry": in different chronological phases, Eckerson used the same definition to describe each of them, concluding that they all have the same purpose.
Therefore, to clarify and understand the definition of BA, alternative definitions of the concept, presented by different authors, will be given below.
From the perspective of Turban et al. (2008), BA is a component that provides analysis models and procedures for analyzing DW information to gain a competitive advantage over other organizations in the same business.

According to Schniederjans et al. (2014), BA is the process that begins with the
collection of business data from an organization, to which the main types of existing
analysis are sequentially applied, namely descriptive, predictive and prescriptive
analysis, achieving a result that supports and demonstrates business decision-making as
well as organizational performance.
As we can see, the definition of BA, like the definitions of the other concepts referred to, depends on the literature consulted. Even so, we can say that the consulted literature reaches an agreement on the definition of BA, assuming that it consists of transforming the collected data into useful information capable of supporting an organization's decision-making process, in order to boost the performance of its business and create a competitive advantage for it.

2.2 Visualization
The concept of Information Visualization is defined as “The use of computer-
supported, interactive, visual representations of abstract data to amplify cognition”
(Card et al. 1999). The main purpose of information visualization is to improve the understanding of data through graphical presentations, availing the powerful image-processing capabilities of the human brain. This technique extends working memory, reduces the search for information and enhances the recognition of patterns, thereby increasing human cognitive resources (Järvinen et al. 2009).
As previously mentioned, today, organizations are increasingly subject to handling
large amounts of data. Their understanding reflects directly on the organizations’
success, as a good understanding of data turns it into useful information, capable of
generating a competitive advantage over other organizations in the same business.
However, a good understanding of the data depends on how it is presented. This is
precisely where the concept of Visualization comes in, which bridges the gap between
data and knowledge. Human vision contains millions of photoreceptors capable of
recognizing patterns (Ware 2004), so the visual representation of data is the most
favourable technique for understanding data and acquiring knowledge.
In short, Visualization aims to improve data comprehension through the visual
representation of data supported by visual analytics technological tools, such as
dashboards, and the visual and intellectual abilities of the human being.
The dashboard is a visual analytics tool that allows the user to visualize the most
important information needed to achieve one or more objectives, consolidated and
organized in a single screen, allowing the quick monitoring of it (Few 2004).
In order to allow easy visualization and a good understanding of the data, Few
(2004) states that a dashboard must firstly present a high-level, general and simple view,
focused only on the exceptions to be reported immediately, informing the
user of what is happening without specifying why it is happening. Secondly, this high-level
view should emphasize the aspects and variables that, through their visualization,
communicate useful information for decision-making, to be further analyzed in more
detail. Finally, Few (2004) considers that the dashboard should also allow an easy drill
down into the dimensions and metrics underlying that useful information,
allowing a more detailed analysis of it.
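Few's overview-then-drill-down behaviour can be sketched with two small functions. The record shape and field names below (speciality, waitingDays) are illustrative assumptions for the example, not taken from any actual dashboard data.

```javascript
// Sketch of Few's (2004) overview-plus-drill-down idea.
// The record shape (speciality, waitingDays) is a hypothetical example.

function summarize(records, dimension, metric) {
  // High-level view: one aggregated value per dimension value.
  const totals = {};
  for (const r of records) {
    totals[r[dimension]] = (totals[r[dimension]] || 0) + r[metric];
  }
  return totals;
}

function drillDown(records, dimension, value) {
  // Detailed view: the records behind one aggregated cell.
  return records.filter((r) => r[dimension] === value);
}

const waiting = [
  { speciality: "Orthopaedics", waitingDays: 40 },
  { speciality: "Orthopaedics", waitingDays: 25 },
  { speciality: "Cardiology", waitingDays: 10 },
];

console.log(summarize(waiting, "speciality", "waitingDays"));
// { Orthopaedics: 65, Cardiology: 10 }
console.log(drillDown(waiting, "speciality", "Cardiology").length); // 1
```

A dashboard built on this pattern shows only the aggregated view by default and calls the drill-down function when the user selects one cell for detailed analysis.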
506 M. Quintal et al.

2.3 Business Analytics for Social Healthcare Institution


A Business Analytics (BA) component aims, in a first phase, to transform data into
information to later generate knowledge. As stated earlier, organizations today handle
large amounts of data. In the absence of a BA component responsible for the data
integration and consolidation process, organizations end up disregarding the data that,
by going through this process, could turn into information and, consequently, into
useful knowledge for the organization’s business process.
As stated before, in general, the literature about BA components focuses more on
their usefulness in industrial organizations. However, healthcare institutions are not far
behind when it comes to reaping the benefits of this technology to improve their
business process.
According to the Portuguese Directorate-General of National Health, having real-time
access to useful healthcare information enables patient prioritization, adjustment
of resources and services, continuous monitoring and evaluation, minimization of errors,
predictive analysis and knowledge generation (DGS 2017).
The Brazilian Computer Department of the Unified Health System also adds that
knowledge of the health situation is essential for healthcare management. This infor-
mation, if easily accessible and available with quality, becomes a great aid to decision-
making in any sector of activity (DATASUS 2004).
In short, the components of BA are directly related to information management,
whether from industrial organizations, healthcare institutions or other sectors of
activity.
The use of this type of components in healthcare institutions has been increasing in
a significant way over the last years. The main factors that justify this phenomenon are
related to the institutions’ increasing need to improve resource management and cost
structure; improve the quality of health care provided; improve satisfaction perceived
by patients; comply with legal regulations underlying their business process; and attract
and retain talent from professionals involved in health care (Quintela 2013).
Thus, the BA components, in addition to providing more accurate decision-making
information, also appear as a solution for healthcare institutions to accomplish cost
reduction, increase the quality of healthcare services provided and assure their future.

3 Case Study

Within the scope of a Social Healthcare Institution, and in order to certify its
Quality Management System with the ISO 9001 standard, the need arose to develop a
Business Analytics (BA) solution focused on quality management that allows efficiency
in health care provision, in order to strengthen a set of organizational principles
required by ISO 9001. We can identify as examples of these principles the customer
focus and the use of tools that allow top managers to execute the organization’s
processes with efficiency in order to make appropriate decisions that promote the
continuous improvement of the organization.

So, to achieve the BA solution’s requirements mentioned above, it was necessary to
study the business processes of the healthcare institution as well as the ISO 9001
standard, to perceive the institution’s goals, to apply the Key Performance
Indicators (KPIs) methodology, and therefore to understand which business indicators
are relevant to the healthcare institution’s business.
The team responsible for the development of the KPIs for this project is made up
of members of the University of Minho assigned to the project and members of
the social healthcare institution, notably its top managers.
The initial proposals for KPIs will, of course, be imperfect. Therefore, the team
needs to understand what ISO 9001 is and what role it plays in a social health insti-
tution in order to develop relevant KPIs in this business area.
After some meetings and discussions, the KPIs development team concluded that
clients and services are the key areas of the institution to be studied.
According to the ISO 9001 quality standard, customer focus should be a primary
interest of an organization. The key requirements of this principle include understanding
stakeholder needs and expectations; understanding quality objectives and
planning to achieve them; and customer satisfaction. The continuous improvement at
different levels, such as the products/services provided, activities and organizational
processes, is another one of the relevant principles for this institution. Key require-
ments of this principle include, for example, actions to address risks and opportunities,
monitoring, measurement, analysis, evaluation and improvement.
Thus, the team assumes that this institution has the following “business” objectives:
• Improve user satisfaction;
• Improve the quality of the provided services (continuous improvement);
• Improve/Optimize organizational activities.
Thus, a set of dashboards was developed to be part of the BA component to
support the monitoring of the identified indicators that expose the business performance
of the social healthcare institution.
After some meetings with the social health institution managers, the indicators and
dashboards were redefined according to the institution’s managers’ preferences. This led
us to an agreement about what should be displayed on the BA component dashboards,
and therefore it led us to the final solution of the BA component.
The following pictures will show two out of the eleven developed dashboards. The
first one focuses on the surgery waiting list data. By looking at it, we can view the
number of operated patients and the number of patients on the waiting list. We can also
filter this same information by time frame, medical speciality, situation (surgery status)
and surgery type (Fig. 1).

Fig. 1. Waiting list.

This information helps the institution to analyze the efficiency of their medical
services by monitoring the number of patients waiting for their surgery to be accom-
plished, and also to examine which of the medical specialities has the longest queues.
This allows the institution to manage speciality queues by, for example, increasing the
number of doctors in a speciality with a large number of patients on the waiting list,
which contributes to satisfying the three identified critical success factors, and con-
sequently the fulfilment of business objectives.
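As an illustration, indicators like the ones shown in Fig. 1 could be computed from patient records as sketched below; the field names and filter mechanism are assumptions made for the example, not the institution's actual schema.

```javascript
// Hypothetical computation behind a waiting-list view like Fig. 1.
// Field names (speciality, status) are illustrative assumptions.

function waitingListIndicators(patients, filters = {}) {
  // Keep only the patients matching every active filter
  // (e.g. medical speciality, surgery type, time frame).
  const matches = patients.filter((p) =>
    Object.entries(filters).every(([field, value]) => p[field] === value)
  );
  return {
    operated: matches.filter((p) => p.status === "operated").length,
    waiting: matches.filter((p) => p.status === "waiting").length,
  };
}

const patients = [
  { speciality: "Orthopaedics", status: "waiting" },
  { speciality: "Orthopaedics", status: "operated" },
  { speciality: "Cardiology", status: "waiting" },
];

console.log(waitingListIndicators(patients)); // { operated: 1, waiting: 2 }
console.log(waitingListIndicators(patients, { speciality: "Cardiology" }));
// { operated: 0, waiting: 1 }
```

Filtering by speciality in this way is what lets a manager spot the specialities with the longest queues before deciding where to reinforce staff.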
The second and last dashboard to be presented in this article offers a view of the
performed surgeries. Its consultation allows us to observe the number of operated
patients by gender, by age group and per day of the week. This same data is also
subject to filters related to the time frame, medical speciality and surgery type (Fig. 2).

Fig. 2. Performed surgeries.



Having prompt access to this information allows the institution to monitor the
number of surgeries performed over a given time period and to identify the day of the week
on which most surgeries are performed, in order to support the scheduling of surgeries and to
reduce the number of patients waiting for their surgeries to be accomplished.
Therefore, this dashboard also helps the institution to achieve its goals.

4 Conclusion

As stated before, this institution’s business goals are to improve user satisfaction,
improve the quality of the provided services (continuous improvement), and
improve/optimize organizational activities. Through the Key Performance Indicators’
Methodology, the respective critical success factors were exposed, such as reducing
waiting days for services, ensuring sufficient clinical professionals and increasing the
number of performed surgeries.
These factors led to the definition of key performance indicators that monitor their
success and, therefore, the attainment of business goals. Once understood, these
indicators supported the conception of the dashboards, which in turn support the
monitoring of the same indicators in a much more interactive and intuitive way.
This helps the institution to make better decisions promptly,
leading to the achievement of its goals. Once the institution achieves its goals, it is
ready to be certified by ISO 9001.

Acknowledgements. The work has been supported by FCT – Fundação para a Ciência e
Tecnologia within the Project Scope UID/CEC/00319/2019.

References
DATASUS: Notebook Presentation (2004). http://datasus.saude.gov.br/apresentacao-caderno
DGS: Healthcare Dashboards: Past, Present and Future. A perspective of evolution in Portugal
(2017)
Eckerson, W.: A Practical Guide to Advanced Analytics – Ebook (2011)
Few, S.: Dashboard Confusion (2004)
Järvinen, P., Puolamäki, K., Siltanen, P., Ylikerälä, M.: Visual analytics (2009)
Schniederjans, M.J., Schniederjans, D.G., Starkey, C.M.: Business Analytics – Principles,
Concepts and Applications. Pearson Education, New York (2014)
Quintela, H.: Magazine dos Sistemas de Informação na Saúde (2013)
Turban, E., Sharda, R., Aronson, J., King, D.: Business intelligence - a managerial approach
(2008)
Ware, C.: Information Visualization: Perception for Design. Morgan Kaufmann, Burlington
(2004)
Step Towards Monitoring Intelligent
Agents in Healthcare Information
Systems

Regina Sousa, Diana Ferreira, António Abelha, and José Machado(B)

ALGORITMI Research Center, School of Engineering, University of Minho,
Gualtar Campus, 4710-057 Braga, Portugal
[email protected], [email protected]

Abstract. A platform for establishing interoperability between heterogeneous
information systems implemented in a hospital environment is
more a requirement than an option. The Agency for the Integration, Dif-
fusion and Archiving of Medical and Clinical Information (AIDA) is an
interoperability platform designed specifically to address the problem of
integrating information from multiple systems and addressing interoper-
ability, confidentiality, integrity and data availability. This article focuses
on the relevance and need for such vigilance, finding and designing effec-
tive new ways to establish them. This study culminated in the creation
of AIDAMonit, a surveillance platform developed and tested by ALGO-
RITMI Center researchers, which has shown promise and is extremely
beneficial for the well-functioning of the health facilities currently using
the AIDA platform.

Keywords: Health Information Systems (HIS) · Intelligent agents ·
AIDA · Interoperability · Monitoring · AIDAMonit

1 Introduction
Thinking about today’s society, everything around it involves technology. The
idea that human beings have changed with technological evolution is somewhat
frightening, but it is perhaps the most realistic view of today. Over the
years, Information Technology (IT) has become profoundly embedded in society, to
the point that it has dramatically altered mankind’s way of thinking and living,
such that every day-to-day activity depends on the proper functioning
of technologies.
In recent years, IT has emerged in several areas, and healthcare is no exception.
IT is a very broad concept with applicability in many industries; thus,
clarifying the term is important for the acceptance of its use by institutions
and their professionals. IT is the set of all activities, solutions and
human and/or computational resources that allow the access, consultation, management
and use of information. Part of the success of IT, in the health area, has
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 510–519, 2020.
https://doi.org/10.1007/978-3-030-45697-9_50
Health Information Systems 511

to do with the correct functioning of Health Information Systems (HIS), which are
responsible for the acquisition, processing and presentation of all information
about the institution and its services. To this end, they provide tools to improve
care-giving in an efficient and sustainable way [24].
In the healthcare area, the treatment and processing of information has
changed considerably, from the traditional recording of all information on
paper to its electronic registration. Therefore, nowadays, HIS are intrinsic,
or even determinant, to the success of hospital care delivery. The use of
this type of systems is not only beneficial for patients, but also for health profes-
sionals throughout the healthcare community, since it will ease several everyday
responsibilities involved in their work [10].
It is imperative to make quick and quality decisions in the health sector
as these are almost always related to human life. Therefore, medical decision-
making needs to integrate the best available evidence with the experience of
clinical professionals and the specific values regarding the patient health status
[9,13]. Often, in the absence of timely access to high-quality information or
even when facing difficulties in constructing a functional historic process of the
patient in question, the health professionals are obligated to make decisions
based solely on their experience and intuition without considering the facts and
information required [1,9]. Obviously, without appropriate access to relevant
data, practically no decisions on procedures, diagnostic, therapy, and others can
be made without occurring medical errors or other problems, which may result
in fatal consequences for patients. This difficulty can be overcome through the
implementation of Clinical Decision Support Systems (CDSS), which are based
on medical knowledge to assist clinicians in the elaboration of diagnoses and in
the decision-making of therapies through the analysis of patient specific clinical
variables [23].
In a more technical perspective, CDSS can retrieve relevant documents, cre-
ate and send alerts or recommendations to patients or healthcare professionals,
organize and present relevant information on dashboards, diagrams, documents,
and reports, in order to ease, speed up and improve the clinical decision-making
[7,19]. Accordingly, these systems should consider information from various sys-
tems and platforms implemented in the health institution that due to their
diversity constitute another weakness. Thus, the primary objective for the solu-
tion of all these problems is the implementation of interoperability platforms
in an effective way. These platforms should be based on intelligent agents that
interact with each other and organized in robust and efficient architectures, so
that the access and interpretation of the information is almost immediate.
The remainder of the paper is organized as follows: Sect. 2 includes a brief
description of intelligent agents, from its definition to the advantages of its indi-
vidual or multi-agent use. The following section, Sect. 3, describes the AIDA
platform, its main characteristics, operation, architecture and vulnerabilities.
Subsequently, Sect. 4 explains the worth, the significance and the impact of mon-
itoring computational applications, focusing on the description of the proposed
solution, AIDAMonit, a platform for efficiently monitoring the behavior of the
512 R. Sousa et al.

intelligent agents that make up the AIDA platform. Finally, Sects. 5 and 6 dis-
closes the proof of concept and the main conclusions as well as some perspectives
for future work.

2 Intelligent Agents and Multi Agent Systems


Intelligent agents have been popular since the mid-1990s, when they were defined
as a paradigm representing “the next significant breakthrough
in software development” (Sargent 1992). Since then, the concept and application
of this invention have been extended, and it is now widely used in various fields of
science and research [12].
Agents must be capable of operating effectively in dynamic environments
where they interact and cooperate with each other. To achieve this, architectures
must allow agents to be compatible in order to communicate with other agents
and offer services to one another [6]. As the name implies, a Multi-Agent System
(MAS) consists of two or more agents that have the ability to act autonomously
at the same time as they interact with each other. In this interaction, com-
munication protocols are needed so that the agents can efficiently cooperate,
negotiate, compete and coordinate [4,11,14,18].
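A minimal sketch of two agents cooperating through message passing is shown below. The in-memory delivery mechanism and the agent names are illustrative stand-ins for the network protocols discussed above, not AIDA's actual implementation.

```javascript
// Minimal sketch of message passing between two cooperating agents.
// A shared in-memory "send" stands in for a real network protocol.

class Agent {
  constructor(name, handler) {
    this.name = name;
    this.handler = handler; // how the agent reacts to incoming messages
  }
  receive(message, reply) {
    this.handler(message, reply);
  }
}

function send(from, to, content) {
  // Deliver a message and capture the receiver's reply, if any.
  let reply = null;
  to.receive({ from: from.name, content }, (answer) => (reply = answer));
  return reply;
}

const archiver = new Agent("archiver", (msg, reply) =>
  reply(`stored: ${msg.content} (from ${msg.from})`)
);
const integrator = new Agent("integrator", () => {});

console.log(send(integrator, archiver, "lab result"));
// stored: lab result (from integrator)
```

In a real MAS each agent runs autonomously and asynchronously, so delivery would go over a network or local queue rather than a direct function call; the point here is only the request/reply pattern between peers.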
These systems aim to construct and implement robust logic models, capable
of resolving large-scale problems. Thus, and since each agent will simultaneously
have autonomous activities and asynchronously cooperate with other agents,
it is necessary to implement an architecture so that communication, over
the Internet or a local network, can exist. The architecture of these systems should be
distributed, but as simple as possible, to ensure communication and interoperability
between different applications [23]. In the hospital environment, MAS stand
out due to their ability to solve problems, since agents communicate and operate
toward common goals.
The AIDA platform was developed according to these principles; this subject
is discussed in Sect. 3.

3 AIDA
In healthcare, access to information in a fast and effective way is a determinant
factor for the reduction of medical errors and the consequent improvement of
the care provided. However, as desirable as this goal is, it has not yet been
achieved, largely due to the individuality and heterogeneity of the different health
information systems. Although these systems increase the quality of health
services, they are developed in an isolated way, failing in the capacity to interact
together effectively.
IEEE defines interoperability in healthcare as the “ability to communicate
and exchange data accurately, effectively, securely and consistently with dif-
ferent information technology systems, software applications, and networks in
various settings and exchange data such that clinical or operational purpose and

meaning of the data are preserved and unaltered” [25]. The benefits of imple-
menting interoperability in healthcare facilities and the consequent homogeneity
among HIS are countless. Such benefits include better information quality through
single patient identification, time reduction in diagnostics and appointments, since
physicians have access to relevant information whenever and wherever they need
it, a correct association between all the information systems and, consequently,
collaboration at local, regional, national and international level.
The process of implementing interoperability in health organizations is even
more difficult because each specialty has its own particularities as well as different
methods. Interoperability among systems is one common and comprehensive
interest within the entire scientific community. In recent years, the group of
Artificial Intelligence (AI) of the University of Minho dedicated itself to building
a platform to answer all these needs, the AIDA.
The Agency for Integration, Dissemination and Archiving of medical informa-
tion (AIDA) is the result of many research partnerships between the University of
Minho and several Portuguese health units, including the Centro Hospitalar Uni-
versitário do Porto (CHUP). AIDA is a complex system consisting of specialized
and straightforward intelligent agents that seek the integration, dissemination
and archiving of large volumes of data from heterogeneous sources (e.g. comput-
ers and medical devices) and the uniformity of the clinical systems [8,16,17,21].
This platform was designed to aid medical applications, their main subsystems,
and functional role, as well as to control the entire flow of information through a
network of intelligent information processing systems. AIDA uses a multi-agent
architecture of the type service-oriented architecture (SOA) to ensure interoper-
ability between various information systems [3,4,15,20]. AIDA is implemented
in five health institutions throughout Portugal and has a paramount influence
on the quality of the services provided by healthcare professionals. Accordingly,
all its components must have a form of monitoring and failure prevention, so that
AIDA is available 24 hours a day, every day of the year, to ensure efficient health
care delivery. This allows interoperability to be implemented in a distributed environment
through different types of agents that have very distinct scopes and
functions inside the platform.
The systems that constitute the platform are:

– Integrated Hospital Information System (SONHO) - Emerged in the 1990s
with the philosophy that a patient has only a unique identification
number. It is defined as an integrated hospital information system whose
main objective is to support the hospital administrative services, focusing on
the display, generation and archiving of information that will be exported for
statistical purposes [2].
– Sclinical Hospital - A system that arises from the combination of two smaller
ones, the Medical Support System (SAM) and the Nursing Practice Support
System (SAPE). This merge occurred in 2013 and resulted in an applica-
tion centered on the end-user, which can be used by all health professionals,
without any division between medical and nursing information.

– Electronic Clinical Process (PCE) - Emerged in 2007 and is a repository of
the patient’s clinical history. The records have a format already prepared to
be processed by computers using the Health Level Seven (HL7) protocol.

The structure of the messages and the type of fields contained in them are not,
by themselves, sufficient for the complete understanding of a message: to avoid
ambiguity, the meanings, the context and the relations between
the different terms must also be known and used by both parties in the communication. In
health institutions, standards are considered the main source for ensuring inter-
operability between HIS. The HL7 protocol is perhaps the most internationally
recognized and is a major contributor to interoperability in health facilities. HL7
is a set of standard formats that define a message structure to exchange infor-
mation between different heterogeneous hospital applications [5,15,22]. In short,
this is used to enable communication from application to application through
well-established messages. There are several message templates, each with its
own structure and fields. Each message consists of an accumulation of multiple
segments, each representing a logical grouping of data fields.
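To make the segment/field structure concrete, the sketch below parses a small HL7 v2-style message. The sample content is invented for illustration, and the parser is simplified (it ignores, for instance, HL7's special treatment of the MSH encoding characters).

```javascript
// Minimal parser for the HL7 v2 layout described above: one segment per
// line, fields separated by "|". The sample message is illustrative only.

function parseHL7(message) {
  return message
    .trim()
    .split(/\r\n|\r|\n/)
    .map((line) => {
      const fields = line.split("|");
      return { type: fields[0], fields: fields.slice(1) };
    });
}

const sample =
  "MSH|^~\\&|AIDA|HOSPITAL|PCE|HOSPITAL|20200101||ADT^A01|0001|P|2.4\r" +
  "PID|1||12345||Silva^Maria";

const segments = parseHL7(sample);
console.log(segments.map((s) => s.type)); // [ 'MSH', 'PID' ]
console.log(segments[1].fields[2]); // 12345
```

Each parsed segment exposes its type (MSH, PID, …) and its ordered fields, which is exactly the "logical grouping of data fields" the standard relies on for unambiguous exchange.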
The security of the AIDA platform is fundamental because it is a platform
associated with healthcare, and, consequently, must be available 24 hours a day,
every day of the year. Currently, it is installed in five Portuguese hospitals,
including the CHUP, and even a short period of shutdown can bring serious
and devastating consequences to the health organization, either directly in the
management of resources, and/or indirectly in the quality of the services pro-
vided and consequently in the health status of the different patients. Therefore,
the prevention of failures as well as the monitoring of the AIDA platform is
indispensable and of extreme value to the health institutions.

4 Monitoring System - Platform AIDAMonit

The concept of monitoring, according to the Portuguese dictionary, presupposes
to “accompany through a monitor”, “supervise”, “evaluate” or even “look
carefully”. However, for computer systems, monitoring presupposes continuous
supervision. Regarding intelligent agents, that supervision shall ensure that
agents are active and perform their duties efficiently. Efficient monitoring of a
given system is only possible if the knowledge about it is consistent and complete.
Thus, to achieve successful monitoring, it is necessary to conduct a careful study
of the system to identify its greatest vulnerabilities and even the most likely fail-
ures to occur. With this identification it is possible to develop procedures to
prevent the occurrence of failures.
Concentrating on the main objective, that is, the prevention of system fail-
ures, the first step is, in fact, an efficient monitoring process. This process should
collect all data relevant to the knowledge of the system and ultimately the pre-
vention of failures. In addition, there may be a middle layer between system
monitoring and fault prevention to predict failure. To obtain and process infor-
mation, the monitoring cycle should move from collecting and storing data in the
database to processing and discarding information on a platform or something
similar.
Over the years, several systems have been developed to monitor databases,
machines and even intelligent agents of the AIDA platform. But a large part
does not perform continuous monitoring. Instead, there is periodic monitoring,
for example, every 10 min. Therefore, considering that monitoring can directly
improve resource management and indirectly patient care, it would be highly
beneficial to perform it in real time. To this end, the authors have developed a
platform for the monitoring of intelligent agents, in real time and continuously.
The main purposes are to monitor the activity of the various agents that con-
stitute the AIDA platform, to quickly detect errors and inconsistencies, as well
as to identify agents that take longer than usual to perform their function. In
addition, the platform allows monitoring individually (by agent) or collectively
(by server), and also presents various statistical data that help users understand
how to improve agent performance.
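One simple way to flag agents that "take longer than usual" is to compare each agent's latest run time against its own history. The data shape, agent names and factor-of-2 threshold below are illustrative choices for the sketch, not a documented AIDAMonit rule.

```javascript
// Illustrative check for agents running slower than their own history.
// runs maps an agent name to its run durations (seconds); the last entry
// is the latest run. The factor-of-2 threshold is an arbitrary example.

function slowAgents(runs, factor = 2) {
  const flagged = [];
  for (const [agent, durations] of Object.entries(runs)) {
    const history = durations.slice(0, -1);
    if (history.length === 0) continue; // no baseline yet
    const latest = durations[durations.length - 1];
    const mean = history.reduce((sum, d) => sum + d, 0) / history.length;
    if (latest > factor * mean) flagged.push(agent);
  }
  return flagged;
}

const runs = {
  hl7Sender: [2, 3, 2, 9],   // latest run far above its own mean -> flagged
  silArchiver: [5, 4, 5, 6], // latest run close to its mean -> normal
};

console.log(slowAgents(runs)); // [ 'hl7Sender' ]
```

A production version would likely use a more robust statistic (median, percentile) and stream durations continuously, but the per-agent baseline comparison is the core idea.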
The developed monitoring platform features a three-tier architecture, as
shown in Fig. 1: the database (where the necessary information is stored), a
RESTful API web service programmed in JavaScript using the Node.js runtime,
and a browser-accessible client interface developed in ReactJS, a JavaScript
library. The process starts with a request by the client (User Interface);
depending on this request, a connection to the database is made. After the
connection succeeds, SQL queries can be made to select the desired data.
Finally, the web service transforms the data obtained into JSON format and
sends it to the client.

Fig. 1. Web application architecture.

This architecture results in a platform composed of a side menu that gives
access to two different modules: SIL and HL7, which are services of the AIDA
platform. Figure 2 shows the general layout of the web platform, in which each

of the services has a division of four sub-modules - Panels, Tables, Agents and
Servers. Each module is essential for the proper monitoring, detection and cor-
rection of any errors that may occur in the agent’s software, improving the
provision of hospital services, as well as facilitating the daily work of the pro-
fessionals who have the task of monitoring the continuous behavior of AIDA’s
intelligent agents.

Fig. 2. Global view of the web application interface.

The development of AIDAMonit culminates in the construction of a work
environment, full of interactive panels, charts, tables and statistics, so that the
hospital system information team can accurately consult and manage the activi-
ties of the intelligent agents that make up the AIDA platform. Perhaps even more
important, this team will be able to observe the occurrence of errors in a timely
manner, devise and implement ways to resolve them quickly, and anticipate
their occurrence in order to increase the performance of this technology.
The modules of both services are quite similar, with only distinct sources
of information. Therefore, in the dashboards module all the information prop-
erly processed, in the form of graphs, is concentrated so that the analysis is
quick and effective. The other modules contain detailed information available
for consultation.

5 Proof of Concept
Any research project that is aimed at its implementation must pass a proof of
concept where questions such as “Is this technology needed?” and “Who will
use this technology?” are paramount to its success.

Therefore, the platform was submitted to a proof of concept called SWOT
analysis. As the name implies, this test evaluates Strengths, Weaknesses, Opportunities
and Threats to prove both the viability and usability of the developed
solution. Thus, the presented technique evaluates both the internal environment
and the external environment of a solution or even of the organization.
Table 1 presents the SWOT analysis performed for the developed platform.

Table 1. SWOT analysis

Parameter Analysis
Strengths - Centralization of information
- Fast error detection
- Centralized activity history
- High usability
- Easy maintenance
- High scalability
- Ease of adaptation and evolution
Weaknesses - Dependency on CHUP’s internal network
- Complexity in historical research, namely in dates
- Delay in executing complex requests
Opportunities - Construction of indicators that allow detection of error patterns
- Direct connection to agent software for possible error correction
- Imminent need for smart agent monitoring
- Improve the quality and effectiveness of CHUP services
Threats - Modification of the structures or databases that feed the application
- User rejection of new technologies
- Internet network connectivity issues
- Competition with new technological innovations that may appear

6 Conclusions

The implementation of interoperability in healthcare institutions is quite
challenging due to the heterogeneity of the tasks, activities, information systems
and health professionals involved, the diversity of organizational structures, and
the complexity and difficulties in adopting and managing changes in hospital
settings. AIDA was developed to allow the dissemination and integration of
information generated in the healthcare environment and is currently being used
by several Portuguese hospitals, including CHUP. The institutions that use the
AIDA platform have already tackled many of their interoperability problems,
since this is a platform responsible for interconnecting the services 24 hours
a day, every day of the year. Thus, the occurrence of a failure in any of the
agents that make up the AIDA platform brings severe costs to the institution
and eventually to patient’s life. Hence, the AIDA platform needs parallel systems
518 R. Sousa et al.

to predict and avoid failures as well as to monitor the activity of the intelligent
agents.
In this sense, the developed platform monitors the behavior of the intelligent agents that constitute the AIDA platform. The monitoring platform responds to several requirements, such as:
– Real-time monitoring of intelligent agents (individually and collectively);
– Exhibition of statistical metrics for consultation and knowledge construction;
– Consultation of past events using date filters;
– Extraction of relevant insights about agent’s behavior through charts and
dashboards;
– Identification of root causes of poor performance, errors and inconsistencies.
With this platform, managers will be able to ensure the proper functioning of the intelligent agents that make up AIDA and, consequently, ensure excellence in the provision of healthcare to the patient. ReactJS, a JavaScript library for building user interfaces, was chosen to give body and shape to the platform: it is a modern and powerful tool that has been taking over frontend development thanks to its fast rendering, enabled by a virtual DOM, and its ability to reuse and combine components. The backend of the platform is written in NodeJS and ensures the connection between the Oracle database and the interface.

Acknowledgments. This work has been supported by FCT - Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2019 and DSAIPA/DS/0084/2018.

References
1. Brandão, A., Pereira, E., Esteves, M., Portela, F., Santos, M., Abelha, A.,
Machado, J.: A benchmarking analysis of open-source business intelligence tools
in healthcare environments. Information 7, 57 (2016)
2. Cardoso, L.: Desenvolvimento de uma Plataforma baseada em Agentes para a
Interoperabilidade (2013)
3. Cardoso, L., Martins, F., Portela, F., Santos, M., Abelha, A., Machado, J.: A multi-
agent platform for hospital interoperability. In: Ambient Intelligence - Software and
Applications Advances in Intelligent Systems and Computing, pp. 127–134 (2014)
4. Cardoso, L., Martins, F., Portela, F., Santos, M., Abelha, A., Machado, J.: The
next generation of interoperability agents in healthcare. Int. J. Environ. Res. Public
Health 11, 5349–5371 (2014)
5. Cardoso, L., Martins, F., Quintas, C., Portela, F., Santos, M., Abelha, A.,
Machado, J.: Interoperability in healthcare. In: Cloud Computing Applications
for Quality Health Care Delivery Advances in Healthcare Information Systems
and Administration, pp. 689–714 (2014)
6. Cardoso, L., Martins, F., Quintas, C., Portela, F., Santos, M., Abelha, A.,
Machado, J.: Interoperability in healthcare. In: Health Care Delivery and Clin-
ical Science, pp. 689–714 (2018)

7. Castaneda, C., Nalley, K., Mannion, C., Bhattacharyya, P., Blake, P., Pecora, A.,
Goy, A., Suh, K.S.: Clinical decision support systems for improving diagnostic
accuracy and achieving precision medicine. J. Clin. Bioinform. 5, 4 (2015)
8. Duarte, J., Salazar, M., Quintas, C., Santos, M., Neves, J., Abelha, A., Machado,
J.: Data quality evaluation of electronic health records in the hospital admission
process. In: 2010 IEEE/ACIS 9th International Conference on Computer and Infor-
mation Science (2010)
9. Foshay, N., Kuziemsky, C.: Towards an implementation framework for business
intelligence in healthcare. Int. J. Inf. Manag. 34, 20–27 (2014)
10. Haux, R.: Health information systems - past, present, future. Int. J. Med. Inform.
75, 268–281 (2006)
11. Isern, D., Sanchez, D., Moreno, A.: Agents applied in health care: a review. Int. J.
Med. Inform. 79, 145–166 (2010)
12. Jennings, N.R., Wooldridge, M.: Applications of intelligent agents. In: Jennings,
N.R., Wooldridge, M.J. (eds.) Agent Technology. Springer, Heidelberg (1998)
13. Lenz, R., Reichert, M.: IT support for healthcare processes - premises, challenges,
perspectives. Data Knowl. Eng. 61, 39–58 (2007)
14. Machado, J., Abelha, A., Neves, J., Santos, M.: Ambient intelligence in medicine.
In: 2006 IEEE Biomedical Circuits and Systems Conference, pp. 95-97 (2006)
15. Machado, J., Abelha, A., Novais, P., Neves, J., Neves, J.: Quality of service in
healthcare units. Int. J. Comput. Aided Eng. Technol. 2, 436 (2010)
16. Machado, J.M., Miranda, M., Gonçalves, P., Abelha, A., Neves, J., Marques, J.A.:
AIDATrace - Interoperation Platform for Active Monitoring in Healthcare Envi-
ronments. ISC, Eurosis (2010)
17. Martins, F., Cardoso, L., Esteves, M., Machado, J., Abelha, A.: An agent-based
RFID monitoring system for healthcare. In: Advances in Intelligent Systems and
Computing, pp. 407–416 (2017)
18. Miranda, M., Pontes, G., Abelha, A., Neves, J., Machado, J.: Agent based inter-
operability in hospital information systems. In: 2012 5th International Conference
on Biomedical Engineering and Informatics (2012)
19. Musen, M.A., Middleton, B., Greenes, R.A.: Clinical decision-support systems. In:
Biomedical Informatics, pp. 643–674 (2014)
20. Peixoto, H., Santos, M., Abelha, A., Machado, J.: Intelligence in interoperability
with AIDA. In: Lecture Notes in Computer Science, pp. 264–273 (2012)
21. Pontes, G., Portela, C., Rodrigues, R., Santos, M., Neves, J., Abelha, A., Machado,
J.: Modeling intelligent agents to integrate a patient monitoring system. In: Trends
in Practical Applications of Agents and Multiagent Systems, pp. 139–146 (2013)
22. Rodrigues, R., Gonçalves, P., Miranda, M., Portela, F., Santos, M., Neves, J.,
Abelha, A., Machado, J.: Monitoring intelligent system for the Intensive Care Unit
using RFID and multi-agent systems. In: 2012 IEEE International Conference on
Industrial Engineering and Engineering Management (2012)
23. Shojania, K.G., Duncan, B.W., McDonald, K.M., Wachter, R.M., Markowitz, A.J.:
Making health care safer: a critical analysis of patient safety practices. Evid. Rep.
Technol. Assess. (Summ.) 43, 668 (2001)
24. Taylor, S., Todd, P.A.: Understanding information technology usage: a test of
competing models. Inf. Syst. Res. 6, 144–176 (1995)
25. Tolk, A.: Interoperability, composability, and their implications for distributed sim-
ulation: towards mathematical foundations of simulation interoperability. In: 2013
IEEE/ACM 17th International Symposium on Distributed Simulation and Real
Time Applications (2013)
Network Modeling, Learning and
Analysis
A Comparative Study of Representation
Learning Techniques for Dynamic
Networks

Carlos Ortega Vázquez1(B), Sandra Mitrović1, Jochen De Weerdt1,
and Seppe vanden Broucke1,2

1 Research Center for Information Systems Engineering (LIRIS), KU Leuven,
Leuven, Belgium
[email protected]
2 Department of Business Informatics and Operations Management,
Ghent University, Ghent, Belgium

Abstract. Representation learning in dynamic networks has gained increasing attention due to its promising applicability. In the literature, two popular approaches have been adapted to dynamic networks: random-walk based techniques and graph-autoencoders. Despite their popularity, no work has compared them on well-known datasets. We fill this gap by evaluating the techniques in two link prediction settings. We find that standard node2vec, a random-walk method, outperforms the graph-autoencoders.

Keywords: Dynamic networks · Representation learning

1 Introduction

Network analysis has gained increasing attention in both academia and industry because it offers a framework for analyzing interrelationships within natural structures: applications include churn prediction [7,20], crime detection [26,27], and recommendation systems [14]. However, network analysis traditionally requires extensive preprocessing: data analysts have relied on handmade feature engineering based on expert knowledge or summary statistics (e.g. clustering coefficients) [17]. Despite its popularity, ad-hoc feature engineering lacks flexibility and requires extensive domain knowledge [12,14,20].
One response to traditional feature engineering is representation learning (RL), sometimes referred to as feature learning. RL aims at finding a low-dimensional representation, or embedding, of the data so that further downstream tasks become more automatic [3]. However, most early RL techniques can only handle static networks [12,16,22,28]. In contrast, real-world networks display dynamic processes that change their topological structure [25]. Recent techniques for RL in dynamic graphs have relied on random walks [8,21,24], autoencoders
c The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 523–530, 2020.
https://doi.org/10.1007/978-3-030-45697-9_51
524 C. Ortega Vázquez et al.

[10,11], or, to a lesser extent, matrix factorization [18]. However, to the best of our knowledge, no work has compared graph-autoencoder and random-walk techniques. Furthermore, no work has compared RL techniques for link prediction in both the interpolation (i.e. finding missing links) and extrapolation (i.e. predicting future networks) settings.
Our contribution to RL in dynamic networks is twofold. First, we propose an experimental setup that evaluates how effectively RL techniques recover missing links (the interpolation setting) and predict future versions of graphs (the extrapolation setting). Second, we use a Bayesian approach to word2vec [1] for representation learning in graphs; a Bayesian approach to RL provides additional insights due to its probabilistic nature [2]. We evaluate the RL techniques on two datasets that are well known in the literature on dynamic networks: Facebook forum and Enron.

2 Related Work

The literature on RL for graphs has diversified into several lines of research [6,29]: for transductive network tasks, dynamic RL exploits the topological evolution [10,20,24], while inductive RL leverages extra information for unseen nodes [13,27]. RL techniques for dynamic networks can also be categorized by time granularity [9]: some methods handle discrete time, others continuous-time evolution [21]. We focus on the former, as more work has been developed along that line of research.
We consider two main types of RL techniques in dynamic networks that are relevant for this study: random-walk and graph-autoencoder approaches. On the one hand, random-walk techniques, related to shallow embedding methods [13], learn the network embedding from the nodes that co-occur on random walks. One strong merit of random-walk techniques is the use of a stochastic similarity measure (e.g. co-occurrence in random walks), which leads to lower complexity compared to deep learning approaches; however, these techniques require fine-tuning of the random walks. On the other hand, graph-autoencoder techniques leverage the adjacency matrix, capturing non-linear relationships in the node neighbourhood [28]: similar neighbourhoods lead to similar embeddings. Compared to the random-walk approach, graph-autoencoders can reconstruct the whole graph, since they learn from its adjacency representation; this dependence on the adjacency matrix, however, constrains their applicability to large real-world networks [20].
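The biased walk at the heart of node2vec-style sampling can be sketched as follows (a minimal Python illustration under our own naming and on a toy graph; production implementations additionally precompute alias tables for efficiency):

```python
import random

def biased_walk(adj, start, length, p=1.0, q=1.0):
    """One node2vec-style second-order random walk on an unweighted graph.

    `adj` maps each node to a set of neighbours; `p` (return) and `q`
    (in-out) bias the walk as in Grover & Leskovec [12].
    """
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = list(adj[cur])
        if not nbrs:          # dead end: stop early
            break
        if len(walk) == 1:    # first step is an unbiased choice
            walk.append(random.choice(nbrs))
            continue
        prev = walk[-2]
        # Unnormalised transition weights: 1/p to return to `prev`,
        # 1 to stay at distance 1 from `prev`, 1/q to move further away.
        weights = [1.0 / p if n == prev
                   else 1.0 if n in adj[prev]
                   else 1.0 / q
                   for n in nbrs]
        walk.append(random.choices(nbrs, weights=weights)[0])
    return walk

# Toy graph: a triangle plus one pendant node.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
walk = biased_walk(adj, start=0, length=10, p=0.25, q=2.0)
print(len(walk), walk[0])  # 10 0
```

Co-occurrences along many such walks then play the role that word co-occurrences play in word2vec.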
Both approaches originally handle static networks [12,28], so recent works have developed extensions for dynamic networks. Most techniques depend either on reusing parameters in each time-step [10,11,19] or on aligning the static embeddings [8,24]; however, these efforts lack theoretical foundation. The Bayesian framework considers a prior probability distribution that allows a smoother drift of the network embedding across time. Bayesian word embeddings have been explored in NLP [1,2,5]: they offer robustness to noise across time slices and uncertainty measurement via density. Despite its
Representation Learning in Dynamic Networks 525

theoretical benefits, no work in RL for networks has explored this direction. We implement the approach taken in [1] as a first step towards filling this gap for Bayesian RL techniques in dynamic networks.

3 Methodology and Experimental Setup


Current literature considers both the interpolation and extrapolation settings, although not to the same extent. Most of the literature focuses on the interpolation setting, also known as the completion problem [9]. The completion problem for missing links in a graph is posed as follows: given G = {G1, ..., GT}, we want to predict missing links within these graphs. RL techniques for static graphs mostly rely on the single time-step version of the interpolation setup [12]; however, interpolation can also be used for dynamic graphs [8]. In contrast, extrapolation requires predicting links at time-steps beyond GT; predicting a graph as a set of links can thus be considered an extrapolation setting. Few works address the extrapolation setup [10,11,30], since it is more challenging. In this work, we consider both the interpolation and extrapolation settings.
The literature lacks a comparison of RL techniques for dynamic networks that are based on orthogonally different approaches. In this study, we compare two such approaches: random-walk and graph-autoencoder techniques. From the random-walk side, we use standard node2vec [12] and a network adaptation of dynamic Bayesian word embeddings [1]. From the graph-autoencoder side, we consider DynGEM [11] and dyngraph2vecAE (dynae) [10]. Both techniques generate embeddings from the bottleneck layer of the autoencoder and optimize the embedding to preserve local and global structure; the relevant hyperparameters representing the trade-off between preserving local and global structure are α and β. The key difference between these graph-autoencoders is that DynGEM exploits the adjacency matrix from the previous time-step only, while dynae can leverage several previous time-steps.
Our adaptation of Bayesian embeddings (dynbae) takes random walks, instead of text, as input, so we sample the network as in node2vec; the same sampling parameters apply (the return parameter p and the in-out parameter q). This dynamic approach is based on the earlier Bayesian Skip-gram model [2], with a Kalman filter [15] added to model dynamic processes. The Bayesian skip-gram model uses a Gaussian prior to formulate the posterior probability of the embedding; the posterior is computed through Variational Bayes [4], which differs from the standard Skip-gram optimization. As a result, the technique maps nodes to probability densities instead of point estimates.
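A schematic, one-dimensional illustration of this Kalman-style drift prior follows. This is our own simplification: the variational update over random-walk co-occurrences is replaced by a single conjugate Gaussian "observation" update, so the sketch conveys only the mechanics of the prior drift, not the actual model:

```python
def kalman_prior(prev_mean, prev_var, drift_var):
    """Diffusion prior for the next time slice: the node keeps its
    previous posterior mean, but uncertainty grows by `drift_var`
    (the random-walk transition of a scalar Kalman filter)."""
    return prev_mean, prev_var + drift_var

def gaussian_update(prior_mean, prior_var, obs, obs_var):
    """Conjugate Gaussian update of one embedding dimension given a
    noisy 'observation' (a stand-in for the variational likelihood)."""
    k = prior_var / (prior_var + obs_var)  # Kalman gain
    mean = prior_mean + k * (obs - prior_mean)
    var = (1.0 - k) * prior_var
    return mean, var

# One embedding dimension of one node, tracked across two snapshots:
# the prior carries the old posterior forward with extra uncertainty.
mean, var = 0.0, 1.0
mean, var = gaussian_update(*kalman_prior(mean, var, drift_var=0.1),
                            obs=1.0, obs_var=1.0)
print(round(mean, 3), round(var, 3))  # 0.524 0.524
```

The growing prior variance is what lets the embedding drift smoothly between snapshots while retaining a density (mean and variance) per node rather than a point estimate.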
Two network datasets are used in the experimental setup; both are available on the Network Repository (http://networkrepository.com) [23], which provides open access to a range of dynamic networks. Table 1 presents descriptive statistics for each dataset. The Facebook forum dataset is the larger of the two, with a high average degree of 15.68 but a low density of 0.01746. The Enron dataset is denser (0.0818) and shows a much higher average clustering coefficient, but contains substantially fewer nodes and edges.

Table 1. Descriptive statistics

                                 Facebook forum   Enron
Number of nodes                  899              135
Number of edges                  7046             135
Average degree                   15.68            10.96
Average clustering coef.         0.0637           0.4889
Number of connected components   1                3
Degree assortativity coef.       −0.1083          −0.1490
Density                          0.01746          0.0818
Number of snapshots              6                8
Size interval of snapshots       Month            2 Months
First, we split the datasets into snapshots; the choice of time-frame size follows [8]. Each snapshot is an edge list representing a network. In both the interpolation and extrapolation settings, the test snapshot derives from the last time-step GT. For the interpolation approach, we randomly divide the edges of GT into two sets for the downstream task: 70% of the edges are used for training and 30% for testing. For the extrapolation approach, the whole of GT is used for evaluation. Additionally, we sample as many non-edges as there are edges, so the datasets are balanced in all snapshots. We use only previously seen training nodes, extracting from GT a subgraph that complies with this requirement. All training snapshots share information on the training nodes even if a node is not active in a particular snapshot (i.e. has no links to others). We follow the approach in [12] to obtain edge features from the node embeddings: the Hadamard operator combines a pair of node embeddings into one vector representing the edge. The result is a matrix of edge features: in the extrapolation setting, the embeddings of each snapshot are stacked vertically, while in the interpolation setting they are stacked horizontally. A classifier then learns from the edge features to predict, on the corresponding test set, whether two nodes share an edge. Three classifiers that are well known in link prediction are used: Logistic Regression, Random Forest, and Gradient Boosting.
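The balanced edge-feature construction described above can be sketched as follows (a toy Python illustration; the embeddings, sizes, and sampled pairs here are hypothetical stand-ins, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical node embeddings: 100 nodes, 16 dimensions.
emb = rng.normal(size=(100, 16))

def edge_features(emb, pairs):
    """Hadamard operator: the elementwise product of the two endpoint
    embeddings yields one feature vector per candidate edge."""
    pairs = np.asarray(pairs)
    return emb[pairs[:, 0]] * emb[pairs[:, 1]]

# Balanced training set: as many sampled non-edges as observed edges,
# so the downstream classifier (logistic regression, random forest,
# gradient boosting, ...) sees a 50/50 link/no-link task.
edges = rng.integers(0, 100, size=(200, 2))
non_edges = rng.integers(0, 100, size=(200, 2))
X = np.vstack([edge_features(emb, edges), edge_features(emb, non_edges)])
y = np.concatenate([np.ones(200), np.zeros(200)])
print(X.shape, int(y.sum()))  # (400, 16) 200
```

Any scikit-learn-style classifier can then be fit on `(X, y)` and scored on the held-out test edges.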
We tune the hyperparameters of the RL techniques on the training snapshots based on link prediction performance. For the random-walk techniques, the grid search is as follows: p ∈ {0.25, 0.5, 0.75, 1}, q ∈ {0.1, 0.5, 1, 2, 5, 10, 100}. For the graph-autoencoders, the grid search is as follows: α ∈ {10−6, 10−5} and β ∈ {2, 5}. All experiments, including the data, can be found at https://github.com/CarlosOrtegaV/dyn-bae.
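The grid search over these hyperparameters amounts to exhaustively scoring every combination on the training snapshots; a minimal sketch, where `evaluate` is a hypothetical stand-in for fitting a model and measuring validation AUC:

```python
from itertools import product

# The random-walk grids quoted above (return parameter p, in-out parameter q).
p_grid = [0.25, 0.5, 0.75, 1]
q_grid = [0.1, 0.5, 1, 2, 5, 10, 100]

def evaluate(p, q):
    # Hypothetical scoring function; in practice: run the walks, learn
    # the embedding, and return validation link-prediction AUC.
    return -abs(p - 0.5) - abs(q - 1)

# Exhaustively score every (p, q) pair and keep the best one.
best = max(product(p_grid, q_grid), key=lambda pq: evaluate(*pq))
print(len(p_grid) * len(q_grid), best)  # 28 (0.5, 1)
```

The same loop applies to the autoencoder grid over α and β.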

4 Results
Tables 2 and 3 contain, for each classifier and RL technique, the highest Area Under the Receiver Operating Characteristic Curve (AUC) score across hyperparameter combinations, with its corresponding Average Precision (AP). We observe that the extrapolation setting poses the more challenging task, since the RL techniques achieve lower AUC scores than in the interpolation setting; the Facebook forum dataset also yields lower AUC scores because of its higher number of nodes and edges. Node2vec consistently outperforms all other RL techniques in both the interpolation and extrapolation settings. Despite its simpler structure compared to dyngraph2vecAE (dynae), DynGEM reaches second place among the RL techniques. The dynamic Bayesian node2vec (dynbae) scores low compared to standard node2vec. Figures 1 and 2 display the variability of the AUC scores across hyperparameters for two classifiers on the Facebook forum dataset. Interestingly, the graph-autoencoder techniques show higher variability in the extrapolation setting than in the interpolation setting.
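For reference, the AUC reported in these tables can be computed directly as the Mann-Whitney rank statistic; a self-contained sketch (our own helper for illustration, not the authors' evaluation code, which presumably uses a standard library):

```python
def auc_score(labels, scores):
    """AUC as the probability that a randomly chosen positive outranks a
    randomly chosen negative (Mann-Whitney U statistic); the O(n^2) pair
    loop is fine for small test sets like these."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Two positives, two negatives; one positive is outranked by a negative.
print(auc_score([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2]))  # 0.75
```

Average precision, by contrast, summarises the precision-recall curve and is more sensitive to the ranking of the top-scored edges.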

Table 2. The AUC score & average precision in interpolation setting

Dataset Classifier node2vec DynGEM dynae dynbae
AUC AP AUC AP AUC AP AUC AP
Facebook forum Logistic Reg. 0.851 0.865 0.754 0.728 0.585 0.580 0.672 0.672
Random Forest 0.883 0.891 0.746 0.761 0.604 0.582 0.713 0.667
Gradient Boosting 0.832 0.844 0.724 0.749 0.591 0.612 0.714 0.701
Enron employees Logistic Reg. 0.880 0.890 0.870 0.850 0.615 0.599 0.653 0.662
Random Forest 0.866 0.841 0.823 0.767 0.649 0.602 0.620 0.582
Gradient Boosting 0.867 0.850 0.818 0.770 0.655 0.607 0.641 0.630

Table 3. The AUC score & average precision in extrapolation setting

Dataset Classifier node2vec DynGEM dynae dynbae
AUC AP AUC AP AUC AP AUC AP
Facebook forum Logistic Reg. 0.750 0.804 0.701 0.725 0.618 0.585 0.564 0.545
Random Forest 0.748 0.800 0.500 0.502 0.555 0.540 0.585 0.563
Gradient Boosting 0.746 0.792 0.519 0.534 0.530 0.512 0.566 0.560
Enron employees Logistic Reg. 0.846 0.860 0.810 0.820 0.582 0.599 0.601 0.593
Random Forest 0.864 0.872 0.657 0.668 0.531 0.543 0.585 0.605
Gradient Boosting 0.856 0.862 0.630 0.621 0.537 0.548 0.585 0.595

Fig. 1. Extrapolation using Logistic Regression

Fig. 2. Interpolation using Random Forest

5 Conclusions

This work compares orthogonally different techniques for RL in dynamic graphs on link prediction. The comparison covers two settings that are seldom presented together in the literature. We find that the random-walk techniques, particularly standard node2vec, outperform the graph-autoencoder techniques. Furthermore, the Bayesian adaptation of node2vec performs poorly even though it uses the same similarity measure.

References
1. Bamler, R., Mandt, S.: Dynamic word embeddings. In: Proceedings of the 34th
International Conference on Machine Learning, ICML 2017, pp. 380–389. PMLR
(2017)
2. Barkan, O.: Bayesian neural word embedding. In: Proceedings of the Thirty-First
AAAI Conference on Artificial Intelligence, AAAI 2017, pp. 3135–3143. AAAI
Press (2017)
3. Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new
perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013)
4. Bishop, C.M.: Pattern Recognition and Machine Learning. Information Science
and Statistics. Springer, New York (2006)
5. Bražinskas, A., Havrylov, S., Titov, I.: Embedding words as distributions with a
Bayesian skip-gram model. In: Proceedings of the 27th International Conference
on Computational Linguistics. Association for Computational Linguistics (2018)
6. Cai, H., Zheng, V.W., Chang, K.C.C.: A comprehensive survey of graph embed-
ding: problems, techniques, and applications. IEEE Trans. Knowl. Data Eng. 30(9),
1616–1637 (2018)
7. Dasgupta, K., Singh, R., Viswanathan, B., Chakraborty, D., Mukherjea, S., Nana-
vati, A.A., Joshi, A.: Social ties and their relevance to churn in mobile telecom net-
works. In: Proceedings of the 11th International Conference on Extending Database
Technology: Advances in Database Technology, EDBT 2008, pp. 668–677. ACM,
New York (2008)

8. De Winter, S., Decuypere, T., Mitrović, S., Baesens, B., De Weerdt, J.: Combining
temporal aspects of dynamic networks with Node2Vec for a more efficient dynamic
link prediction. In: 2018 IEEE/ACM International Conference on Advances in
Social Analysis and Mining (ASONAM), pp. 1234–1241. IEEE (2018)
9. Goel, R., Jain, K., Kobyzev, I., Sethi, A., Forsyth, P., Poupart, P.: Relational rep-
resentation learning for dynamic (knowledge) graphs: a survey. arXiv.org (2019).
http://search.proquest.com/docview/2231646581/
10. Goyal, P., Chhetri, S.R., Canedo, A.: dyngraph2vec: Capturing network dynam-
ics using dynamic graph representation learning. Knowl.-Based Syst. 187, 104816
(2020)
11. Goyal, P., Kamra, N., He, X., Liu, Y.: DynGEM: deep embedding method for
dynamic graphs. arXiv preprint arXiv:1805.11273 (2018)
12. Grover, A., Leskovec, J.: Node2Vec: scalable feature learning for networks. In:
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge
Discovery and Data Mining, pp. 855–864. ACM (2016)
13. Hamilton, W., Ying, Z., Leskovec, J.: Inductive representation learning on large
graphs. In: Advances in Neural Information Processing Systems, pp. 1024–1034
(2017)
14. Hamilton, W.L., Ying, R., Leskovec, J.: Representation learning on graphs: meth-
ods and applications. arXiv preprint arXiv:1709.05584 (2017)
15. Kalman, R.E.: A new approach to linear filtering and prediction problems. J. Basic
Eng. 82(1), 35–45 (1960)
16. Kipf, T., Welling, M.: Variational graph auto-encoders (2016). arXiv.org
17. Liben-Nowell, D., Kleinberg, J.: The link-prediction problem for social networks.
J. Am. Soc. Inform. Sci. Technol. 58(7), 1019–1031 (2007)
18. Ma, X., Sun, P., Wang, Y.: Graph regularized nonnegative matrix factorization for
temporal link prediction in dynamic networks. Phys. A 496, 121–136 (2018)
19. Mahdavi, S., Khoshraftar, S., An, A.: dynnode2vec: scalable dynamic network
embedding. In: 2018 IEEE International Conference on Big Data (Big Data), pp.
3762–3765. IEEE (2018)
20. Mitrović, S., Baesens, B., Lemahieu, W., Weerdt, J.D.: tcc2vec: RFM-informed
representation learning on call graphs for churn prediction. Inf. Sci. (2019)
21. Nguyen, G.H., Lee, J.B., Rossi, R.A., Ahmed, N.K., Koh, E., Kim, S.: Continuous-
time dynamic network embeddings. In: Companion Proceedings of the The Web
Conference 2018, WWW 2018, pp. 969–976. International World Wide Web Con-
ferences Steering Committee (2018)
22. Perozzi, B., Al-Rfou, R., Skiena, S.: DeepWalk: online learning of social represen-
tations. In: Proceedings of the 20th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, pp. 701–710. ACM (2014)
23. Rossi, R.A., Ahmed, N.K.: The network data repository with interactive graph
analytics and visualization. In: AAAI (2015). http://networkrepository.com
24. Singer, U., Guy, I., Radinsky, K.: Node embedding over temporal graphs. arXiv
preprint arXiv:1903.08889 (2019)
25. Trivedi, R., Farajtabar, M., Biswal, P., Zha, H.: Representation learning over
dynamic graphs. arXiv preprint arXiv:1803.04051 (2018)
26. Troncoso, F., Weber, R.: A novel approach to detect associations in criminal net-
works. Decis. Support Syst. 128, 113–159 (2019)
27. Van Belle, R., Mitrović, S., De Weerdt, J.: Representation learning in graphs for
credit card fraud detection. In: ECML PKDD 2019 Workshops. Springer (2019)

28. Wang, D., Cui, P., Zhu, W.: Structural deep network embedding. In: Proceedings
of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and
Data Mining, KDD 2016, 13–17 August 2016, pp. 1225–1234. ACM (2016)
29. Wang, Y., Yao, Y.: A brief review of network embedding. Big Data Min. Anal.
2(1), 35–47 (2019)
30. Yang, Y., Ren, X., Wu, F., Zhuang, Y.: Dynamic network embedding by modeling
triadic closure process. In: Thirty-Second AAAI Conference On Artificial Intelli-
gence, pp. 571–578. AAAI (2018)
Metadata Action Network Model for Cloud
Based Development Environment

Mehmet N. Aydin1(&), Ziya N. Perdahci2, I. Safak1,
and J. (Jos) van Hillegersberg3

1 Kadir Has University, 34083 Istanbul, Turkey
[email protected]
2 Mimar Sinan Fine Arts University, 34083 Istanbul, Turkey
[email protected]
3 University of Twente, 34083 Enschede, The Netherlands
[email protected]

Abstract. Cloud-based software development solutions (known as Platform-as-a-Service or Low-Code platforms) have been promoted as a game-changing paradigm backed by model-driven architecture and supported by various cloud-based services. Through the engagement of a sheer number of platform users (experienced, novice, or citizen developers), these platforms generate invaluable data that can be considered user metadata actions. As cloud-based development solutions provide novice users with a new development experience (performing data actions that altogether lead to a successful software app), users often face uncertainty about development performance: how good or complete is app development? Thus, the issue addressed in this research is how to measure user performance by using the digital trace data generated on the cloud platform from a Network Science perspective. This research proposes a novel approach to leveraging digital trace data on Platform-as-a-Service (PaaS) from a Network Science perspective. The proposed approach considers the importance of digital trace data as metadata actions on PaaS and introduces a network model (the so-called Metadata Action Network), which results from reconstructing the events of developers' actions. We show the suitability of the proposed approach for better understanding real-world digital trace data on a PaaS solution and elaborate basic performance analytics on a PaaS solution, with research and practical implications.

Keywords: Network science · Development performance · Digital trace data ·
PaaS · Analytics · Cloud computing

1 Introduction

Platform as a Service (PaaS) has been promoted as a panacea for the long-standing software development problem of delivering successful solutions with high-performing, collaborative actors, including developers and users. The premise behind PaaS is that it can foster user performance (of novice developers) in terms of delivering better, faster, cheaper, and higher-quality enterprise software solutions. Global enterprise solution providers such as Salesforce and Mendix have adopted PaaS solutions and strived for
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 531–543, 2020.
https://doi.org/10.1007/978-3-030-45697-9_52
532 M. N. Aydin et al.

engaging greater developer communities in contributing to cloud-based enterprise-systems marketplaces. Research perspectives rooted in Information Systems and Software Engineering have made significant contributions to a better understanding of PaaS [1, 2], but there has been little attempt to examine PaaS from a digital trace data point of view [3]. That is, through the engagement of a sheer number of platform users (experienced, novice, or citizen developers), these platforms generate invaluable data that can be used to improve platform performance [4].
The very idea of PaaS came into play with the cloud-driven paradigm shift that has evolved over the last two decades. Infrastructure, Platform, and Service layers, as three generic models, are adopted by cloud-driven platform-based solutions [3]. This research focuses on PaaS, which acts as a bridge between IaaS and SaaS: it is the runtime environment, the middleware of the cloud service architecture [6]. Metadata Application PaaS (aPaaS) is a particular type of PaaS which provides a cloud-based integrated development environment with a metamodel [9]. This type embraces the very idea of model-driven software development [7], which employs a number of essential principles such as abstraction, model transformation and refinement, and reusability [8]. The idea has existed for more than twenty years, but its realization is only beginning, with global success stories such as Mendix, OrangeScape, and ServiceNow. For instance, Germany-based Siemens acquired Dutch software manufacturer Mendix for €628 million ($730 million) in June 2019, which reportedly makes it one of the largest acquisitions of a Rotterdam-based company.
In traditional software development, development efforts and artefacts are structured (into stages, phases, models, and other artefacts, depending on the development methodology) and are subject to a certain quality of conduct, so the actors involved receive some feedback about development performance. As cloud-based development solutions provide novice users with a new development experience (creating data actions that altogether lead to a successful software app), users face uncertainty about development performance: how good or complete is the development work underway? Thus, the issue addressed in this research is as follows: how can one measure the performance of cloud-based application development by using digital trace data generated on the cloud platform? Can Network Science help in leveraging digital trace data as a network model and, in turn, provide insights into user performance on cloud-based development? This research proposes a novel approach to leveraging data on aPaaS from a Network Science perspective, which has been acclaimed as the Science of the 21st Century [5]. The proposed approach considers the importance of digital trace data as metadata actions on aPaaS and introduces a network model which is claimed to be the result of appropriate event reconstruction on a platform. We show the viability of the proposed approach on real-world digital trace data from an aPaaS solution and elaborate network analytics on an aPaaS solution, with research and practical implications.
Scholars including [9] have pointed out an underutilized aspect of these platforms:
the creation and potential use of digital trace data (DTD) on aPaaS. It should be
noted that data on aPaaS deserve to be called digital trace data only if three criteria are satisfied:
(i) they are found (not generated for research),
(ii) they are event based (event reconstruction is possible),
(iii) they are longitudinal (time-stamped for tracing).
Metadata Action Network Model for Cloud Based Development Environment 533

In this regard, validation of digital trace data is of particular importance to
researchers if digital trace data are to be accepted as a credible data asset for generating
valuable analytics. In this research, to ensure the validity of the digital trace data for our
examination, we perform all the necessary steps defined in [10]. Scholars in various
research domains, including online social media and information systems, have already
shown considerable interest in DTD and have used terms like data analytics and business
intelligence for a variety of approaches. We contend that the very idea of DTD
centers on recording the relations among the constituent elements of all relevant events. That
is, it is about things and their relations forming a network. Even though the contexts for
things and their relations differ, and the networks are accordingly labeled differently, a number
of common distinguishing network characteristics have been at the center of attention.
This calls for a Network Science approach as a novel way of examining digital trace
data as a complex system [11]. In the last two decades, there has been a significant
interest in better understanding of real-world entities and their relations as complex
systems. With the 20th anniversary of Network Science [5], scholars have investigated the
very nature of these complex systems in contexts as varied as the social, technological,
health, and political sciences [12, 13]. In the following we discuss how
user actions, so-called metadata actions, that generate digital trace data can be treated as
network data. The basic thrust of this research is that a Network Science approach to
aPaaS can significantly contribute to a better understanding of user actions in model-
driven development and to the progress of disruptive aPaaS technologies. Thus, this
research aims to discover the graph models that best describe the digital trace
data. The research question is then to decide on the type of network representation of
metadata action patterns (i.e., creating or connecting software artefacts). We have three
important choices to make: the choice of nodes, the choice of edges, and the choice of
metadata on the nodes and edges. The discovered graph model will be a significant
contribution to the reference research domains (Software Engineering and Information
Systems) and the cloud computing ecosystem.

2 Background

aPaaS in this context can be defined as a “complete application platform as a service”
that offers independent software vendors (ISVs) as well as individuals the means to
produce “multitenant SaaS solutions or various application integration solutions” in a
fast and efficient way [6].
The critical role of Information and Communications Technology (ICT), along with
ubiquitous technologies and innovative software applications, in the digital transformation
of nations, societies, companies, and individuals cannot be denied.
Apple’s motto “Everyone Can Code”, from the beginning of 2018, is worth noticing in that
it shows that “everyone”, rather than being just an end user, can be involved in ICT
development as a developer. Essentially, this discourse signals the need for a
paradigm shift from conventional thinking and practice to a novel approach and
cloud technologies (aPaaS, a specific cloud computing layer) for ICT development, in
particular software application development.
534 M. N. Aydin et al.

The history of research on software development methods, tools, and technologies is
rich and goes back to the origins of ICT. From 1970 to the mid-1990s, a period we call the
conventional paradigm in software development, research endeavors and software
development practice witnessed a sheer number of ideas, methodologies, and tools that
failed to fulfill the promise of better, cheaper, higher-quality, fit-for-purpose
software development. In the mid-1990s a new era began, in which ideas such as
method engineering, model-driven development, and agile software development were proposed to
overcome the limitations of monolithic tools, a jungle of heavy methodologies, and constraining
technologies that had led to frustrating project results. More modern technologies coined as
computer-aided software engineering (CASE) have been promoted to support this
evolution [7]. The next era in ICT development started with the help of ubiquitous
technologies (smart, integrated devices and cloud computing technologies) and inno-
vative software applications, such as social network applications and mobile applications for
end users. Notably, compared to European companies, North American players such
as Apple, Google, Facebook, Salesforce, and WhatsApp have seized the opportunities of
this paradigm shift in software development for end-user applications at a global scale.
With the notion of cloud computing, this paradigm has made a significant but still
inadequate leap for enterprise software applications, including those of small and medium enterprises.
A growing number of enterprises are facing up to digital reality and have yet to fully harness
the power of cloud computing as a disruptive technology. The attempts made by global
tech companies (IBM and Apple, Oracle, SAP) to enable SMEs for digital transfor-
mation have made limited progress compared to cloud-born niche players providing
software as a service (SaaS) (e.g., Salesforce) and platform as a service (PaaS) (e.g.,
Mendix, OrangeScape) solutions. aPaaS indeed opens up a new ecosystem in which the
three key stakeholders (independent software vendors (ISVs), platform providers, and
customers, especially SMEs) can opt for a new way of developing, managing, and using
software applications. This novel way of developing enterprise software brings
disruptive changes on the developer side, captured by the motto “everyone can
develop apps for enterprises”. In fact, these platforms provide higher abstraction through
modeling actions instead of code development, making it possible to turn business logic
and user requirements into software products instead of code. More precisely, without any
code-writing knowledge, business and user requirements can be transformed into
software products in a unique way, and this process can all be recorded digitally.
We contend that aPaaS is the signature of a paradigm shift in software develop-
ment. It embraces the very idea of model-driven software development, which
employs a number of essential principles such as abstraction, model transformation and
refinement, and reusability. That is, the idea has existed for more than 20 years, but its
realization is only now beginning!
Although these game-changing technology companies have made significant pro-
gress in terms of the software paradigm shift, recent research and field experience show
that the aforementioned principles are not fully harnessed in existing
platforms. aPaaS platforms are evolving as research and practice go hand in hand.
On the research side, analyses of these platforms in terms of key principles are available
in market intelligence reports or scientific papers that remain non-empirical method
and model artefacts. On the industry side, technologies, techniques, and features on the
platforms evolve as more experimental studies, driven by real-life platform use
experience, are carried out.

3 Network Construction from Digital Trace Data

One of the challenges with DTD is validating found digital trace data. The researchers
and the company worked together to fix several issues with the digital traces resulting
from user actions. For instance, creating a Plain Menu Item produces:
82494,[email protected],bb1ecc8c-9473-4322-8fb2-221a6ea2d41c,CREATE_NEW_MENU,null,null,null,null,2018-11-16 10:06:09.0,null,null,null,null,null,null,null,null,null,null,null,null,Mars,null,null,null
82495,[email protected],8cdadf96-e81a-4c8e-87c8-d1399f4aede2,ADD_TRANSIENT_ENTITY_TO_MENU,null,null,null,null,2018-11-16 10:06:09.0,null,null,null,null,null,null,null,null,null,null,null,null,Mars,null,null,null
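Records like the two above can be parsed into structured events before network construction. The sketch below assumes a field layout inferred from the sample rows (row id, user, action uuid, action name, with the timestamp somewhere among the null placeholders); the platform's actual schema may differ.

```python
import re
from datetime import datetime

# Timestamp field pattern as it appears in the sample rows (e.g. "2018-11-16 10:06:09.0")
TS = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+")

def parse_trace(line):
    """Split one raw trace record into the fields needed for event reconstruction."""
    fields = line.split(",")
    ts = next(f for f in fields if TS.fullmatch(f))  # locate timestamp among nulls
    return {
        "id": int(fields[0]),          # assumed: first field is the row id
        "user": fields[1],             # assumed: second field is the user
        "uuid": fields[2],             # assumed: third field is the action uuid
        "action": fields[3],           # assumed: fourth field is the action name
        "timestamp": datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f"),
    }

raw = ("82494,[email protected],bb1ecc8c-9473-4322-8fb2-221a6ea2d41c,"
       "CREATE_NEW_MENU,null,null,null,null,2018-11-16 10:06:09.0,"
       "null,null,null,null,null,null,null,null,null,null,null,null,Mars,null,null,null")
event = parse_trace(raw)
print(event["action"], event["timestamp"])  # CREATE_NEW_MENU 2018-11-16 10:06:09
```

The timestamp makes each record longitudinal and the action name makes it event based, satisfying the trace criteria discussed earlier.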


The aPaaS platform records both the digital trace data of user actions, as metadata
actions that occur at particular points in time (longitudinal data), and the attributes of the
software artifacts that some of the actions produce, such as the kind of abstract data type an action
creates. One converts digital trace data into a network by determining what corresponds
to a node and an edge, as well as what corresponds to node and edge attributes. We consider
that there is only one edge type: edges represent a binding between two software arti-
facts. Nodes, on the other hand, are classified as Model (M), View (V), or Controller
(C) (see the Appendix). The node classifications are incorporated into the network
model as node attributes. Figure 1 depicts the overall structure of the services provided
to users and shows that the generated digital trace data can be mapped
to the Model-View-Controller (MVC) architectural view. MVC is a useful
pattern for separation of concerns in software engineering. It helps developers
partition the application as early as the design phase and is especially applicable to web-
based applications.

Fig. 1. Overall structure of the services provided on Metadata aPaaS.



aPaaS provides the developer with a cloud-based integrated development environment (Cloud-IDE) that fosters rapid application development via a glossary and
services (Fig. 2a and Fig. 2b). To each term in the glossary or service there corre-
sponds a metadata action that compiles a software artifact (executable code). Similarly,
to each service there corresponds a built-in function that performs a specific task.
Moreover, developers are able to add their own services to enhance the functionality of
the platform via a simple scripting language (e.g., MVEL). Some of the metadata actions allow
the developer to create an artifact, while others enable the developer to link
artifacts.
As a specific example, imagine that the developer is given the task to develop a
simple post form that contains several text fields and buttons. One button linked to a
service allows the user to retrieve their own contact information from the server, one
button allows the user, for example, to upload an image, and a text box is provided to
type a post they want to send. Clicking on another button sends the post data to a
server. If we think about the metadata actions involved in developing the form, the
developer may invoke a few actions or around a dozen actions at most, as well as a few
functions, and there are hundreds of metadata actions and services to choose from on a
typical aPaaS. This means that the developer leaves unselected almost every metadata
action that could potentially create an artifact and/or link artifacts. Put differently,
almost every possible link between artifacts is a non-link and is not registered in
the digital trace data; the data collected by aPaaS platforms are therefore sparse,
as opposed to the dense data of a hypothetical world in which every software artifact is
linked to every other.
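This sparseness can be quantified with the edge density of an undirected graph; the numbers below are illustrative, not taken from the dataset.

```python
def edge_density(n_nodes: int, n_edges: int) -> float:
    """Fraction of the n*(n-1)/2 possible undirected edges that are realized."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))

# e.g. a hypothetical app with 120 artifacts and 150 recorded links
# realizes about 2% of the possible bindings, i.e. a sparse graph
print(round(edge_density(120, 150), 4))  # prints 0.021
```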

Fig. 2. a. A simple form generated on aPaaS. b. User interface on aPaaS

It is this sparseness [18] that calls for a Network Science approach to the analysis of
metadata actions on aPaaS. Many systems can be regarded as networks: sets of things
and their interactions. In a graphical representation of a network, nodes (or vertices) are
the things of interest, and interacting nodes are joined in pairs by arcs (or links).
A network is a mathematical object specifically designed to represent sparse data, and
Network Science is concerned with analyzing and modeling such systems. Figure 3
depicts an overall network construction process from the raw data creation (users’
actions) to network visualization and analytics.

Fig. 3. Network construction: from digital trace data to network analytics

We can regard the sparse data of recorded metadata actions in the aPaaS database
as a network. To emphasize this, from now on we will refer to the network produced
by developers as they develop applications as the “metadata action network” (MAN).
In a MAN the objects of interest are the software artifacts created by metadata actions,
and the artifacts are joined in pairs by arcs if the developer opts to link them.

4 Demonstration of Network Analytics on aPaaS

The digital trace data under examination are found on imona.com, an appli-
cation development platform (Application Platform as a Service, aPaaS) where devel-
opers can not only create new applications but also extend the functionality of any
application already placed in its marketplace [4]. Imona.com is
one type of aPaaS, called metadata aPaaS [4]. Metadata aPaaS provides visual tools to
customize data models, application logic, workflow, and user interfaces. The underlying
metadata model for this aPaaS is essential to this research, as it provides us with a meta-
model [9] from which to reflect on the network models of digital trace data discussed later on.
The dataset comprises the metadata actions that seven developers generated while
developing a tutorial app. This tutorial app was chosen because, as the name suggests, it
serves a training purpose: it is simple enough to monitor all development activities, and all
steps are provided with visual guidance in a 42-minute video. It should be noted that
even though the tutorial video provides clear guidance on how to develop the app, there is no
single pathway to follow while developing it. This gives us an opportunity to
compare development activities through the proposed network model. Another interesting
point is that only a verbal brief on where and what to develop is given in advance, and
no conceptual support (such as use cases or other documents) is provided during the
development activities. The application is based on three conceptual entities: User,
Post, and Comment. It is similar to typical online user-content-sharing apps, where a new
user is created so that the user can create a post and comment on a post.
Additionally, a user can search posts or comments. The app essentially consists of
three distinct pages, which are referred to as transient entities in the metadata action
descriptions.
While presenting the MAN, we use the tutorial app and refer to MVC to describe
metadata actions. We distinguish metadata actions that create nodes, actions that create
edges between nodes, and actions that create both nodes and edges. For
each node we use an MVC label as metadata; that is, a node can be a model, a
view, or a controller. An application in general is composed of several screens. For
example, a developer can add a new screen to an application by using a given service
(in our case, an add button labeled “add”). This act invokes the
metadata action CREATE_ENTITY, which we model as the creation of a new network node
of type M (Model). An instantiation of this metadata can be “user”. Another example is
ADD_TERM_TO_ENTITY, which we model as the creation of a new network
edge. An instantiation of this metadata can be a form field, such as the “name” of the “user”
entity. Descriptions of the remaining metadata actions are given as a MAN Catalog in
the Appendix.
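The node and edge mapping just described can be sketched as follows. The action-to-role tables are a small excerpt of the MAN Catalog in the Appendix, and the log entries are a hypothetical fragment of a development session, not actual platform output.

```python
# Excerpt of the MAN Catalog: node-creating actions carry an MVC type,
# edge-creating actions bind two existing artifacts.
NODE_TYPE = {"CREATE_ENTITY": "M", "CREATE_TRANSIENT_ENTITY": "V", "CREATE_SCRIPT": "C"}
EDGE_ACTIONS = {"ADD_TERM_TO_ENTITY", "ADD_ENTITY_TO_SCRIPT", "ADD_TRANSIENT_ENTITY_TO_MENU"}

def build_man(log):
    """Build a metadata action network from (action, *arguments) tuples."""
    nodes, edges = {}, []          # artifact -> MVC label; list of artifact pairs
    for action, *args in log:
        if action in NODE_TYPE:
            nodes[args[0]] = NODE_TYPE[action]
        elif action in EDGE_ACTIONS:
            edges.append((args[0], args[1]))
    return nodes, edges

log = [("CREATE_ENTITY", "user"),                      # new node, type M
       ("CREATE_TRANSIENT_ENTITY", "post_page"),       # new node, type V
       ("CREATE_SCRIPT", "save_post"),                 # new node, type C
       ("ADD_ENTITY_TO_SCRIPT", "user", "save_post")]  # edge binding M to C
nodes, edges = build_man(log)
print(nodes)   # {'user': 'M', 'post_page': 'V', 'save_post': 'C'}
print(edges)   # [('user', 'save_post')]
```

Replaying a full trace this way yields the MAN on which the analytics of the next section operate.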

5 Discussion and Conclusion

Table 1 summarizes basic network statistics of the cases under investigation. The
research question was to decide what should constitute the elements of a graph (what is
a node and what is an edge) and the graph representation itself (directed or undirected,
multigraph, weighted, bipartite). We have demonstrated that the proposed MAN is viable for
modeling applications developed in an aPaaS environment, with promising outcomes.
Regarding the viability of the proposed network, the MAN of an application is able to
reveal the underlying MVC architectural paradigm [15]. That is, the three network layers
we observed (colored black, white, and grey) are in accordance with what the
underlying architecture of the platform provides (Fig. 4). The outermost layer has to be
the View, because this is what an end user interacts with. The middle layer corresponds
to the Controller, because this is the layer that bridges the gap between the Model and
the View layers. The innermost layer indicates the Model, where the data reside.

Fig. 4. MAN of the application developed by each group (cases C1–C7). For the graph layout, ForceAtlas2 is used
in Gephi [14]

Table 1. Comparison of different cases based on connected components and network diameter.

Case   # of CC   Diameter   Radius   Status of release candidate
C1     1         8          5        Beta
C2     1         8          5        Beta
C3     1         11         6        Alpha
C4     1         7          4        Alpha
C5     4         4          0        Premature
C6     1         9          5        Release candidate
C7     1         9          5        Release candidate

Regarding promising outcomes, one of the challenges a platform owner faces
is providing developers with a revision control system (RCS) [16]. Although it
may not be possible to provide an RCS in the same way as in a traditional IDE, we contend that
the MAN could be employed to provide a new kind of RCS. In conventional software
development, the user can visually check whether all software components are
integrated, whereas on an aPaaS this is not the case. A MAN graph consisting of a
single connected component indicates that all software artefacts are interconnected. We
suggest that the total number of connected components can serve as a revision control signal: if
it is more than one, some artefacts are still not interconnected (so-called
premature, Alpha, or Beta) and the user should perform additional metadata actions to
complete a version. If there is only one component, and if the other analytics criteria are
fulfilled, then the application developed may deserve to be a Release Candidate.
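A minimal sketch of this component-count check, using only a breadth-first search over an undirected edge list; the status labels follow the heuristic above, and distinguishing Premature from Alpha or Beta would require the additional analytics discussed in the text.

```python
from collections import defaultdict, deque

def connected_components(nodes, edges):
    """Count connected components of an undirected graph via BFS."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, count = set(), 0
    for n in nodes:
        if n in seen:
            continue
        count += 1                      # n starts a new component
        queue = deque([n])
        seen.add(n)
        while queue:
            for m in adj[queue.popleft()] - seen:
                seen.add(m)
                queue.append(m)
    return count

def revision_status(nodes, edges):
    # One component means every artefact is reachable from every other.
    return "candidate-eligible" if connected_components(nodes, edges) == 1 else "incomplete"

# "page" is not yet bound to anything, so the build is not release-ready
print(revision_status(["user", "page", "script"], [("user", "script")]))  # incomplete
```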

Fig. 5. Degree distributions for all cases (C1–C7)

Yet another promising outcome is that basic network statistics such as degree
distributions, network radius, and network diameter can be related to aPaaS user
performance analytics. Figure 5 depicts the degree distribution of each case. Visually, one
can see that the degree distributions of Case 7, Case 6, and to some extent Case 1 and
Case 3 exhibit an approximate straight line on doubly logarithmic scales, which is
typical of real-world networks, whereas the degree distribution of Case 5 is clearly
distinct from a power law [19]. The MVC architectural paradigm constrains the
Diameter of a developed application’s MAN to a certain size, which we believe is the number
of screens developed for an aPaaS application multiplied by the number of software architecture
layers (MVC), which is 3. So, for the case at hand, the Diameter has to be 9. When we
compare the seven cases, two of them satisfy this result, which is another indicator that
they are Release Candidates. The Radius provides another salient analytic,
indicating whether the MVC layers are interconnected by shorter pathways. The
middle layer of the software architecture (the Controller layer) should be equally spaced from the two layers
it bridges, namely the View and the Model layers. In line with
this argument, we suggest yet another formula: the Radius of the MAN for a Release
Candidate has to be approximately half the Diameter. So, for the app examined, the
Radius should be approximately half of the Diameter of 9, which is 4.5. The observed
Radius values for the Release Candidates and the Betas, which are 5, conform to this formula.
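These Diameter and Radius heuristics can be checked per case with plain BFS eccentricities. The sketch below is valid only for a single connected component, and the tolerance on the Radius is an assumption added for illustration.

```python
from collections import defaultdict, deque

def eccentricities(nodes, edges):
    """Max BFS distance from each node; assumes one connected component."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    def farthest(src):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        return max(dist.values())
    return [farthest(n) for n in nodes]

def diameter_radius(nodes, edges):
    ecc = eccentricities(nodes, edges)
    return max(ecc), min(ecc)

def fits_release_heuristic(nodes, edges, n_screens):
    d, r = diameter_radius(nodes, edges)
    # Diameter ~ 3 MVC layers per screen; Radius ~ half the Diameter.
    return d == 3 * n_screens and abs(r - d / 2) <= 1

# A path of four artifacts has diameter 3 and radius 2
print(diameter_radius("abcd", [("a", "b"), ("b", "c"), ("c", "d")]))  # (3, 2)
```

Applied to the Release Candidates in Table 1 (Diameter 9, Radius 5, three screens), the heuristic holds.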

Further exploratory network analysis can identify the structure of a network,
where one can ask the key question: what structural shape does the metadata action
network take? The next step would be to search for common patterns of network
correlations across the MANs of different applications, asking: is
there a typical correlation pattern that the metadata action networks of developed aPaaS
applications exhibit? Promising correlation metrics to examine include degree-to-
degree correlation and node-attribute-to-node-attribute correlation [20]. Specifically, it
looks promising to search for a pattern of assortative mixing by MVC. A more ambitious
analysis, going beyond structural and correlation pattern analysis, is to predict a
network structure based on a statistical network model. Essentially, this is about
answering the why question by fitting a mechanism to the observed network data.
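For instance, degree-to-degree correlation can be computed as the Pearson correlation of the degrees at the two ends of each edge [20]. This stdlib-only sketch counts each undirected edge in both orientations; the value is undefined for regular graphs (zero degree variance).

```python
from collections import Counter
from math import sqrt

def degree_assortativity(edges):
    """Pearson correlation of endpoint degrees over an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    xs, ys = [], []
    for u, v in edges:            # each edge contributes both orientations
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

# A star is maximally disassortative: the hub connects only to leaves
print(degree_assortativity([("hub", "a"), ("hub", "b"), ("hub", "c")]))  # -1.0
```

Replacing the degree counts with MVC labels would give the attribute-to-attribute (assortative mixing by MVC) variant.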
It is vital for the platform provider to use the metamodel and the meta language that
are at the heart of the platform. These analyses of the digital trace data help the platform
provider understand how fully and correctly the software developer performs. For
example, the magnitude of network correlation may be a critical criterion for
demonstrating the level of reusability (in-app and across-app) of application software.
From the developer perspective [17], network metrics can be used to assess
understandability, the version of the software, and the extent to which the software has
been completed: how complete is the software in its first version, what is the situation
regarding upcoming releases from the network point of view, and how can the software
evolve?

Appendix

METADATA ACTIONS NETWORK CONSTRUCTION

Metadata action                             Node creation  Edge creation  Node type
ADD_ENTITY_TO_FIELD                                        x
ADD_ENTITY_TO_MENU                                         x
ADD_ENTITY_TO_SCRIPT                                       x
ADD_ENTITY_TO_LIBRARY                       x              x              C
ADD_GLOBAL_FUNCTION_TO_SCRIPT                              x
ADD_ITEM_TO_LIST                                           x
ADD_LIBRARY_TO_SCRIPT                       x              x              C
ADD_LIST_TO_FIELD                                          x
ADD_PRIMARY_KEY_TO_ENTITY                                  x
ADD_SCRIPT_TO_MENU                                         x
ADD_SCRIPT_TO_TRANSIENT_FIELD                              x
ADD_SCRIPT_TO_REST                          x              x              C
ADD_SUBMENU_TO_MENU                                        x
ADD_TERM_TO_ENTITY                                         x
ADD_TERM_TO_FIELD                                          x
ADD_TRANSIENT_ENTITY_TO_MENU                               x
ADD_TRANSIENT_ENTITY_TO_SCRIPT                             x
ADD_TRANSIENT_FIELD_TO_TRANSIENT_ENTITY                    x
CREATE_ECONTAINER_COMPONENT                 x                             M
CREATE_ENTITY                               x                             M
CREATE_LIST                                 x                             M
CREATE_LIST_ITEM                            x                             M
CREATE_NEW_GLOBAL_FUNCTIOIN                 x                             C
CREATE_NEW_MENU                             x                             V
CREATE_NEW_SUB_MENU                         x                             V
CREATE_REST_SERVICE                         x                             C
CREATE_SCRIPT                               x                             C
CREATE_TERM                                 x                             M
CREATE_TRANSIENT_ENTITY                     x                             V

References
1. Armbrust, M., Fox, A., Griffith, R., et al.: A view of cloud computing. Commun. ACM 53
(4), 50–58 (2010)
2. Beimborn, D., Miletzki, T., Wenzel, S.: Platform as a service (PaaS). Bus. Inf. Syst. Eng. 3
(6), 381–384 (2011)
3. Teixeira, C., Pinto, J.S., Azevedo, R., et al.: The building blocks of a PaaS. J. Netw. Syst.
Manag. 22(1), 75–99 (2014)
4. Aydin, M.N., Perdahci, N.Z., Odevci, B.: Cloud-based development environments: PaaS. In:
Encyclopedia of Cloud Computing, p. 62 (2016)
5. Vespignani, A.: Twenty years of network science. Nature 558, 528 (2018)
6. Bezemer, C.P., Zaidman, A., Platzbeecker, B., et al.: Enabling multi-tenancy: an industrial
experience report. In: Proceedings of the 2010 IEEE International Conference on Software
Maintenance, September 2010, pp. 1–8. IEEE (2010)
7. Premkumar, G., Potter, M.: Adoption of computer aided software engineering (CASE)
technology: an innovation adoption perspective. ACM SIGMIS Database: DATABASE
Adv. Inf. Syst. 26(2–3), 105–124 (1995)
8. Henkel, M., Stirna, J.: Pondering on the key functionality of model driven development
tools: the case of mendix. In: International Conference on Business Informatics Research.
Springer, Heidelberg (2010)
9. Aydin, M.N., Kariniauskaite, D., Perdahci, N.Z.: Validity issues of digital trace data for
platform as a service: a network science perspective. In: World Conference on Information
Systems and Technologies, pp. 654–664. Springer, Cham (2018)
10. Howison, J., Wiggins, A., Crowston, K.: Validity issues in the use of social network analysis
with digital trace data. J. Assoc. Inf. Syst. 12(12), 767 (2011)
11. Barabási, A.L.: Network Science. Cambridge University Press, Cambridge (2016)
12. Borgatti, S.P., Mehra, A., Brass, D.J., Labianca, G.: Network analysis in the social sciences.
Science 323(5916), 892–895 (2009)
13. Boccaletti, S., Latora, V., Moreno, Y., Chavez, M., Hwang, D.U.: Complex networks:
structure and dynamics. Phys. Rep. 424(4), 175–308 (2006)

14. Bastian, M., Heymann, S., Jacomy, M.: Gephi: an open source software for exploring and
manipulating networks. In: The Proceedings of the Third International ICWSM Conference
ICWSM, San Jose, California, pp. 361–362. AAAI Press, Menlo Park (2009)
15. Leff, A., Rayfield, J.T.: Web-application development using the model/view/controller
design pattern. In: Proceedings of the Fifth IEEE International Enterprise Distributed Object
Computing Conference, pp. 118–127. IEEE, September 2001
16. Karsai, G., Sztipanovits, J., Ledeczi, A., Bapty, T.: Model-integrated development of
embedded software. Proc. IEEE 91(1), 145–164 (2003)
17. Giessmann, A., Stanoevska-Slabeva, K.: What are developers’ preferences on platform as a
service? An empirical investigation. In: Forty-Sixth Hawaii International Conference on
System Sciences, January 2013, pp. 1035–1044. IEEE (2013)
18. Demaine, E.D., Reidl, F., Rossmanith, P., Villaamil, F.S., Sikdar, S., Sullivan, B.D.:
Structural sparsity of complex networks: bounded expansion in random models and
real-world graphs. J. Comput. Syst. Sci. 105, 199–241 (2019)
19. Clauset, A., Shalizi, C.R., Newman, M.E.: Power-law distributions in empirical data. SIAM
Rev. 51(4), 661–703 (2009)
20. Newman, M.E.: Assortative mixing in networks. Phys. Rev. Lett. 89(20), 208701 (2002)
Clustering Foursquare Mobility Networks
to Explore Urban Spaces

Olivera Novović1(B) , Nastasija Grujić1 , Sanja Brdar1 , Miro Govedarica2 ,


and Vladimir Crnojević1
1
Institute BioSense, University of Novi Sad, Novi Sad, Serbia
{novovic,n.grujic,sanja.brdar,crnojevic}@biosense.rs
2
Faculty of Technical Sciences, University of Novi Sad, Novi Sad, Serbia
[email protected]

Abstract. Our study aimed to explore Foursquare mobility networks
and investigate the phenomenon of venue clustering across cities. We
performed graph-based clustering to detect venues that interact strongly
with each other in terms of aggregated user mobility flows. The available
Foursquare data included check-in information for ten large worldwide
cities, observed over a period of two years, each having a large number of
geo-tagged venues coupled with semantic information in the form of venue
categories. Such data allowed us to study cities as complex systems and
explore their dynamic nature. We obtained a global overview of the seman-
tic content of clusters derived from venue categories, quantified changes
in the clusters on a monthly basis, and compared results between cities.

Keywords: Graph-based clustering · Mobility networks · Foursquare

1 Introduction
Location data became ubiquitous due to the global adoption of smartphones, the
worldwide availability of GPS, and advanced location-based applications. The
value of such data is immense, as they contain the spatial-temporal patterns of mas-
sive numbers of people. Among the popular location-based applications is Foursquare,
a social network application founded in 2009. In the application, users are able
to notify their friends about their current location through check-ins, for which
they can receive virtual rewards. Apart from that, it allows users to leave a note
about their experience at a specific venue, which can be utilized for building a
recommendation system. With its initiatives to open some of the data it collects,
Foursquare has attracted researchers to explore this rich source of information and
evaluate its potential for understanding social behaviour and mobility, and for
proposing location intelligence services.
Many research efforts have been dedicated to the analysis of Foursquare data, and
interesting patterns have been discovered. Preoţiuc-Pietro applied k-means clustering
to users and used the result for the prediction of users’ future movements [12].
Joseph et al. clustered users via topic modeling, an approach usually
used in the classification of text documents according to latent themes [7].
In contrast, Cranshaw et al. clustered venues according to their spatial and social
characteristics [3]. Pang et al. applied the PageRank and HITS algorithms
to Foursquare data for the purpose of friendship prediction and location
recommendation [11]. D’Silva et al. used Foursquare data and machine learning to
predict crime [5], and Noulas et al. used machine learning on the data with the aim
of predicting the next venue a user will visit [10]. Yang et al. explored the tourist-
functional relations between different POI types present in Foursquare data in the city
of Barcelona [15]. Moreover, researchers have compared Foursquare data
with data from other location-based services (LBS) in order to check the
similarity of patterns, the validity of check-ins, etc. [13, 16]. Foursquare data were
also utilized to characterize competition between new and existing venues [4] by
measuring the change in throughput of a venue before and after the opening of a
new nearby venue.
Our study is part of a wider initiative, the Future Cities Challenge, launched
by Foursquare, which provided data to selected participants. The following sections
describe the data, our research questions, our methods, and the obtained results.

2 Data and Problem Description

The Future Cities Challenge included two types of data from Foursquare for ten
cities (Chicago, Istanbul, Jakarta, London, Los Angeles, New York, Paris, Seoul,
Singapore, and Tokyo), provided in textual format. The first type provides
venue information, where each line describes a venue with its id, name, coordinates, and
venue category. The second type contains movement information,
where each line corresponds to an edge between a pair of venues, the month and
year over which movements were aggregated for the given venue pair, and the period
of the day1 . The last number in the line represents the “weight”, which reflects
the number of check-ins that took place by any user for the given venue pair.
The number of venues differs per city; the city with the highest number of
venues is Istanbul, followed by Tokyo and New York (Fig. 1).
Anonymized and aggregated location visit data provide an opportunity to study
cities as complex systems. In this study we explored how venues cluster
based on mobility flows, what the semantic content of the clusters is, and how cities
compare to each other in terms of the semantic content of the detected clusters. All of
these topics are relevant for building a recommendation system that matches the
preferences of a user arriving in a city to a group of venues that can jointly
offer content.

1
Overnight (between 00:00:00 and 05:59:59), morning (between 06:00:00 and 09:59:59), midday (between 10:00:00 and 14:59:59), afternoon (between 15:00:00 and 18:59:59), and night (between 19:00:00 and 23:59:59).
546 O. Novović et al.

Fig. 1. Number of venues per city

3 Methods and Results

In this work, we focus on clustering venues based on movements across the
city in order to quantify venue grouping through time and thus inspect urban dynamics.
Due to the size and complexity of the input data, we decided to use the Apache
Spark platform for distributed processing to perform the graph clustering analysis.
Apache Spark is a unified distributed engine with a rich and powerful API for
Scala, Python, Java and R [8]. Graphs are built from the input Foursquare data
on a monthly basis, where venues represent the nodes of the graph and aggregated
movements between two consecutive venues represent the edges. If a movement
occurred more than once during different days or day-time periods, the weights
are aggregated, so that the final graph has unique edges over one month. To
cluster movements across the city we used the Louvain algorithm [2], which has proved
to be very efficient when working with large, complex graphs [14].
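The monthly edge aggregation described above can be sketched as follows. The tuple layout is hypothetical, and collapsing the two directions of a venue pair into a single undirected edge is our assumption:

```python
from collections import defaultdict

def monthly_graph(movements, month):
    """Aggregate movement records of one month into unique weighted edges.

    `movements` is an iterable of (src, dst, month, period, weight) tuples;
    weights of repeated venue pairs (any day or day period) are summed, and
    each pair is normalized so every undirected edge appears exactly once.
    """
    edges = defaultdict(int)
    for src, dst, m, period, w in movements:
        if m != month:
            continue
        a, b = sorted((src, dst))  # undirected: collapse both directions
        edges[(a, b)] += w
    return dict(edges)

moves = [("v1", "v2", "2017-04", "morning", 3),
         ("v2", "v1", "2017-04", "night", 2),
         ("v1", "v3", "2017-05", "midday", 7)]
g = monthly_graph(moves, "2017-04")  # {("v1", "v2"): 5}
```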

3.1 Clustering Venues Based on Users’ Movements Between Venues


Network clustering is one of the most popular topics of modern network science.
Due to its popularity and applicability to a wide range of real-world problems,
clustering algorithms have become a specialized subclass of the machine learning family
of algorithms. Clusters inside a graph are usually groups of vertices having a higher
probability of being connected to each other than to members of other groups,
though other patterns are possible.
Knowing the cluster structure, we gain better insight into network organization
and, moreover, into information flow through the network. Many real networks
are highly dynamic, and understanding the rules that govern information flow is
a crucial step toward establishing a theory of network dynamics [6]. Better
insight into the cluster structure allows us to focus on regions that have some
degree of autonomy within the graph. It helps to classify the vertices based
on their role with respect to the cluster they belong to. For instance, we can
distinguish vertices totally embedded within their clusters from vertices at the
boundary of the clusters, which may act as brokers between the modules and,
in that case, could play a major role both in holding the modules together and
in the dynamics of spreading processes across the network. In the context of
location-based social networks, detecting a cluster means detecting a group of
venues that are frequently visited together by users. Detecting such places could
give us better insight into urban dynamics and the evolution of cities.
Clustering Foursquare Mobility Networks to Explore Urban Spaces 547
Graphs made from the Foursquare movement data are massive and prone to
dynamic evolution, since the structure of a location-based social network
changes very fast. When choosing the optimal algorithm to perform clustering over
the movement graphs, we need to focus on two major issues: (i) the algorithmic tech-
niques applied must scale well with respect to the size of the data, which means
that the algorithmic complexity should stay below O(n^2) (where n is the number
of graph nodes), and (ii) since the number of clusters is unknown in advance, the algo-
rithms used must be flexible enough to infer the number of clusters
during the course of the algorithm. To meet these requirements,
the authors of [14] proposed applying the modularity-based algorithm described
in [2]. This algorithm is based on the concept of modularity [9], presented in
Eq. 1, where A_ij is the weight of the edge connecting the i-th and the j-th node
of the graph, \sum_j A_ij is the sum of the weights of the edges attached to the i-th
node, c_i is the cluster to which the i-th node is assigned, m = (1/2) \sum_{i,j} A_ij, and
\delta(x, y) is 1 if nodes x and y are assigned to the same cluster and 0 otherwise.
   
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{\left(\sum_{j} A_{ij}\right)\left(\sum_{i} A_{ji}\right)}{2m} \right] \delta(c_i, c_j) \qquad (1)
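Eq. 1 can be evaluated without iterating over all node pairs by grouping terms per cluster, using the equivalent form Q = sum_c [ Sigma_in(c)/2m - (Sigma_tot(c)/2m)^2 ], where Sigma_in(c) is the total weight of edges inside cluster c and Sigma_tot(c) is the sum of degrees of its nodes. A minimal sketch:

```python
from collections import defaultdict

def modularity(edges, comm):
    """Modularity Q of a partition (Newman-Girvan form of Eq. 1).

    `edges` holds undirected weighted edges (u, v, w); `comm` maps each
    node to its cluster label.
    """
    two_m = 2.0 * sum(w for _, _, w in edges)
    deg = defaultdict(float)       # k_i: weighted degree of each node
    internal = defaultdict(float)  # total weight of edges inside each cluster
    for u, v, w in edges:
        deg[u] += w
        deg[v] += w
        if comm[u] == comm[v]:
            internal[comm[u]] += w
    tot = defaultdict(float)       # sum of node degrees per cluster
    for u, c in comm.items():
        tot[c] += deg[u]
    return sum(2.0 * internal[c] / two_m - (tot[c] / two_m) ** 2 for c in tot)

# Two triangles joined by a bridge; splitting them is a good partition.
edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1),
         (3, 4, 1), (4, 5, 1), (3, 5, 1), (2, 3, 1)]
part = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q = modularity(edges, part)  # = 5/14, approximately 0.357
```

Putting all six nodes in a single cluster gives Q = 0, confirming that the two-triangle split is the better partition.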

Clustering in complex networks is an NP-hard problem. To keep processing
performance high, it is common to use approximation algorithms. When optimizing
the solution, a cost function is defined with the aim of finding the maximum or
minimum value of the given function. The algorithm proposed in [2] uses modularity
as the cost function, and to provide an efficient solution it applies an iterative
process that involves shrinking the graph every time the modularity converges.
The algorithm begins by assigning each node in the graph to its own cluster
and starts moving nodes to the neighbouring clusters that maximize the
modularity of the graph. The process is executed as long as nodes are moving
and modularity grows. When the modularity converges and there are no more
changes, the graph shrinking process is executed: each cluster is assigned to
one super node of the new graph, and the same technique is applied again to the
new graph. The algorithm terminates when the modularity reaches its local
maximum, and the set of clusters from the previous phase that maximizes
modularity is returned as the result.
The example of clustering for the city of Chicago, 2017-04, is presented in Fig. 2,
where we can notice how venues tend to cluster in spatial proximity: frequent
movements occur between places that are spatially close. Similar behaviour was
observed in the clustering of telecom traffic networks [14]. The formed clusters
differ in size, both in the number of contained venues and in their spatial
distribution. It is observed that, in closer proximity to the city center, clusters
are formed by a smaller number of venues densely grouped together, compared to
the peripherally located clusters, which have many venues widely distributed in space.

Fig. 2. Clusters of venues for city of Chicago, 2017-04

3.2 Exploring Semantics of the Clustered Venues Based on Their Categories

To gain a better understanding of the cluster structure inferred from the movement
data, we investigated the categories of the venues inside each cluster. We mapped the
provided subcategories onto nine super categories described on the Foursquare web
site [1]: Art & Entertainment, College & University, Event, Food, Nightlife Spot,
Outdoors & Recreation, Professional & Other, Shop & Service, and Travel & Transport,
plus the category Other, which is used for those venues that do not fit in any of the
main categories.
We calculated the percentage of each category present in each cluster. Each city
has a unique digital footprint of the categories that are dominant across its clusters.
Some cities have similar patterns, while for others significantly different clusters
emerged.
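The per-cluster category percentages can be computed in a few lines; the venue ids and the category mapping below are made-up examples:

```python
from collections import Counter

def category_profile(cluster_venues, category_of):
    """Percentage of each super category among a cluster's venues.

    Venues without a known super category fall back to 'Other'.
    """
    counts = Counter(category_of.get(v, "Other") for v in cluster_venues)
    n = sum(counts.values())
    return {cat: 100.0 * c / n for cat, c in counts.items()}

category_of = {"v1": "Food", "v2": "Food", "v3": "Shop & Service"}
profile = category_profile(["v1", "v2", "v3", "v4"], category_of)
# {"Food": 50.0, "Shop & Service": 25.0, "Other": 25.0}
```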

4 Results
Clustering was performed over the graphs made from the movement data, for each city,
for each month between 2017-04 and 2019-03. The results obtained from clustering
show high diversity between cities, and even between months for the same city.
To get a better insight into the variability of clusters across the cities, we present
the scatter plot in Fig. 3. The plot shows the dependency between the average
number of clusters and their size in each city per month. From Fig. 3 we can
notice the cluster variation by months within one city, as well as the variation
between cities. Although Istanbul has the highest number of clusters, they are
relatively small. The opposite pattern can be noted for the city of Paris, in
which clusters are relatively large but fewer compared to Istanbul. Moreover,
some similarities between cities can be observed: the cities of Chicago and Los
Angeles have a relatively similar number and size of clusters.

Fig. 3. Variation of clusters
Furthermore, we explored which categories are present in the clusters. We
selected the largest clusters in each city, those consisting of more
than 50 venues, and calculated the percentage of occurrences of each category. The
presence, variation and distribution of categories inside clusters can give us valuable
input about the semantics of venues that are strongly connected by users' movements.
Figure 4 presents the largest cluster in the city of Chicago classified by category.
From Fig. 4 we can notice a high variety of categories, where the most present
category is Food, followed by Shop & Service. This implies that people in this
cluster generally move between places related to food and shopping.

Fig. 4. Spatial distribution of categories across biggest cluster in Chicago

In the city of Chicago, another large cluster is formed around O'Hare International
Airport, in which the most present categories are Travel & Transport and
Food. From Fig. 5 we can notice how its venues are spread in an almost regular form
following the Interstate 90 road, one of the main highways in the State of
Illinois. As can be seen, clusters are formed around spatially close or well-connected
places, with some categories frequently occurring together. Consequently,
we can classify clusters by the dominant presence of one, two or even more categories.
To obtain a global view and compare the cities, we performed hierarchical
clustering based on the average profile of the probability distribution of categories.
The result of the hierarchical clustering, in the form of a dendrogram (Fig. 6), shows
which cities are similar, with the colors of the branches indicating how we could group them.
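This step can be sketched with SciPy's agglomerative clustering. The four-city profiles below are invented numbers chosen purely to illustrate the mechanics, not the values computed from the Foursquare data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical average category-probability profiles per city (rows sum to 1);
# columns could stand for, e.g., Food, Shop & Service, Travel & Transport, Other.
cities = ["Chicago", "New York", "Paris", "Seoul"]
profiles = np.array([
    [0.35, 0.30, 0.20, 0.15],
    [0.34, 0.31, 0.19, 0.16],
    [0.28, 0.22, 0.30, 0.20],
    [0.55, 0.15, 0.15, 0.15],
])

# Average-linkage hierarchical clustering on Euclidean distances between profiles;
# Z encodes the dendrogram, fcluster cuts it into (at most) two groups.
Z = linkage(profiles, method="average", metric="euclidean")
groups = fcluster(Z, t=2, criterion="maxclust")
```

With these numbers, the two near-identical US-style profiles merge first and the Food-dominated outlier is split off, mirroring the grouping read off the dendrogram in the paper.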
Fig. 5. Airport cluster Chicago

Fig. 6. Hierarchical tree presenting the similarity and diversity between cities

From Fig. 6 we can notice high similarity between the US cities Chicago, New
York and Los Angeles, and between the European cities Paris and London, while Tokyo,
with its community profiles, stands between the US and European cities. Another group
of similar cities includes Istanbul, Jakarta and Singapore, while Seoul has a unique
pattern, completely different from all the other cities. To provide more details, we
present the semantic profiles of the venue clusters for four different cities, Istanbul,
Seoul, Chicago and Tokyo (Fig. 7). From a visual inspection of the profiles we can
notice that the category Food has a high peak in each city, while the category
Residence has a low peak. With a comparative analysis between profiles we can notice
some general trends related to category variability between clusters. We can conclude
that in Chicago and Tokyo people are very likely to move between venues related to the
categories Food and Shop & Service. In Istanbul people are very likely to move between
venues related to the categories Food, Shop & Service and Professional & Other, while
Seoul shows a strong dominance of the Food category. More specific city profiling is
possible by exploring the more detailed subcategories that are present in the clusters.

Fig. 7. Mean profile and standard deviation of categories probability distribution in detected clusters

5 Conclusions

Mobility networks generated by users are a valuable data source for exploring
urban spaces. By performing clustering over graphs made from mobility data, we
gain deeper knowledge about the grouping of mobility flows. With further investigation
of venue semantics inside clusters, we can detect location types and categories
that are frequently visited together by users. Detecting relations between clusters
and venues inside a cluster can help us in building a recommendation application
that would serve users visiting new cities, based on their preferences.
The majority of venues in a cluster are either spatially close or well connected
by transport infrastructure, indicating that users tend to move between locations
within a limited spatial distance, forming in this way urban sub-spaces. Knowledge
about urban sub-spaces that stand out as entities could be a very valuable input
for urban policy making and development, and also for developing new services.
The data set provided in the Future Cities Challenge can be analysed in more
detail. For future work we plan to perform clustering at a higher time resolution
(daily, including day periods such as morning, midday, afternoon, night and
overnight) to get more detailed insights into evolving patterns in the clustering
results.

References
1. Foursquare categories. https://developer.foursquare.com/docs/api/venues/categories. Accessed 20 May 2019
2. Blondel, V.D., Guillaume, J.-L., Lambiotte, R., Lefebvre, E.: Fast unfolding of
communities in large networks. J. Stat. Mech: Theory Exp. 2008(10), P10008
(2008)
3. Cranshaw, J., Schwartz, R., Hong, J., Sadeh, N.: The livehoods project: utilizing
social media to understand the dynamics of a city. In: Sixth International AAAI
Conference on Weblogs and Social Media (2012)
4. Daggitt, M.L., Noulas, A., Shaw, B., Mascolo, C.: Tracking urban activity growth
globally with big location data. R. Soc. Open Sci. 3(4), 150688 (2016)
5. D’Silva, K., Noulas, A., Musolesi, M., Mascolo, C., Sklar, M.: If I build it, will
they come?: Predicting new venue visitation patterns through mobility data. In:
Proceedings of the 25th ACM SIGSPATIAL International Conference on Advances
in Geographic Information Systems, p. 54. ACM (2017)
6. Harush, U., Barzel, B.: Dynamic patterns of information flow in complex networks.
Nat. Commun. 8(1), 2181 (2017)
7. Joseph, K., Tan, C.H., Carley, K.M.: Beyond “local”, “categories” and “friends”:
clustering foursquare users with latent “topics”. In: UbiComp (2012)
8. Karau, H., Konwinski, A., Wendell, P., Zaharia, M.: Learning Spark: Lightning-
Fast Big Data Analytics, 1st edn. O’Reilly Media, Inc., Sebastopol (2015)
9. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in net-
works. Phys. Rev. E 69, 026113 (2004)
10. Noulas, A., Scellato, S., Lathia, N., Mascolo, C.: Mining user mobility features for
next place prediction in location-based services. In: 2012 IEEE 12th International
Conference On Data Mining, pp. 1038–1043. IEEE (2012)
11. Pang, J., Zhang, Y.: Quantifying location sociality. In: Proceedings of the 28th
ACM Conference on Hypertext and Social Media, pp. 145–154. ACM (2017)
12. Preoţiuc-Pietro, D., Cohn, T.: Mining user behaviours: a study of check-in patterns
in location based social networks. In: Proceedings of the 5th Annual ACM Web
Science Conference, WebSci 2013, New York, NY, USA, pp. 306–315. ACM (2013)
13. Silva, T.H., Vaz de Melo, P.O., Almeida, J.M., Salles, J., Loureiro, A.A.: A com-
parison of foursquare and instagram to the study of city dynamics and urban social
behavior. In: Proceedings of the 2nd ACM SIGKDD International Workshop on
Urban Computing, p. 4. ACM (2013)
14. Truică, C.-O., Novović, O., Brdar, S., Papadopoulos, A.N.: Community detection
in who-calls-whom social networks. In: International Conference on Big Data Ana-
lytics and Knowledge Discovery, pp. 19–33. Springer (2018)
15. Yang, L., Duarte, C.M.: Identifying tourist-functional relations of urban places
through foursquare from Barcelona. GeoJournal (2019)
16. Zhang, Z., Zhou, L., Zhao, X., Wang, G., Su, Y., Metzger, M., Zheng, H., Zhao,
B.Y.: On the validity of geosocial mobility traces. In: Proceedings of the Twelfth
ACM Workshop on Hot Topics in Networks, p. 11. ACM (2013)
Innovative Technologies Applied to Rural Regions

The Influence of Digital Marketing Tools Perceived Usefulness in a Rural Region Destination Image

Filipa Jorge1,2, Mário Sérgio Teixeira1,2, and Ramiro Gonçalves1,3

1 University of Trás-os-Montes e Alto Douro (UTAD), Quinta de Prados, 5001-801 Vila Real, Portugal
{filipajorge,mariosergio,ramiro}@utad.pt
2 CETRAD Research Unit (CETRAD), University of Trás-os-Montes e Alto Douro, Quinta de Prados, 5000-801 Vila Real, Portugal
3 Institute for Systems and Computer Engineering, Technology and Science, INESC TEC, Campus da FEUP, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal

Abstract. In rural destinations like the Douro region, tourism activities may be
important to the territory's development because they use some of its natural
resources and endogenous products, develop other related local economy
businesses, and contribute to population fixation. Technologies, namely digital
marketing tools, can support the promotion and distribution of these destinations.
The present study aims to analyze the influence that the perceived usefulness of
some of these tools, from the tourist's point of view, has on a rural destination's
image. To accomplish this, a sample of 555 tourists visiting the Douro destination
was used, and the collected data were analyzed using structural equation modeling.
The results demonstrate that tourists' trust in tourism digital marketing tools
and their attitude toward these tools have an indirect effect on destination image.
Moreover, the perceived usefulness of some digital marketing tools used to
search, plan or purchase the Douro destination, such as websites, booking platforms
and mobile devices, also has a positive effect on this destination's image. On the
other hand, the perceived usefulness of a conventional channel, travel agencies,
has a negative effect on this rural destination's image. This research provides some
interesting insights for practitioners working in tourism destination marketing.

Keywords: Tourism · Technology · Information sources · Digital marketing · Travel agencies

1 Introduction

Destination image is a relevant research field in tourism and, particularly, in tourism
marketing [1]. Several empirical studies have demonstrated that destination image plays a
relevant role in tourists' destination-choice decision process [2] and in tourists' loyalty
to a destination [3]. In spite of the increasing number of empirical studies focused on
destination image, some of which have shown the influence of one particular type of
technology on tourists' destination image perception [4], there is a gap in the analysis of
several technologies together and their influence on the destination image formation process.
Therefore, the present study intends to contribute to the literature with a deeper
understanding of the importance of the perceived usefulness of a set of digital marketing
tools in the image formation of a rural destination.

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 557–569, 2020.
https://doi.org/10.1007/978-3-030-45697-9_54
558 F. Jorge et al.
This research was developed in the Douro Valley, a rural region demarcated for
Porto wine production since 1756 and considered a UNESCO World Heritage site since 2001
due to the importance of its cultural landscape. This region is a wine tourism destination,
providing experiences related to the winemaking process, from harvest to tasting [5].
The paper is structured as follows: the next section presents and justifies the
conceptual model and hypotheses; the third section presents methodological issues; the
fourth section presents the main research results; and the last section presents conclusions,
limitations, and clues for future research.

2 Literature Review and Research Model

The research model presented in Fig. 1 intends to understand some technology-related
antecedents of destination image in a tourism context, such as the perceived usefulness of
some digital marketing tools in the Douro destination travel decision, as opposed to travel
agencies. The way tourism digital marketing tool perceptions can influence the
perceived usefulness of each specific tool mentioned above is also studied.

Fig. 1. Research model

Tourists' trust in tourism digital marketing tools can be defined as a consumer's
subjective belief that an online provider will fulfil the obligations as the consumer
understands them [6]. Attitude toward a technology consists of the positive or negative
feelings that individuals have about using that technology [7].
The Influence of Digital Marketing Tools 559

Tourism product platforms are increasing on the internet because tourism is an
industry that has the necessary characteristics to develop business and make
transactions through these digital channels [8]. In tourism product e-commerce, the
positive and significant effect of trust in a technology on the attitude toward the
same technology has been validated [6, 9]. Therefore, the following hypothesis is proposed:
H1: Tourists’ trust in tourism digital marketing tools has a positive effect on their
attitude toward these tools.
The internet and digital marketing tools have changed the tourism industry [10, 11]
because the internet changed the way tourists search for information about, plan, or
purchase tourism products [12]. Therefore, a digital marketing tool's perceived usefulness
is the degree to which an individual believes that using the tool would enhance his or her
performance during the processes referred to above [13]. Tourists' trust in online
platforms, such as digital marketing tools, remains an issue that requires further
exploration [11]. Trust is crucial for the tourism business because it influences a
tourist's intention to purchase a product [14]. Besides, trust in the internet and trust
in some digital marketing tools have demonstrated an influence on behavioral intentions,
such as the intention to buy or the intention to visit a destination [15–17]. Tourists'
trust in the internet for booking tourism products showed a positive influence on the
perceived usefulness of a particular tourism product e-commerce website [18]. If trust in
tourism digital marketing tools is important to trigger tourists' purchase behavior, it can
also be important during other phases of the decision process, such as searching for
information about a destination and travel planning. In other words, the more tourists
trust these tools, the higher the perceived usefulness of a digital marketing tool will be
during the search for information, planning or purchase process. Considering all the
above, the following five hypotheses are suggested:
H2a: Tourists’ trust in tourism digital marketing tools has a positive impact on
website perceived usefulness.
H2b: Tourists’ trust in tourism digital marketing tools has a positive impact on e-mail perceived usefulness.
H2c: Tourists’ trust in tourism digital marketing tools has a positive impact on s-WOM perceived usefulness.
H2d: Tourists’ trust in tourism digital marketing tools has a positive impact on
booking perceived usefulness.
H2e: Tourists’ trust in tourism digital marketing tools has a positive impact on
mobile devices perceived usefulness.
In the technology literature, it has been verified that a positive attitude toward a
particular technology is a determinant of its actual use or of the intention to use it [13].
Research focused on technology use in the tourism field also reveals empirical evidence
that tourists' positive attitude toward a technology has an impact on their intention to
use it [17, 19] or on their intention to visit a destination [20]. Given that the use of
tourism digital marketing tools implies using some technology, it is important to verify
whether a general positive attitude toward digital marketing tools influences the
perceived usefulness of each digital marketing tool used during the search for
information, planning and purchase process of the Douro destination. Therefore, the
following five hypotheses are proposed:

H3a: Tourists’ attitude toward tourism digital marketing tools has a positive effect
on website perceived usefulness.
H3b: Tourists’ attitude toward tourism digital marketing tools has a positive effect
on e-mail perceived usefulness.
H3c: Tourists’ attitude toward tourism digital marketing tools has a positive effect
on s-WOM perceived usefulness.
H3d: Tourists’ attitude toward tourism digital marketing tools has a positive effect
on booking perceived usefulness.
H3e: Tourists’ attitude toward tourism digital marketing tools has a positive effect
on mobile devices perceived usefulness.
Tourists' consumption behavior is changing: they require more information, are
becoming more independent, and make their tourism product purchases through multiple
channels [14]. Travel agencies were the most affected by the introduction of
e-commerce in the tourism industry [21], because these changes in tourists' behavior may
represent a challenge to travel agencies [12, 14]. Tourists' motivations to use travel
agencies or online platforms differ: tourists who use travel agencies require a
personalized service and, generally, are more traditional [21, 22]. If tourists have a
positive attitude toward tourism digital marketing tools, it is expected that they will
perceive travel agencies as less useful for searching for information about, planning or
purchasing a destination. Therefore, the following hypothesis is suggested:
H3f: Tourists’ attitude toward tourism digital marketing tools has a negative effect
on travel agencies perceived usefulness.
According to [22], destination image is defined as "the perceptions held by
potential visitors about a destination" (p. 1). Tourism destination image research has
been conducted for many years [23], and several studies have demonstrated that destination
image can be analyzed pre- and post-visit [4, 24]. Some destination image formation models
recognize information sources as one of the determinants influencing tourists'
destination image, which means that tourists use information sources to formulate
perceptions and evaluations about a destination [1, 24].
Several digital marketing tools are represented in the conceptual model through their
perceived usefulness in the Douro destination travel decision from the tourists' point of
view. All these tools are communication channels that destinations use to promote or
distribute their tourism products, representing different information sources for potential
tourists [11, 12]. Besides, a conventional information source, travel agencies'
perceived usefulness, was included to analyze an opposite type of information source.
Considering this, we propose to evaluate the positive impact that the perceived
usefulness of each digital marketing tool used to search, plan or purchase the Douro
destination would have on this destination's image. On the other hand, it is also
evaluated whether the perceived usefulness of travel agencies for searching for
information about, planning or purchasing the Douro destination would have a positive
impact on this destination's image. Therefore, the following hypotheses are proposed:
H4a: Website perceived usefulness has a positive impact on destination image.
H4b: e-mail perceived usefulness has a positive impact on destination image.
H4c: s-WOM perceived usefulness has a positive impact on destination image.

H4d: Booking perceived usefulness has a positive impact on destination image.
H4e: Mobile devices perceived usefulness has a positive impact on destination image.
H4f: Travel agencies perceived usefulness has a positive impact on destination image.

3 Methodology

In order to conduct this research, a survey method was implemented and quantitative
methods were applied. A cross-sectional design was used, and the primary data
were obtained through a questionnaire, applied in person during the tourists'
stay in the Douro destination. The tourists visiting this region constitute the research
population, and the sample is composed of 555 tourists who stayed in the Douro for at
least two days. The sample is stratified by tourists' country of origin and county of
stay, and data were collected in the summer of 2018. Table 1 presents the sample profile.

Table 1. Sample characteristics (n = 555)


Gender Male 255 45,9%
Female 300 54,1%
Age 20 or less 7 1,3%
21 to 30 87 15,7%
31 to 40 147 26,5%
41 to 50 129 23,2%
51 to 60 115 20,7%
61 to 70 59 10,6%
More than 70 11 2,0%
Education level Primary or below 24 4,3%
Secondary 243 43,8%
University 288 51,9%
Household income Under €1500 152 27,4%
€1501-€2500 229 41,3%
€2501-€3500 141 25,4%
Over €3500 33 5,9%
Purchasing experience of tourism products online
  Never 3 0,5%
  Less than a year 61 11,0%
  Between 1 and 3 years 140 25,3%
  More than 3 years 351 63,2%

In the questionnaire, the constructs used were based on previous empirical
literature and measured on a seven-point Likert scale, ranging from 1
("strongly disagree") to 7 ("strongly agree").

To test the hypothesized relationships between constructs, we used Structural
Equation Modeling (SEM). The estimation of the model was performed with Partial
Least Squares, implemented in the SmartPLS software, version 3.2.8 for Windows
[25]. The other analyses were performed with SPSS, version 22 for Windows [26].

4 Results

In the conceptual model presented above, nine constructs are used: eight of them
are reflective and one, destination image, is formative. To evaluate the formative
construct's measurement model, it is necessary to take into consideration the outer
weights and loadings of each indicator and their VIFs, reported in Table 2. Three
indicators have non-significant outer weights, but as these three indicators have
significant loadings greater than 0.5, they demonstrate their relevance and significance
to the construct. These indicators' VIF values are smaller than 3.0, which demonstrates
that there are no collinearity issues [27].

Table 2. Formative measurement model


Construct Outer Weights Loadings VIF
DI  DI1  0,421***    0,887***  2,320
    DI2  0,284***    0,818***  2,199
    DI3  −0,045 n.s. 0,544***  1,985
    DI4  0,086 n.s.  0,722***  1,966
    DI5  −0,054 n.s. 0,496***  1,795
    DI6  0,430***    0,890***  2,571
Note: n.s. = not significant. *p < 0.10. **p < 0.05. ***p < 0.01.

For the reflective constructs, we evaluated indicator reliability, internal consistency,
convergent validity, and discriminant validity. Indicator reliability was assessed based
on the criterion that loadings should be larger than 0.70 [28]. As shown in Table 5, the
instrument fulfills this criterion. Construct reliability was evaluated using Composite
Reliability (CR). As shown in Table 3, all constructs exhibit CR above 0.70, evidencing
that they are reliable [29]. Internal consistency was tested using Cronbach's alpha: as
expressed in Table 3, all constructs have Cronbach's alpha above 0.70, suggesting that
they are reliable [27]. Convergent validity was assessed using the Average Variance
Extracted (AVE): according to [30], AVE should be higher than 0.50, and all constructs
fulfill this criterion.
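For reference, CR and AVE can be computed directly from standardized loadings using the standard formulas CR = (Σλ)² / ((Σλ)² + Σ(1 − λ²)) and AVE = Σλ² / n; the four loadings below are hypothetical, not taken from the study's instrument:

```python
def composite_reliability(loadings):
    """CR = (sum(l))^2 / ((sum(l))^2 + sum(1 - l^2)) for standardized loadings l."""
    s = sum(loadings)
    error = sum(1.0 - l * l for l in loadings)
    return s * s / (s * s + error)

def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

# Hypothetical loadings for a four-item reflective construct
lam = [0.88, 0.90, 0.85, 0.87]
cr = composite_reliability(lam)            # well above the 0.70 threshold
ave = average_variance_extracted(lam)      # well above the 0.50 threshold
```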

Table 3. Reflective measurement model


Construct Number of items Cronbach’s Alpha CR AVE
WB 5 0,932 0,949 0,787
EM 4 0,961 0,972 0,896
s-WOM 5 0,956 0,966 0,850
BK 4 0,949 0,963 0,868
MB 4 0,949 0,963 0,866
TA 4 0,984 0,988 0,955
ATT 4 0,901 0,931 0,771
TR 3 0,885 0,929 0,813

As demonstrated in Tables 4 and 5, all constructs have discriminant validity.
Discriminant validity has two criteria: (a) loadings are larger than cross-loadings,
which is confirmed in Table 5; and (b) the square root of the AVE for each construct
should be higher than its correlations with all the other constructs, which is
confirmed in Table 4 [30].
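Criterion (b), the Fornell-Larcker check, can be stated compactly in code; the two AVE values and the correlation used in the usage example are taken from Tables 3 and 4 (constructs WB and MB):

```python
import math

def fornell_larcker_ok(ave, corr):
    """Check that sqrt(AVE) of each construct exceeds the absolute value of
    its correlation with every other construct (Fornell-Larcker criterion).

    `ave` maps construct name -> AVE; `corr` maps frozenset pairs -> correlation.
    """
    names = list(ave)
    for a in names:
        root = math.sqrt(ave[a])
        for b in names:
            if a != b and abs(corr[frozenset((a, b))]) >= root:
                return False
    return True

ave = {"WB": 0.787, "MB": 0.866}
corr = {frozenset(("WB", "MB")): 0.660}
ok = fornell_larcker_ok(ave, corr)  # True: sqrt(0.787) = 0.887 > 0.660
```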

Table 4. Fornell-Larcker criterion


DI WB EM s-WOM BK MB TA ATT TR
DI F*
WB 0,573 0,887
EM 0,150 0,210 0,946
s-WOM 0,421 0,586 0,414 0,922
BK 0,490 0,633 0,331 0,546 0,932
MB 0,593 0,660 0,122 0,457 0,555 0,931
TA −0,275 −0,273 −0,005 −0,252 −0,300 −0,238 0,977
ATT 0,498 0,566 0,184 0,443 0,432 0,580 −0,160 0,878
TR 0,469 0,551 0,247 0,425 0,470 0,577 −0,130 0,678 0,902
Note: DI is a formative construct. Diagonal elements are the square root of AVE (bold);
below these values are the correlations with others constructs.

Table 5. Cross loadings


DI WB EM s-WOM BK MB TA ATT TR
DI1 0,887 0,493 0,101 0,374 0,447 0,536 −0,231 0,419 0,413
DI2 0,818 0,474 0,142 0,351 0,393 0,475 −0,253 0,420 0,383
DI3 0,544 0,323 0,097 0,220 0,257 0,374 0,065 0,368 0,357
DI4 0,722 0,471 0,169 0,353 0,370 0,367 −0,237 0,347 0,285
DI5 0,496 0,288 0,076 0,176 0,255 0,343 0,059 0,325 0,326
DI6 0,890 0,512 0,141 0,355 0,425 0,548 −0,183 0,479 0,454
564 F. Jorge et al.

WB1 0,498 0,879 0,159 0,481 0,543 0,559 −0,244 0,476 0,457
WB2 0,518 0,898 0,216 0,526 0,579 0,586 −0,224 0,493 0,463
WB3 0,508 0,888 0,185 0,531 0,567 0,597 −0,222 0,527 0,495
WB4 0,494 0,894 0,176 0,520 0,554 0,604 −0,259 0,499 0,495
WB5 0,521 0,876 0,193 0,537 0,561 0,580 −0,261 0,513 0,528
EM1 0,152 0,191 0,951 0,394 0,338 0,115 0,000 0,177 0,245
EM2 0,144 0,197 0,943 0,394 0,308 0,099 −0,027 0,175 0,225
EM3 0,120 0,197 0,940 0,404 0,273 0,111 0,003 0,179 0,230
EM4 0,149 0,210 0,952 0,376 0,332 0,136 0,005 0,165 0,237
s-WOM1 0,422 0,550 0,373 0,926 0,504 0,447 −0,220 0,419 0,430
s-WOM2 0,377 0,517 0,380 0,908 0,486 0,383 −0,246 0,399 0,370
s-WOM3 0,360 0,550 0,388 0,927 0,506 0,427 −0,225 0,413 0,394
s-WOM4 0,389 0,557 0,400 0,922 0,539 0,422 −0,249 0,398 0,373
s-WOM5 0,390 0,526 0,368 0,926 0,484 0,427 −0,225 0,411 0,387
BK1 0,453 0,602 0,329 0,510 0,935 0,514 −0,273 0,412 0,445
BK2 0,461 0,605 0,298 0,505 0,938 0,526 −0,277 0,400 0,431
BK3 0,451 0,573 0,311 0,511 0,930 0,513 −0,276 0,398 0,455
BK4 0,460 0,578 0,297 0,510 0,923 0,517 −0,292 0,399 0,419
MB1 0,546 0,601 0,101 0,398 0,502 0,934 −0,219 0,522 0,502
MB2 0,569 0,628 0,099 0,447 0,539 0,932 −0,231 0,560 0,564
MB3 0,543 0,600 0,128 0,424 0,517 0,924 −0,225 0,547 0,545
MB4 0,548 0,628 0,125 0,432 0,509 0,934 −0,209 0,527 0,537
TA1 −0,269 −0,273 −0,021 −0,247 −0,299 −0,239 0,978 −0,158 −0,132
TA2 −0,279 −0,280 −0,004 −0,252 −0,298 −0,239 0,978 −0,169 −0,127
TA3 −0,256 −0,247 −0,008 −0,244 −0,284 −0,223 0,972 −0,147 −0,121
TA4 −0,267 −0,266 0,014 −0,243 −0,290 −0,227 0,980 −0,149 −0,128
ATT1 0,415 0,461 0,200 0,392 0,364 0,483 −0,124 0,898 0,609
ATT2 0,440 0,569 0,131 0,408 0,391 0,520 −0,154 0,878 0,604
ATT3 0,447 0,470 0,138 0,374 0,368 0,530 −0,117 0,849 0,570
ATT4 0,446 0,482 0,177 0,380 0,393 0,502 −0,163 0,886 0,597
TR1 0,383 0,444 0,215 0,327 0,386 0,480 −0,057 0,575 0,861
TR2 0,431 0,509 0,215 0,396 0,445 0,536 −0,169 0,626 0,917
TR3 0,451 0,533 0,239 0,421 0,438 0,543 −0,121 0,631 0,927

The assessment of the formative construct measurement model and of the reflective constructs indicates that there are adequate conditions to test the proposed conceptual model.
The hypotheses formulated (the conceptual model of Fig. 1) were tested by estimating a structural equation model using partial least squares (PLS). We also tested possible mediator effects through indirect effects and confirmatory analysis to estimate the validity of the constructs. Significance was tested using the bootstrapping technique with 555 cases, 500 subsamples, and no sign changes.
The proposed model explains 42.3% of the total variance in destination image. Table 6 presents the estimated path coefficients and their significance.
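The bootstrapping step can be illustrated outside PLS software. A self-contained sketch, using a simple regression slope as a stand-in for a path coefficient and synthetic data in place of the survey responses:

```python
import random
import statistics

# Bootstrap significance test mirroring the paper's setup: resample the 555
# cases with replacement 500 times, re-estimate the coefficient on each
# subsample, and compare the original estimate with the bootstrap standard
# error. Synthetic data stands in for the survey responses.
random.seed(7)
n = 555
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.3 * xi + random.gauss(0, 1) for xi in x]  # true path of about 0.3

def slope(xs, ys):
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return num / sum((a - mx) ** 2 for a in xs)

boot = []
for _ in range(500):  # 500 bootstrap subsamples, as in the paper
    idx = [random.randrange(n) for _ in range(n)]
    boot.append(slope([x[i] for i in idx], [y[i] for i in idx]))

t_value = slope(x, y) / statistics.stdev(boot)  # estimate / bootstrap SE
print(round(slope(x, y), 3), round(t_value, 1))
```

In SmartPLS this corresponds to the standard bootstrapping routine; the t-value plays the role of the significance tests reported in Table 6.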

Table 6. Structural relationship direct effects results


Hypothesis Path coefficient Sig Supported
H1 0,678*** (0,000) Yes
H2a 0,309*** (0,000) Yes
H2b 0,227*** (0,000) Yes
H2c 0,231*** (0,000) Yes
H2d 0,328*** (0,000) Yes
H2e 0,341*** (0,000) Yes
H3a 0,356*** (0,000) Yes
H3b 0,030n.s. (0,590) No
H3c 0,286*** (0,000) Yes
H3d 0,210*** (0,004) Yes
H3e 0,348*** (0,000) Yes
H3f −0,160*** (0,001) Yes
H4a 0,230*** (0,001) Yes
H4b 0,005n.s. (0,889) No
H4c 0,053n.s. (0,247) No
H4d 0,098* (0,074) Yes
H4e 0,341*** (0,000) Yes
H4f −0,088* (0,069) Yes, but with a negative coefficient
Note: n.s. = not significant. *p < 0.10. **p < 0.05. ***p < 0.01.

Almost all hypotheses were supported, with three exceptions: H3b, H4b, and H4c.
H3b was not significant, meaning that tourists' attitude toward tourism digital marketing tools had no effect on the perceived usefulness of e-mails received while searching for information about, planning, and purchasing the Douro destination. Hypothesis H4b was also not supported by our data: the perceived usefulness of e-mails received about the Douro destination during the information search, planning, or purchasing process had no impact on its destination image. A possible explanation for these results is that tourists either did not receive e-mails about the Douro destination or perceived the e-mails they did receive as unimportant, and therefore did not draw on the information presented in e-mails in their destination choice process, either when forming their attitude toward tourism product platforms or when constructing their perceptions of the destination.
The result for hypothesis H4c reveals that s-WOM has no significant effect on destination image, which means that social media content about the destination has little influence on Douro destination image formation, although several previous empirical studies have verified this influence [31, 32]. A possible explanation is that the Douro destination is promoted by a DMO responsible for other territories as well, with little content on its social media pages related specifically to the Douro destination [33].
Hypothesis H4f was significant, but not with the positive effect hypothesized in the conceptual model: perceived usefulness influences destination image in opposite directions for digital marketing tools and for travel agencies. This negative effect may be explained by our sample's low perceived usefulness of travel agencies for searching information about, planning, and purchasing the Douro destination, since most individuals habitually purchase tourism products online, as can be verified in Table 1. Another explanation may lie in the Douro destination's size: this research focuses on a small, rural destination, for which travel agencies may find it difficult to offer a diverse range of tourism products.
In Table 7, indirect effects and their significances are presented.

Table 7. Structural relationship indirect effects results


Indirect effect Coefficient Sig Supported
TR→WB 0,241*** (0,000) Yes
TR→EM 0,020n.s. (0,593) No
TR→s-WOM 0,194*** (0,000) Yes
TR→BK 0,142*** (0,005) Yes
TR→MB 0,236*** (0,000) Yes
TR→TA −0,108*** (0,001) Yes
TR→DI 0,403*** (0,000) Yes
ATT→DI 0,250*** (0,000) Yes
Note: n.s. = not significant. *p < 0.10. **p < 0.05. ***p < 0.01.

A recent literature review [11] identified consumers' trust in online platforms as a challenge and an under-examined issue. According to our indirect-effect results, trust in tourism digital marketing tools has a positive and significant effect on the perceived usefulness of almost all the digital marketing tools studied, except e-mail. These results show that the more tourists trust tourism digital marketing tools, the more useful they perceive websites, s-WOM, booking, and mobile devices to be for searching information about, planning, or purchasing the Douro destination.
Trust in tourism digital marketing tools has no indirect effect on e-mail perceived usefulness, although the direct relationship between these two constructs is positive and significant. This indirect effect combines the direct effect of trust on attitude toward digital marketing tools with the effect of that attitude on e-mail perceived usefulness.
In addition, trust in tourism digital marketing tools and attitude toward these tools reveal a positive and significant indirect effect on destination image.
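Our reading of Tables 6 and 7 is that each indirect effect of trust on a tool's perceived usefulness is the product of its two component direct paths: trust to attitude (H1, 0,678) times attitude to each tool (H3a-H3f). A quick check of that decomposition; note the hypothesis-to-path mapping is our inference, not stated verbatim in the text:

```python
# Indirect effects reconstructed as products of direct paths: TR -> ATT (H1)
# multiplied by ATT -> tool (H3a-H3f). The products reproduce the Table 7
# coefficients, supporting this reading of the model.
h1_tr_att = 0.678
h3_att_tool = {"WB": 0.356, "EM": 0.030, "s-WOM": 0.286,
               "BK": 0.210, "MB": 0.348, "TA": -0.160}

indirect = {tool: round(h1_tr_att * b, 3) for tool, b in h3_att_tool.items()}
print(indirect)
# {'WB': 0.241, 'EM': 0.02, 's-WOM': 0.194, 'BK': 0.142, 'MB': 0.236, 'TA': -0.108}
```
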

5 Final Considerations

Rural destinations use the same information sources to promote themselves as larger destinations, such as capital cities. However, these larger destinations enjoy greater awareness than rural destinations, so rural destinations must stand out to be valued and purchased by potential tourists. To accomplish this, technology in general, and digital marketing tools in particular, can be of great importance: they can act as a trigger for the desired tourist behavior and allow wider diffusion to potential tourists at a lower cost, compensating for the smaller size and marketing capacity of rural destinations.
This research makes a theoretical contribution by revealing the influence that the perceived usefulness of some digital marketing tools, in particular websites, booking, and mobile devices, has on destination image. On the other hand, the perceived usefulness of travel agencies as an information source about the Douro destination had a significant relation with this rural destination's image, but, contrary to expectations, a negative one. This contributes to the discussion about travel agencies' role in the decision process in the internet era and their influence on destination image formation, because today's tourists are accustomed to using online platforms and tools to search for information about, plan, and purchase destinations.
Moreover, tourists' trust in digital marketing tools in the tourism product purchase process, and their attitude toward these tools, influenced the perceived usefulness of the individual tools analyzed in this research, except e-mail. This exception may suggest that tourists are tired of receiving e-mails promoting tourism products and no longer consider them as useful as other digital marketing tools.
Trust and attitude toward tourism digital marketing tools reveal indirect effects on Douro destination image. This last result seems to indicate that tourists' trust in, and positive attitude toward, these tools in general, during their search for information about, planning, and purchase of the Douro destination, influence its image through the perceived usefulness of some of these tools.
As a practical contribution, this research's results on the influence of digital marketing tools' perceived usefulness on Douro destination image indicate that private and public organizations involved in tourism destination marketing must improve the digital marketing strategies behind their destination communication processes. The use of innovative technologies associated with these digital marketing tools enables rural destination differentiation. In particular, the evidence of the impact that digital marketing tools such as websites, booking, and mobile devices can have on destination image indicates the increasing weight these technologies must carry in communication strategies and actions.
Non-significant or negative results also offer indications to practitioners. For instance, the non-significant influence of s-WOM on Douro destination image indicates that public or private entities responsible for promoting tourism products or destinations should maintain an active presence on social media platforms. These entities should encourage tourists who have recently visited the destination to publish content about it, for example through destination labels such as hashtags, and should make tourists aware, during their stay, of the importance of creating and sharing positive content related to the destination or its tour operators. On the other hand, the negative result for the influence of travel agencies on Douro destination image indicates that destinations should also work with travel agencies to make their tourism products available and promote them to agency customers. Destination management organizations or operators may also provide information or training to travel agencies so that they have accurate and up-to-date knowledge about the destination, which may improve the image their customers perceive.
For this research, data were collected at a single point in time, so we cannot explore the evolution of tourists' perceptions of destination image. Future studies should consider analyzing that evolution over time in longitudinal research, or when tourists are exposed to different technological information sources. In addition, further research should provide deeper analysis to understand the reasons or motivations explaining why some digital marketing tools have such influence on rural destination image, as was the case for Douro.

Acknowledgments. This work is financed by the ERDF – European Regional Development Fund through the Operational Programme for Competitiveness and Internationalisation - COMPETE 2020 Programme and by National Funds through FCT - Fundação para a Ciência e a Tecnologia within project POCI-01-0145-FEDER-031309 entitled “PromoTourVR - Promoting Tourism Destinations with Multisensory Immersive Media”.

References
1. Baloglu, S., McCleary, K.W.: A model of destination image formation. Ann. Tour. Res. 26
(4), 868–897 (1999)
2. Isaac, R.K., Eid, T.A.: Tourists’ destination image: an exploratory study of alternative
tourism in Palestine. Curr. Issues Tour. 22(12), 1499–1522 (2019)
3. Wu, C.W.: Destination loyalty modeling of the global tourism. J. Bus. Res. 69(6), 2213–
2219 (2016)
4. Zhang, H., Fu, X., Cai, L.A., Lu, L.: Destination image and tourist loyalty: a meta-analysis.
Tour. Manag. 40(February), 213–223 (2014)
5. Martins, J., Gonçalves, R., Branco, F., Barbosa, L., Melo, M., Bessa, M.: A multisensory
virtual experience model for thematic tourism: a port wine tourism application proposal.
J. Destin. Mark. Manag. 6(2), 103–109 (2017)
6. Agag, G.M., El-Masry, A.A.: Why do consumers trust online travel websites? Drivers and
outcomes of consumer trust toward online travel websites. J. Travel Res. 56(3), 347–369
(2016)
7. Venkatesh, V., Morris, M.G., Davis, G.B., Davis, F.D.: User acceptance of information
technology: toward a unified view. MIS Q. 27(3), 425–478 (2003)
8. Bonsón Ponte, E., Carvajal-Trujillo, E., Escobar-Rodríguez, T.: Influence of trust and
perceived value on the intention to purchase travel online: integrating the effects of assurance
on trust antecedents. Tour. Manag. 47, 286–302 (2015)
9. Ayeh, J.K., Au, N., Law, R.: Investigating cross-national heterogeneity in the adoption of
online hotel reviews. Int. J. Hosp. Manag. 55, 142–153 (2016)
10. Buhalis, D., Law, R.: Progress in information technology and tourism management: 20 years
on and 10 years after the Internet-The state of eTourism research. Tour. Manag. 29(4), 609–
623 (2008)

11. Navío-Marco, J., Ruiz-Gómez, L.M., Sevilla-Sevilla, C.: Progress in information technology
and tourism management: 30 years on and 20 years after the internet - Revisiting Buhalis &
Law’s landmark study about eTourism. Tour. Manag. 69, 460–470 (2018)
12. Del Chiappa, G., Alarcón-Del-Amo, M.-D.-C., Lorenzo-Romero, C.: Internet and user-
generated content versus high street travel agencies: a latent gold segmentation in the context
of Italy. J. Hosp. Mark. Manag. 25(2), 197–217 (2016)
13. Davis, F.D.: Perceived usefulness, perceived ease of use, and user acceptance. MIS Q. 13(3),
319–339 (1989)
14. Rajaobelina, L.: The impact of customer experience on relationship quality with travel
agencies in a multichannel environment. J. Travel Res. 57(2), 206–217 (2018)
15. Sahli, A.B., Legohérel, P.: The tourism Web acceptance model. J. Vacat. Mark. 22(2), 179–
194 (2016)
16. Agag, G., El-Masry, A.A.: Understanding consumer intention to participate in online travel
community and effects on consumer intention to purchase travel online and WOM: an
integration of innovation diffusion theory and TAM with trust. Comput. Hum. Behav. 60,
97–111 (2016)
17. Besbes, A., Legohérel, P., Kucukusta, D., Law, R.: A cross-cultural validation of the tourism
web acceptance model (T-WAM) in different cultural contexts. J. Int. Consum. Mark. 1530
(Apr), 1–16 (2016)
18. Jeng, C.-R.: The role of trust in explaining tourists’ behavioral intention to use e-booking
services in Taiwan. J. China Tour. Res. 15(4), 478–489 (2019)
19. Morosan, C.: Toward an integrated model of adoption of mobile phones for purchasing
ancillary services in air travel. Int. J. Contemp. Hosp. Manag. 26(2), 246–271 (2014)
20. Chung, N., Lee, H., Lee, S.J., Koo, C.: The influence of tourism website on tourists’
behavior to determine destination selection: a case study of creative economy in Korea.
Technol. Forecast. Soc. Change 96, 130–143 (2015)
21. Devece, C., Garcia-Agreda, S., Ribeiro-Navarrete, B.: The value of trust for travel agencies
in achieving customers’ attitudinal loyalty. J. Promot. Manag. 21(4), 516–529 (2015)
22. Hunt, J.D.: Image as a factor in tourism development. J. Travel Res. 13(3), 1–7 (1975)
23. de la Hoz-Correa, A., Muñoz-Leiva, F.: The role of information sources and image on the
intention to visit a medical tourism destination: a cross-cultural analysis. J. Travel Tour.
Mark. 36(2), 204–219 (2019)
24. Beerli, A., Martín, J.D.: Factors influencing destination image. Ann. Tour. Res. 31(3), 657–
681 (2004)
25. Ringle, C.M., Wende, S., Becker, J.-M.: SmartPLS 3. SmartPLS GmbH, Boenningstedt (2015)
26. IBM Corp: IBM SPSS Statistics for Windows, Version 25.0. IBM Corp, Armonk, NY
27. Hair, J.F., Hult, G.T.M., Ringle, C., Sarstedt, M.: A Primer on Partial Least Squares
Structural Equation Modeling (PLS-SEM), 2nd edn. SAGE Publications Inc., Thousand
Oaks (2017)
28. Churchill, G.A.: A paradigm for developing better measures of marketing constructs.
J. Mark. Res. 16(1), 64 (1979)
29. Straub, D.W.: Validating instruments in MIS research. MIS Q. 13(2), 147 (1989)
30. Fornell, C., Larcker, D.F.: Evaluating structural equation models with unobservable
variables and measurement error. J. Mark. Res. 18(1), 39 (1981)
31. Jalilvand, M.R., Samiei, N.: The effect of electronic word of mouth on brand image and
purchase intention. Mark. Intell. Plan. 30(4), 460–476 (2012)
32. Abubakar, A.M., Ilkan, M., Al-tal, R.M., Eluwole, K.K.: eWOM, revisit intention,
destination trust and gender. J. Hosp. Tour. Manag. 31, 220–227 (2017)
33. Jorge, F., Teixeira, M.S., Fonseca, C., Correia, R.J., Gonçalves, R.: Social media usage
among wine tourism DMOs. In: Marketing and Smart Technologies, pp. 78–87 (2020)
Ñawi Project: Visual Health for Improvement
of Education in High Andean Educational
Communities in Perú

Xavi Canaleta1, Eva Villegas1, David Fonseca1(&), Rafel Zaragoza2,
Guillem Villa1, David Badia1, and Emiliano Labrador1

1 La Salle – Universitat Ramon Llull, 08022 Barcelona, Spain
{xavier.canaleta,eva.villegas,david.fonseca,guillem.villa,david.badia,emiliano.labrador}@salle.url.edu
2 Centre de Visió i Audiologia, 08250 Sant Joan de Vilatorrada, Spain
[email protected]

Abstract. The UN General Assembly adopted the 2030 Agenda for Sustainable Development, an action plan in support of people, the planet, and prosperity. Among its 17 Sustainable Development Goals (SDGs) are quality education (the fourth goal) and the reduction of inequalities (the tenth). Rural communities tend to be among the most disadvantaged environments, and education for development is one of the most effective mechanisms to alleviate these inequalities. In this framework, the Urubamba Project for International Cooperation is presented, and within it, the Ñawi Project for visual education. Both projects are developed in the Andean areas of Cusco (Peru) through university community participation. In these lines of action, Information and Communication Technologies (ICTs) are presented as a relevant and effective element for achieving their objectives: the introduction of ICTs in education to create inclusive socio-educational environments, and ICTs as an analysis tool for the visual education project (prevention and correction). This work focuses on the Ñawi Project and the satisfactory results that have been obtained.

Keywords: Visual health · Education · High Andean communities · ICT in education · Inclusive learning

1 Introduction

Rural regions, long overlooked in the development of innovative initiatives, are beginning to attract the interest of social projects that direct their objectives at improving local and social resources. New technologies, with their potential to support new educational, social, and professional solutions and actions, facilitate the development of such projects, make them viable, and improve their efficiency in rural areas with few resources. In this sense, identifying and evaluating the indicators that allow improving the conditions of a historically unprotected environment is essential, not only for the region itself, but also so that the approach is carried out from a scientific, controlled, and replicable point of view that allows its export to other regions and contexts.

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 570–578, 2020.
https://doi.org/10.1007/978-3-030-45697-9_55
Ñawi Project: Visual Health for Improvement of Education 571

Within the framework of actions that the UNESCO Chair in Education, Development and Technology of Ramon Llull University has been promoting since 2013, La Salle Campus Barcelona (together with its ONGD Proide Campus) has been developing the Urubamba Project for International Cooperation [1]. This project was created in coordination with the La Salle Urubamba Institute in the Sacred Valley of the Incas (Cusco, Peru). Its initial objective has focused on carrying out different actions with a common thread: education for development. At this point, information technology has been an element of motivation, cost efficiency, and effectiveness in the actions undertaken. The High Andean educational communities, located at more than 4,000 meters of altitude, are a point of special interest in this project given their low level of resources and limited growth.
In this way, different projects have been implemented in the seven campaigns
between 2013 and 2019. The Haku Wiñay Project (learning together) enhanced the
training of teachers and students. The e-Yachay Project (e-learning) introduced
Information Technology with the objective of creating inclusive socio-educational
environments. The Kurku Kallpanchay Project (strengthening the body) enhanced
gender equality and cooperative work through physical education. And more recently
the Willachikuy Project (communication) created emergency communication structures
for isolated High Andean educational communities.
It was precisely the e-Yachay Project that motivated the creation of the Ñawi Project. The introduction of ICT in High Andean educational institutions changes students' learning habits. The introduction of computer classrooms, educational software, and office automation tools can lead to different actions and reactions in children's visual habits. This research focuses on the results obtained from the Ñawi (eye) visual education project. Students' vision is a crucial issue that directly affects their learning: a high percentage of students' learning deficiencies are due to vision problems that have not been detected or corrected. Technology has a relevant role in analyzing the results obtained, extracting relevant information automatically and, through data mining techniques, obtaining knowledge that facilitates the improvement of the project's actions in the coming years' campaigns.
The article continues in Sect. 2 with the contextualization of visual health terminology. Section 3 describes the method of the project's application in the study environment. Section 4 presents the main results obtained from the analysis of the collected data, finishing with the conclusions in Sect. 5 and the subsequent acknowledgments.

2 Contextualization

A large part of the Urubamba Project focuses on actions in the neediest areas of the Cusco region (Peru): the peasant communities. More specifically, the project focuses on High Andean educational institutions. These are public schools located at more than 4,000 m of altitude that may be close to a peasant community or isolated in the Andean mountains. They are the only means peasant communities have to bring education to their sons and daughters. In addition to facing harsh orographic and climatic conditions, these schools have few resources to carry out their educational work. Many of these institutions do not have any
572 X. Canaleta et al.

type of medical coverage, much less access to eye examinations for their students.
Visual health is an indispensable element in teaching-learning processes. The lack of detection of visual problems, and misinformation in this field, are common throughout this High Andean zone. We define visual health [2] as a visual system free of diseases of the sense of sight and of the structures of the eye, while enjoying good visual acuity. Visual acuity [3] is defined as the ability to identify letters, numbers, or symbols at a specific, standardized distance in an eye test.
Even with good visual health, refractive defects such as myopia, farsightedness, or astigmatism may occur. In most cases, a refractive defect is due to genetics, but there are cases in which it appears because of poor visual habits. According to the WHO (World Health Organization) [4], less developed regions have a higher proportion of visual defects. The child population is the most delicate, since children are in a learning period and may face difficulties both in school progress and in the psychosocial development process. Visual ergonomics [5] is defined as the correct adaptation of the work environment to visual needs, that is, adopting appropriate body postures to favor visual tasks, correct lighting, adequate working distances, and guidelines and pauses for visual rest.
In the Ñawi Project, refractive defects are the priority to be taken into account, since improving visual ergonomics is essential for improving education. Thus, the project focuses on two main objectives: the detection of eye problems and the education of the population to prevent, avoid, and detect these problems.

3 Application

The actions were carried out in two different locations: the town of Urubamba and Educational Institution No. 50187 of Pampallacta. Urubamba is a town located in the valley, at 2,980 m of altitude, with approximately 6,000 inhabitants. Pampallacta is an isolated educational community located in the High Andean area at 4,000 m above sea level, without any nearby peasant community. These locations were chosen to contrast data from two realities that are geographically close but far apart in terms of situation and means.
The educational function of the Ñawi Project aims to raise awareness among students and families of the importance of sight and of check-ups to avoid, prevent, or minimize visual damage. This was one of the essential actions completed by the Urubamba 2017 team. For this, a small manual was produced indicating the most relevant aspects to be taken into account for correct visual health:
• Eye anatomy: explanation of what a healthy eye is and of various refractive defects such as myopia, farsightedness, and astigmatism.
• Visually correct positions: recommendations about posture, lighting, working distance, reading angle, and rest times.

• Eye diseases: indications of when it is necessary to see a doctor, covering the most relevant pathologies considering the patient's environment (altitude and, therefore, higher incidence of solar rays): photophobia, conjunctivitis, keratitis, cataracts, pinguecula, pterygium. The manual also covers common effects of eye trauma, such as blows with or without a wound, how to act when fluid enters the eye, and the suspicion of retinopathies or other pathologies.
To optimize the large volume of examinations to be performed, the following procedure was used:
• Step 1: Listing of the people to be examined and prescribed for, and collection of personal data to allow personalized follow-up.
• Step 2: Evaluation of 3D vision using the Titmus Stereo Test [6].
• Step 3: Evaluation of color vision using the Ishihara test [7].
• Step 4: Performed only in Urubamba, for logistical reasons. Evaluation of ocular movement through the cover test [8] and the NPC (Near Point of Convergence) [9], which gives a basic idea of the patient's binocular vision.
• Step 5: Optometric examination by retinoscopy and/or a subjective test by the patient. The subjective refraction examination consists of a monocular exchange of lenses until the patient reports having found the lens that provides the best visual quality, that is, the best visual acuity (VA). This is how the patient's refraction and visual acuity are obtained.
– In the case of Urubamba, the patient subjectively assesses visual acuity using the LogMAR [10] eye chart in a numerical format, in which the patient must indicate the numbers he or she sees.
– In the case of Pampallacta, due to problems of communication and interpretation of the responses (the boys and girls are Quechua speakers and a translator was not available during the whole activity), the patient's VA was assessed using the Snellen optotype chart [11]. In this test, the patient must indicate the orientation of the letter “E” (up, down, right, or left).
• Step 6: If the patient needs visual correction with glasses, these are searched for in the database holding the inventory of all donated glasses; once a suitable pair is found, it is delivered to the patient together with an explanation of the correction made and of when he or she should use the glasses.
• Step 7: Once the glasses are delivered, an optometric approximation analysis is performed to assess the optometric similarity between the glasses delivered and the examination performed, since glasses with 100% of the appropriate prescription with respect to the refraction found are not always available.
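The two acuity notations used in step 5 are interconvertible. A small helper, included as an illustration (the project's records use the charts directly, not this code):

```python
import math

# LogMAR is the base-10 logarithm of the minimum angle of resolution (MAR);
# for a Snellen fraction d/D, MAR = D / d, so 20/20 maps to 0.0 and 20/40
# to about 0.3. Illustrative helper, not the project's actual tooling.
def snellen_to_logmar(test_distance, denominator):
    """Convert a Snellen fraction (e.g., 20/40) to the LogMAR scale."""
    return math.log10(denominator / test_distance)

print(round(snellen_to_logmar(20, 20), 2))  # 20/20 (normal vision) -> 0.0
print(round(snellen_to_logmar(20, 40), 2))  # 20/40 -> 0.3
```
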
The images shown in Fig. 1 reflect the system used for the optometric examinations in the High Andean community of Pampallacta.

Fig. 1. Pampallacta. Girl indicating the direction of the Snellen test. Retinoscopy with trial frame and skiascopy rule.

4 Results

The results presented correspond to the actions carried out by the Ñawi Project in the third week of July 2017, in two different locations: the town of Urubamba and the Pampallacta Educational Institution. In Urubamba, the project was conducted in collaboration with the La Salle Urubamba Public Higher Institute.
In the campaign carried out between January and May 2017 in the city of Barcelona and its surroundings, selfless donations were obtained from different institutions and individuals. A total of 1,500 prescription glasses and sunglasses were collected; after a selection in which very specific prescriptions and unusable glasses were discarded, a total of 713 prescription glasses and 300 sunglasses were transferred to the destination. During the activity in the Sacred Valley of the Incas, a total of 328 glasses were delivered: 139 prescription glasses and 189 sunglasses. 269 eye exams were performed over 5 days: 99 in Pampallacta and 170 in the town of Urubamba.

4.1 Qualitative Results


The qualitative results are extracted from the two final steps of the methodology, steps
6 and 7. Step 6 performs the correction of the visual defect through the delivery of
glasses and the explanation of their correct use, or simply confirms that the optometric
revision found no defect to correct.
In the last phase, step 7, after the revisions, an optometric approximation analysis is
done, which assesses how well the delivered glasses match the graduation measured.
The mark ranges from 1 to 10: a value of 1 indicates a very low match between the
patient's needs and the glasses delivered, while 10 indicates glasses that achieve a
100% correction. The marks obtained are very high: 8.4 in the case of Pampallacta and
8.7 in the case of Urubamba.
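The paper states that this mark is assigned by an expert; purely as an illustration of what an automatic metric could look like (a possibility the conclusions raise as future work), the sketch below maps the dioptre gap between the prescribed and delivered spherical correction onto the 1 to 10 scale. The function name, the linear mapping and the 3-dioptre cap are our assumptions, not the Project's method.

```python
# Illustrative automatic "approximation mark" (NOT the Project's method,
# which relies on expert judgement): 10 = perfect match between prescribed
# and delivered spherical correction, decreasing linearly to 1 when the
# dioptre gap reaches `max_gap`.
def approximation_mark(prescribed: float, delivered: float,
                       max_gap: float = 3.0) -> float:
    gap = min(abs(prescribed - delivered), max_gap)
    return round(10 - 9 * gap / max_gap, 2)

print(approximation_mark(-2.00, -2.00))  # exact match
print(approximation_mark(-2.00, -1.25))  # delivered glasses 0.75 D short
```

A real metric would also have to weigh cylinder, axis and addition, which is precisely why the Project currently leaves the judgement to an expert.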
Ñawi Project: Visual Health for Improvement of Education 575

These results can be considered highly satisfactory given the premises of the Ñawi
Project: all the glasses come from altruistic donations and are already graduated. Due
to the characteristics of the Project, it is not possible to grind lenses to a new
graduation at the destination.

4.2 Quantitative Results


Quantitative data are based on the results collected in steps 1, 2, 3, 4 and 5 of the
method described in the previous section. The most relevant results are described
below.
Data collected in the first step.
Age of the analyzed population (see Tables 1 and 2):
• In Pampallacta, the patients reviewed have an average age of 27 years, with a
minimum age of 2 and a maximum age of 67 years. In the high Andean educational
institution, both elementary and secondary school students were reviewed, as well
as teachers and family members who requested it.
• In Urubamba, on the other hand, the average age was 22 years, with a minimum age
of 3 and a maximum of 72. This difference is due to the fact that the students
reviewed were mostly higher-education students, together with their families and
teachers.

Type of Patients. The patients are mainly students, but revisions are also made to the
faculty of educational communities and family members who request an eye examina-
tion. Both in Pampallacta and Urubamba the percentages are 63% women and 37% men.

Table 1. Quantitative results in Pampallacta.

Group             Sample  Avg. age  Min  Max  Revision OK
Students              60         9    2   16          85%
Faculty & family      39        41   20   67          62%

Table 2. Quantitative results in Urubamba.

Group             Sample  Avg. age  Min  Max  Revision OK
Students              97        16    3   23          31%
Faculty & family      73        42   24   72          18%

Data collected in steps 2, 3, 4 and 5.
Parameters collected from the optometric examination (see Tables 1, 2 and 3).
The first interesting global figure to analyze is the number of reviews in which
patients were found to have good eye health and a low index of refractive problems (see
Tables 1 and 2). The results are drastically different in Pampallacta and Urubamba.

While in Pampallacta there are good results in terms of visual health and refractive
problems (85% in students and 62% in adults), in Urubamba the results are significantly
different (31% in students and 18% in adults). The first conclusion is that individuals
in the high Andean areas have better visual health than those living in the town of
Urubamba. But this statement can, and should, be contextualized with the fact that in
the educational institution of Pampallacta practically the entire community was
reviewed, while in Urubamba attendance at the revisions was voluntary. It is possible
that the students, teachers and families attending in Urubamba were already aware that
they had a problem, while those who perceived themselves as having good vision did not
request a review. In any case, visual health in the high Andean zone seems clearly
better than in the population of the valley.
Visual Acuity (VA) is a parameter that measures the quality of the patient's vision. In
the project it is measured on a decimal scale, where 1 corresponds to 100% vision and
0.1 to a vision of only 10%. In Pampallacta, due to the difficulty of communicating in
the Quechua language, an E test is performed, in which an uppercase letter E is shown
in different sizes and orientations and the patient is asked to indicate the
orientation of each letter as he or she perceives it.
In Pampallacta a VA between 0.6 and 1 is obtained, whereas in Urubamba it ranges
between 0.2 and 1. These data show that the individuals reviewed in the Pampallacta
educational community have better overall vision than those reviewed in the town of
Urubamba.

Table 3. Optometric results in Urubamba and Pampallacta.

                          Pampallacta    Urubamba
Alteration in 3D vision           23%         35%
Chromatic alteration               6%          2%
PPC correct             Not performed         86%
Cover test correct      Not performed         95%
Glasses delivered                  24         124
Total revisions                    99         170

From these data, it is worth mentioning that a significant increase in refractive
defects, especially myopia, is being observed in more developed populations, where near
vision is overused from increasingly early ages [12]. Closed classrooms and the use of
mobile devices and computers create an adaptation to the environment that can lead to
an increase in visual problems [13]. Relating this fact to the results obtained in
Pampallacta and Urubamba, it can be deduced that Pampallacta students are less exposed
to technological means than those of Urubamba [14], which would explain their lower
incidence of refractive problems. Other sources also suggest that children who spend
more hours outdoors have a lower chance of suffering refractive problems.

4.3 Data Mining for the Improvement of the Following Actions


In addition to the use of ICT for the extraction of qualitative and quantitative results
of the Ñawi Project, a further step has been taken by using representative machine
learning techniques [15] to extract information from the 2017 campaign.
Specifically, profile detection is being performed through data mining with
clustering algorithms (K-Means). These techniques yield more precise information about
the patient typologies detected in the reviews. In addition to the knowledge extracted,
the detected groups will allow a better selection of the prescription glasses for the
following campaigns (Ñawi 2020), so as to have the maximum number of glasses of the
appropriate graduation and thus provide a better service.

5 Conclusions

The objective of this study was to obtain a first assessment of the Ñawi Project as a
visual health initiative integrated in the actions of the Urubamba Project.
In order to improve the efficiency of the process in the following campaigns, it was
necessary to analyze the data obtained in the 2017 campaign so as to provide better
coverage of the optometric needs of the population of the Sacred Valley. ICTs provide
the tools to perform these quantitative analyses.
The methodology applied requires a small adaptation to the community in which it is
carried out, adjusting both the time devoted to each patient and the optometric review
process performed.
The optometric approximation mark is essential to assess the degree of success of the
patient's visual correction, but it is based on expert interpretation. A future line of
work could be the creation of an automatic metric to calculate this indicator.
As a future line for upcoming campaigns, it will be key to incorporate the follow-up
of the glasses delivered (prescription glasses and sunglasses), in order to assess the
impact and the changes in social habits in the High Andean Educational Institutions of
Cusco.

Acknowledgment. To the support of the Secretary of Universities and Research of the
Government of Catalunya's Business and Knowledge Department for the help regarding
2017 SGR 934.

References
1. Canaleta, X., Badia, D., Vadillo, J.L., Maguiña, M.: Proyecto Urubamba. Lasallistas sin
fronteras en un Proyecto socioeducativo inclusivo, 1º Edición, Junio 2019. Ed. Vanguard
Gràfic (2019). ISBN 978-84-697-4132-0
2. Abel, R.: The Eye Care Revolution: Prevent and Reverse Common Vision Problems.
Kensington Books, New York (2004)
3. Fermandois, T.: Agudeza Visual, 1 (2011)

4. Organización Mundial de la Salud: Prevención de la ceguera y la discapacidad visual
evitables, 62ª Asam. Mund. la Salud (2009)
5. Anshel, J.R.: Visual ergonomics in the workplace. AAOHN J.: Off. J. Am. Assoc.
Occup. Health Nurs. 55, 414–420 (2007)
6. Fawcett, S.L., Birch, E.E.: Validity of the Titmus and Randot circles tasks in children with
known binocular vision disorders. J. AAPOS 7, 333–338 (2003)
7. Birch, J.: Efficiency of the Ishihara test for identifying red-green colour deficiency.
Ophthalmic Physiol. Opt. 17, 403–408 (1997)
8. Elliott, D.: Clinical Procedures in Primary Eye Care (2007)
9. Lara, F., Cacho, P., García, Á., Megías, R.: General binocular disorders: prevalence in a
clinic population. Ophthalmic Physiol. Opt. 27, 70–74 (2001)
10. Salt, A.T., Wade, A.M., Proffitt, R., Heavens, S., Sonksen, P.M.: The Sonksen logMAR test
of visual acuity: I. Testability and reliability. J. AAPOS 11, 589–596 (2007)
11. Sue, S.: Test distance vision using a Snellen chart. Community Eye Health J. 20, 52 (2007)
12. Dolgin, E.: The myopia boom. Nature 519, 276 (2015)
13. Holden, B.A., et al.: Global prevalence of myopia and high myopia and temporal trends from
2000 through 2050. Ophthalmology 123, 1036–1042 (2016)
14. Williams, K.M., et al.: Increasing prevalence of myopia in Europe and the impact of
education. Ophthalmology 122, 1489–1497 (2015)
15. Schapire, R.: COS 511: Theoretical Machine Learning (2008)
Building Smart Rural Regions: Challenges
and Opportunities

Carlos R. Cunha1, João Pedro Gomes2, Joana Fernandes3,
and Elisabete Paulo Morais1

1 UNIAG, Instituto Politécnico de Bragança,
Campus de Santa Apolónia, 5300-253 Bragança, Portugal
{crc,beta}@ipb.pt
2 Instituto Politécnico de Bragança, Campus de Santa Apolónia,
5300-253 Bragança, Portugal
[email protected]
3 CiTUR, Instituto Politécnico de Bragança, Campus de Santa Apolónia,
5300-253 Bragança, Portugal
[email protected]

Abstract. Rural regions are a type of region found all around the world. Their
identity and character differ from those of more urbanized regions. Associated with
rural areas is a strongly negative perception: depopulation, an underdeveloped
business fabric, less wealth, less ability to attract investment, and a low
concentration of public and private services from various sectors of activity. This
reality cannot be socially accepted and must be fought for the sake of greater equity
within countries. To leverage this change, rural regions will have to become
competitive and attractive regions. For this transformation to take place,
Information and Communication Technologies (ICT) play a major role. This article
characterizes rural regions in their demographic and economic dimensions,
emphasizing the case of the Northeast region of Portugal. It analyses and reviews a
set of fundamental vectors where ICT can be a key driver and enabler for the creation
of smart rural regions. Finally, it presents a conceptual model of what a smart rural
region can be.

Keywords: Smart rural region · ICT · Cooperation · Model

1 Introduction

In [1] it is argued that the definition of rural areas is a much-discussed issue and
that it is difficult to reach a commonly accepted definition, since different countries
use different indicators to define rural areas. The official document of the European
Union (EU), the "Proposal for a Council Regulation on support to Rural Development by
the European Agricultural Fund for Rural Development", identifies areas as rural if the
population density is below 150 inhabitants per square kilometer [1].
The report [2] identifies the indicators used in selected countries to define rural
areas:
• Australia: population clusters of fewer than 1,000 people, excluding certain areas
such as holiday resorts;
• Austria: towns of fewer than 5,000 people;
• Canada: places of fewer than 1,000 people, with a population density of fewer than
400 per square kilometer;
• Denmark and Norway: agglomerations of fewer than 200 inhabitants;
• England and Wales: no definition, but the Rural Development Commission excludes
towns with more than 10,000 inhabitants;
• France: towns containing an agglomeration of fewer than 2,000 people living in
contiguous houses, or with not more than 200 meters between the houses;
• Portugal and Switzerland: towns of fewer than 10,000 people;
• Malaysia: areas of fewer than 10,000 people;
• India: locations with a population of less than 10,000 people.

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 579–589, 2020.
https://doi.org/10.1007/978-3-030-45697-9_56
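The headcount criteria listed above can be collected into a small lookup table. The sketch below is a simplification that keeps only the population threshold (dropping Canada's density clause and Australia's resort exclusion), and the helper name is our own.

```python
# Per-country settlement-size thresholds for "rural", transcribed from the
# report cited as [2]; simplified to the headcount criterion only (Canada's
# density clause and Australia's resort exclusion are omitted).
RURAL_THRESHOLDS = {
    "Australia": 1_000,
    "Austria": 5_000,
    "Canada": 1_000,       # also requires density < 400 inhabitants/km2
    "Denmark": 200,
    "Norway": 200,
    "France": 2_000,
    "Portugal": 10_000,
    "Switzerland": 10_000,
    "Malaysia": 10_000,
    "India": 10_000,
}

def is_rural(country: str, settlement_population: int) -> bool:
    """True when the settlement falls below the country's threshold."""
    return settlement_population < RURAL_THRESHOLDS[country]

print(is_rural("Portugal", 5_500))  # small town: rural in Portugal
print(is_rural("France", 5_500))    # above the French 2,000 cut-off
```

The spread of thresholds (200 to 10,000 inhabitants) is itself the point made in the text: the same settlement can be rural in one country and not in another.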
Over the past decades we have been witnessing a massive migration of population to
urban zones, accelerating the desertification of rural areas, which are left deprived
of young people and workplaces.
In the EU, 28% of the population lives in rural areas [3]; the majority of European
countries are called "countries of the elderly", and their rural environments are being
affected by depopulation and social exclusion [4]. This is surely one of the biggest
social problems left in the EU [5]. In addition to those factors, rural regions also
have a low-density, dispersed population; scarcity of public transport; low
accessibility of services; small businesses; and high levels of illiteracy, especially
digital illiteracy.
Rural regions differ from urban regions in several ways. They have a lower population
density, an older population, a greater economic weight of agricultural activities, a
more fragile business fabric, more difficult access to public services and less
attractiveness for investment, among others. By contrast, rural regions have much to
offer, including a sense of community, affordable housing prices, access to open and
green spaces and top-quality products. They often have great tourism potential
(although not properly explored), superior food production, and an important material
and, especially, immaterial heritage. This reality reveals a set of challenges and
opportunities that it is important to discuss in order to enable the development of
rural regions.
Rural regions can make use of a reorientation process to adapt to new emerging
paradigms. Gerontechnology, business cooperation, Information and Communication
Technologies (ICT) applied to agriculture and to tourism, and digital marketing may be
answers to the rural regions' problems.
ICT plays a facilitating role in the constitution of cooperation networks, making it
easier to develop alliances, to create virtual organizations with other business
partners and to develop inter-organizational information systems that support strategic
business relationships with clients, suppliers, subcontractors and others [6, 7]. In
the next sections, this paper first makes a demographic and economic characterization
of rural regions, presenting the case of the Northeast of Portugal. It then presents
some major challenges and opportunities for rural regions that can be addressed through
ICT-based solutions, explaining the role of ICT for each vector. Next, a conceptual
model of what can and should be the vision of a smart rural region is presented.
Finally, some conclusions and final remarks are made.

2 Rural Regions Characterization: The Northeast of Portugal

There isn’t a single definition of what a rural region is and, naturally, not all rural
regions are the same, each one with its own peculiarities. As a typical example of a
rural area, in this section we will make a brief characterization of Terras de Trás-os-
Montes (TTM). TTM is a region in the northeast of Portugal, bordering Spain,
aggregating nine municipalities though an area of about 5,538 km2.

2.1 Demographic Characterization of the TTM Region


According to the 2011 Census, the TTM region had 117,527 inhabitants [8], with an
average of 21.2 inhabitants per square kilometer, and only two towns with more than
10,000 inhabitants: Bragança and Mirandela, with 23,186 and 11,579 inhabitants,
respectively. In fact, more than half (53%) of the residents lived in places with fewer
than 2,000 inhabitants [9]. Population decline has been a trend in recent years, and
the estimate for 2018 was 107,860 inhabitants, a decrease of 8.2% in 7 years.
In addition, population aging is also noticeable. The aging rate was 243.9 elderly
people per 100 young people according to the 2011 Census, and the estimate for 2017
was 291.2, while the national average was 153.2 [8].
Due to the region's characteristics and the dispersion of the population throughout
the territory, it is difficult for residents, particularly the elderly, to reach public
and other services, since these are far away and require some sort of transportation.
Although aware of this situation, local and central government cannot provide support
to all the isolated elderly population, and there are serious cases in which elderly
people depend on neighbors who are about the same age and face the same problems [10].
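As a worked check of the two headline figures above, the sketch below recomputes the 7-year population decrease and shows how an aging rate of "elderly per 100 young people" is defined; the helper names and the illustrative headcounts are ours.

```python
# Worked check of the demographic figures quoted in the text.
def percent_decrease(before: int, after: int) -> float:
    """Relative decrease, in percent."""
    return 100 * (before - after) / before

def aging_index(elderly: int, young: int) -> float:
    """Elderly people per 100 young people."""
    return 100 * elderly / young

# 117,527 inhabitants (2011 Census) down to an estimated 107,860 in 2018
print(round(percent_decrease(117_527, 107_860), 1))  # → 8.2

# e.g. 2,439 elderly for every 1,000 young people gives the 2011 rate
print(aging_index(2_439, 1_000))  # → 243.9
```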

2.2 Economic Characterization of the TTM Region


The purchasing power per capita in the region in 2017 was 79.55% of the national
average, one of the lowest in the country, and the region's share of national
purchasing power was only 0.84%, the third smallest of all national regions [11].
In 2017 there were 19,013 companies in the region (1.5% of the national total). As
expected in a rural area, most of the companies were from the primary sector, mainly
related to agriculture, livestock, hunting, forestry and fishing, representing more
than half (52.7%) of the companies in the region, a very different reality from the
national figure (10.7%). With a much smaller representativeness (12.5%, corresponding
to 2,373 companies), wholesale and retail trade and repair of motor vehicles and
motorcycles is the second biggest activity in the region, followed by accommodation,
catering and similar (5.7%). Each of the remaining activity areas represents less than
5% of the total number of companies, with only 602 (3.2%) companies in the industry
sector, namely 586 (3.1%) in transformative industries and 16 (0.1%) in extractive
industries [12].
Regarding Gross Value Added (GVA), the tertiary sector (services) is the most relevant
[13]. Primary sector activities in the region accounted for 7.2% of GVA. This is more
than three times the national GVA percentage (2.0%) for that sector, but we must also
note that the sector represents five times more companies in the region than
nationally, which leads us to believe that the sector could provide more value, and
that ICT could contribute to it.
The companies are mostly small in terms of number of employees: about 98.6% had fewer
than 10 employees and only one had more than 249 workers [8]. In 2011, about 98% had
an average of two employees [13].
Regarding the survival rate of companies in activity areas that can be
internationalized, only 63.21% survive after two years of activity. In 2016, only
0.66% of all newly created companies were in sectors of high and medium-high
technology [12]. In fact, a national study about micro companies (up to 10 workers)
showed that local entrepreneurs considered the district the worst in the country to
open a new business, even though they consider the district's economic situation quite
favourable [14].
The TTM region is characterized by extensive agricultural and forestry resources, with
almost 38% of its area used for agricultural purposes. It is therefore no surprise that
the agro-industrial sector predominates in the region, covering horticultural, fruit
and mycological production. Livestock production is equally important in the region's
economy [13].
To strengthen the local economy and improve internationalization capability it is
urgent to invest in Research and Development (R&D); however, in 2009 less than 1.4
million euros were invested in the TTM region (0.5% of national R&D expenses), and by
2016 the investment in R&D had reached 2.348 million euros [13, 15].
The same opinion is shared by regional experts, who maintain that the region's
development strategy cannot be supported only by the traditional sectors of the
economy, since the region has a high concentration of production in activities with
low added value. The region must invest in the industrial sector to promote economic
growth, especially in activities based on innovation, technology and export capacity.
The low degree of use of information technologies by some segments of the population
is one of the factors damaging the competitiveness of the region. It is therefore
urgent to increase investment in industrial innovation processes and to have
entrepreneurs work together collaboratively. However, regional entrepreneurs reveal an
almost total ignorance of the networking programs promoted by local organizations
[14, 16, 17].

3 ICT Challenges and Opportunities for Rural Regions

The role of ICT is crucial for the creation of smart rural regions and for improving
both their competitiveness and their levels of citizenship and social justice. It is
therefore vital to understand how ICT can help tackle the main problems of rural
regions and enhance development and wealth creation, so that rural regions become
associated only with their specific identity, and not with the image of a depopulated
area that is economically unsustainable and underdeveloped. The key drivers of ICT
intervention in responding to the challenges and opportunities of rural regions on
their way to becoming smart rural regions are presented next.

The analysis of rural regions and the definition of strategies to improve their main
vectors should, in our opinion, be perceived along two main axes: the axis of
citizenship and social justice, and the axis of economic growth and development.
Fig. 1 presents the main drivers for the creation of smart regions according to these
two axes.

Fig. 1. Main drivers for the creation of rural regions

Rural regions face several challenges, such as an aging and isolated population in need
of adequate health care and of better support from their caregivers (family or others).
The same segment of the population holds a body of ancient knowledge of an immaterial
character that must be perpetuated across the generations to come. From a
competitiveness and wealth-generation perspective, there is a marked dependence on
agriculture; the industrial fabric is uncompetitive and, acting in isolation, has a low
capacity for intervention. Equally, these regions lack a greater capacity for
promotion, even though they hold an important set of traditions, high-quality products
and unique natural beauty. From this analysis emerge five processes that, in our
opinion, should be worked on to create smart rural regions, and where ICT will be a key
enabler and lever: cooperative networks, precision agriculture, digital marketing,
gerontechnology and elders' immaterial heritage. These five processes are succinctly
discussed next.

3.1 Rural Cooperative Networks e-Platforms


The structure of the business system and the development profile of what is generally
called business have changed dramatically in recent decades. From a clearly vertical
structure we have moved to a clearly transversal model. This new form of structuring
reflects, to some extent, a growing need for interoperability between organizations
whose long-term business perspective is no longer one of isolation.
Cooperation should be considered strategic because it enables cooperating
organizations to develop joint strategies and, consequently, to obtain and maintain
competitive advantages. The purpose of cooperation is to obtain a synergy effect, so
that the relationship represents more than the sum of its parts.
In the context of rural regions it is essential to create cooperation networks that
favor competitiveness and the complementarity of companies [18], allowing economic
agents to offer a wider range of products and services to their customers [19]. This
reality shows, in our opinion, that any conceptual model to leverage sales and/or
promotion must be based on a cooperative approach rather than on isolated individual
organizations. Considering the context of business in rural regions, and especially in
disadvantaged regions where small economic operators are dominant, it is essential to
create conditions and promote ways to increase their competitiveness. As such, any
proposed model to leverage sales and promote regions must support and create synergies
among the small agents and promote cooperative arrangements, consequently increasing
the installed response capacity.

3.2 From Agriculture to Precision Agriculture


Rural regions have, compared to urban regions, an economy more dependent on
agriculture. If in the past agriculture was almost absent from the focus of
computational technological evolution, in the last decade computing applied to
agriculture has developed rapidly. From this evolution was born the concept of
Precision Agriculture (PA). PA is information-intensive [20] and technology-based [21].
Getting technology to the fields is a challenge for rural areas. Several technologies
allow the monitoring of a series of environmental and productive parameters, and a
number of applications work with those parameters, gathered through sensor networks,
to build decision-support tools. However, one of the main hurdles is the implementation
cost, which becomes particularly problematic in regions where each farmer has small
plots of land. In this reality, the dissemination of technological tools must rely on
sectorial cooperation mechanisms, which will reduce implementation and maintenance
costs. Since rural regions often have excellent food and gastronomic products, the bet
should be on increasing production while maintaining quality. In this context, PA will
be a fundamental ally for the development of agriculture in rural regions.

3.3 Using Digital Marketing for Regional Promotion


The evolution of information technology has in recent years revolutionized the
strategies for communicating with, understanding and satisfying customers. It is
therefore up to companies to take full advantage of a whole new reality and a whole new
set of technological tools at the service of competitiveness.
Regions also have a pressing need for promotion, as regions all around the world
compete for visitors. In this domain, where marketing plays a major role, the use of
ICT is undoubtedly a catalyst for competitiveness; according to [22], there is clear
evidence that the ICT sector is a positive driver of enterprise competitiveness and, in
this case, of the competitiveness of regions among their peers.
Consumer and visitor perceptions depend on the ability of regions to promote what is
best and unique about them. Providing potential visitors with information about
heritage, cuisine, natural beauty and ancestral traditions is vital for rural regions
to assert themselves positively in the tourism sector. As the attractiveness of rural
areas often depends on a combination of various aspects, and not just on one or two
ex-libris, it will be crucial to create platforms that promote rural areas more
centrally and under a "common hat".

3.4 Using ICT to Promote Rural Regions' Elders-Based Immaterial Heritage

Rural regions concentrate a very rich set of ancestral traditions. The perpetuation of
such traditions has been achieved through transmission between generations.
Unfortunately, all this knowledge is typically elders-centered and lacks effective
processes of digitization, storage and delivery through which all this heritage can
effectively be passed on to future generations that are digital-born [23]. Rural
regions hold an enormous collection of ancestral knowledge that we are responsible for
delivering to future generations as an inheritance to which they are entitled.
In order to respond to the challenge of perpetuating ancient knowledge for future
generations who are digital natives, according to [23] we must create a repository of
human-centered immaterial heritage knowledge and implement effective support both for
promotion and for tourists' needs during their experiences.
There is currently a search for authentic and contextualized experiences as a way of
escaping copies and banality [24, 25]; tourists look for connections and experiences
that are rooted in the destination [26], which is among the many characteristics that
generate authenticity for a destination.
In order to facilitate the creation of promotion and business-generation services
based on material and intangible heritage, it is necessary to create platforms that
allow the digitization of this knowledge and its easy availability. Thus, the creation
of a digital platform acting as a regional knowledge repository, with mechanisms to
make it available as integrated services, will be an opportunity to streamline the
creation of applications on different platforms and for different purposes (e.g.
tourism, gamification and territorial marketing).

3.5 Using Gerontechnology to Support Elders


Gerontechnology is a multidisciplinary model that uses technology to innovate in the
geriatric field [27, 28]. The International Society of Gerontechnology (ISG) considers
that "gerontechnology creates solutions to extend the working phase in society by
maximizing the vital and productive years in the life span, consequently reducing costs
in later life". A recent report [29] shows that although the use of technology by older
adults is on the rise, some older adults remain altogether isolated from digital life.
Rural regions typically experience a greater distance from health and care services
dedicated to the elderly. Paradoxically, it is in rural areas that an important share
of the elderly population is found. Aggravating this, rural regions are typically
sparse in young population, and it is often the case that the younger members of a
family live in urban centers and are therefore distant from their older relatives.
These facts make it more difficult to democratize access to health care services. It is
also essential to create mechanisms that can bridge the gap between the elderly and
their families.
R&D efforts show that assistive information and communication technologies can
successfully contribute to all dimensions of the elderly's quality of life.
Technologies can empower them to control their health problems, compensate for
functional disabilities and increase their safety [30], and also enable closer,
real-time monitoring of several health parameters that can help healthcare providers
make more timely and effective decisions.
According to [31, 32], the "use of gerontechnology seems a synthesis of person,
technology, and environment". This means that gerontechnology must be involved in the
full spectrum of human activities, encompassing health and behavior, activities of
daily living and accommodation, communication and autonomy, mobility and transport,
and work and leisure (think of age-friendly cities and hospitals). Gerontechnology is
an interdisciplinary field combining gerontology and technology for the development of
these systems, that is, systems for the health, housing, mobility, communication,
leisure and work of the elderly. The creation of gerontechnology platforms could
greatly help the aged and isolated population that characterizes rural regions,
bringing care and follow-up services closer to them.

4 A Conceptual Model of a Smart Rural Region

Following an introduction to the key concepts that define a rural region, a
demographic and economic characterization of an example rural region (TTM), and a
discussion of several main vectors for the creation of smart rural regions, we now
present a conceptual model that reflects our vision of a smart rural region. Figure 2
presents the proposed model, which brings together an integrated view of four
fundamental quadrants.
Building Smart Rural Regions: Challenges and Opportunities 587

Fig. 2. Conceptual model of a smart rural region

The proposed model is based on an integrated perspective grounded in cooperation
models. For the reasons previously stated and discussed, rural regions are made up of
multiple small players, and in this scenario the creation of cooperative networks has a
vital economic and management rationale.
The model combines business cooperation networks, cooperation networks for the
agricultural and food sector, support networks for the elderly (which represent a huge
societal challenge) and, at the top, a process of regional digitization. This process is a
necessary first step in laying the foundations for regional development. Surveying all
relevant regional information, cataloging it, and setting up agile mechanisms to
disseminate it and enable services interoperable with external processes are
prerequisites for a smart region. ICT plays a key role here and is a powerful
transformational enabler of smart rural regions that are able to assert themselves in a
highly competitive global context.

5 Conclusion and Final Remarks

The existence of rural regions is a constant in every country; however, their definition
varies from country to country. This paper introduced the concept of a rural region by
characterizing it.
588 C. R. Cunha et al.
The Portuguese case of TTM was used as an example of a rural region
that presents multiple challenges. An analysis of some of the main challenges was
made along the vectors of citizenship and social justice and the vectors of
economic growth and development. This paper provides a discussion of the key
foundations that characterize the challenges and opportunities of rural regions and how
ICT can drive the transformation of rural regions into smart rural regions. The
conceptual model presented is intended as an integrative vision, grounded in a
cooperation model, for rural regions. However, we are fully aware that implementing
the proposed model is a huge challenge.

Acknowledgments. UNIAG, R&D unit funded by the FCT – Portuguese Foundation for the
Development of Science and Technology, Ministry of Science, Technology and Higher
Education. UIDB/04752/2020.

References
1. Simkova, E.: Strategic approaches to rural tourism and sustainable development of rural
areas. Agric. Econ.–Czech 53(6), 263–270 (2007)
2. Organization for Economic Co-operation and Development (OECD): Tourism Strategies and
rural development. OCDE/GD (94)49 Publications, Paris (1994)
3. Størup, J.: Mobility is about bringing people together. Technical report (2018)
4. Plazinic, B., Jovic, J.: Mobility and transport potential of elderly in differently accessible
rural areas. J. Transp. Geogr. 68, 169–180 (2018)
5. Budejovice, C.: Macroeconomic Effects on Development of Sparsely Populated Areas.
Interreg-Central.Eu (2017)
6. Mendonça, V., Varajão, J., Oliveira, P.: Cooperation networks in the tourism sector:
multiplication of business opportunities. Procedia Comput. Sci. 64, 1172–1181 (2015)
7. O’Brien, J.A., Marakas, G.: Management Information Systems. McGraw-Hill/Irwin,
New York (2010)
8. INE - Instituto Nacional de Estatística. https://www.ine.pt/. Accessed 21 Nov 2019
9. PORDATA - População residente em lugares com 10 mil e mais habitantes, segundo os Censos
(2015). https://www.pordata.pt/Municipios/Popula%C3%A7%C3%A3o+residente+em+luga
res+com+10+mil+e+mais+habitantes++segundo+os+Censos-26. Accessed 21 Nov 2019
10. Cunha, C.R., Mendonça, V., Morais, E.P., Fernandes, J.: Using pervasive and mobile
computation in the provision of gerontological care in rural areas. Procedia Comput. Sci.
138, 72–79 (2018). https://doi.org/10.1016/j.procs.2018.10.011
11. INE - Instituto Nacional de Estatística - Estudo sobre o Poder de Compra Concelhio. INE,
Lisboa (2019). ISBN 978-989-25-0501-5 (2017)
12. INE - Instituto Nacional de Estatística. Empresas (N.º) por Localização geográfica (NUTS -
2013) e Atividade económica (Subclasse - CAE Rev. 3); Anual - INE, Sistema de contas
integradas das empresas (2018). https://www.ine.pt/xportal/xmain?xpid=INE&xpgid=ine_
indicadores&indOcorrCod=0008466&contexto=bd&selTab=tab2. Accessed 21 Nov 2019
13. CIM-TTM Comunidade Intermunicipal Terras de Trás-os-Montes. Plano Estratégico de
desenvolvimento intermunicipal das Terras de Trás-os-Montes para o período 2014–2020
(2014). http://cim-ttm.pt/pages/482. Accessed 20 Nov 2019
14. Pereira, A.: Competitividade Regional para micro e pequenas empresas (on-line edition).
Diário de Trás-os-Montes, Montes de Notícias (2016). https://www.diariodetrasosmontes.
com/noticia/competitividade-regional-para-micro-e-pequenas-empresas

15. Jornal Económico. Portugal com a mais alta taxa de despesa em investigação e
desenvolvimento no ensino superior (on-line edition) (2017). https://jornaleconomico.sapo.
pt/noticias/portugal-com-a-mais-alta-taxa-de-despesa-em-investigacao-e-desenvolvimento-no-
ensino-superior-239948
16. CIMAT – Comunidade Intermunicipal Alto Tâmega. Fórum para o Desenvolvimento de
Trás-os-Montes e Alto Douro (2015). https://cimat.pt/forum-para-o-desenvolvimento-de-
tras-os-montes-e-alto-douro
17. CIM-TTM Comunidade Intermunicipal Terras de Trás-os-Montes. CIM das Terras de Trás-
os-Montes com projeto para implementação de marca territorial (2018). http://cim-ttm.pt/
pages/528?news_id=84
18. Gao, J.Z., Prakash, L., Jagatesan, R.: Understanding 2D-barcode technology and applica-
tions in m-commerce - design and implementation of a 2D barcode processing solution. In:
Proceedings of 31st Annual International on Computer Software and Applications
Conference, Beijing, China, pp. 49–56 (2007)
19. Hall, C.M., Mitchell, R.: Wine tourism in the mediterranean: a tool for restructuring and
development. Thunderbird Int. Bus. Rev. 42(4), 445–465 (2001)
20. Stafford, J.V.: Implementing precision agriculture in the 21st century. J. Agric. Eng. Res. 76,
267–275 (2000)
21. Cox, S.: Information technology: the global key to precision agriculture and sustainability.
Comput. Electron. Agric. 36(2–3), 93–111 (2002)
22. Čorejová, T., Madudová, E.: Trends of scale-up effects of ICT sector. Transp. Res. Procedia
40, 1002–1009 (2019). ISSN 2352-1465
23. Cunha, C.R., Carvalho, A., Afonso, L., Silva, D., Fernandes, P.O., Pires, L.C.M., Costa, C.,
Correia, R., Ramalhosa, E., Correia, A.I., Parafita, A.: Boosting cultural heritage in rural
communities through an ICT platform: the Viv@vó project. IBIMA Bus. Rev. 2019, 1–12
(2019). ISSN 1947-3788
24. Kolar, T., Zabkar, V.: A consumer-based model of authenticity: an oxymoron or the
foundation of cultural heritage marketing? Tour. Manag. 31(5), 652–664 (2010)
25. Nao, T.: Visitors’ evaluation of a historical district: the roles of authenticity and
manipulation. Tour. Hosp. Res. 5(1), 45–63 (2004)
26. Yeoman, I., Brass, D., McMahon-Beattie, U.: Current issue in tourism: the authentic tourist.
Tour. Manag. 28, 1128–1138 (2007)
27. Boyle, D.: Authenticity: brands, fakes, spin and the lust for real life. Harper Perennial,
London (2004)
28. Sheets, D.J., La Buda, D., Liebig, P.S.: Gerontechnology. The aging of rehabilitation. Rehab
Manag. 10, 100–102 (1997)
29. Graafmans, J., Taipale, V.: Gerontechnology. A sustainable investment in the future. Stud.
Health Technol. Inform. 48, 3–6 (1998)
30. Smith, A.: Older Adults and Technology Use. Pew Research Center (2014). http://www.
pewinternet.org/2014/04/03/older-adults-and-technology-use
31. Siegel, C., Dorner, T.E.: Information technologies for active and assisted living—influences
to the quality of life of an ageing society. Int. J. Med. Inform. 100, 32–45 (2017). ISSN
1386-5056
32. Lam, J.C.Y., Lee, M.K.O.: Digital inclusiveness - longitudinal study of internet adoption by
older adults. J. Manag. Inf. Syst. 22, 177–206 (2006)
The Power of Digitalization: The Netflix Story

Manuel Au-Yong-Oliveira(1,2), Miguel Marinheiro(1), and João A. Costa Tavares(1)

1 Department of Economics, Management, Industrial Engineering and Tourism,
University of Aveiro, 3810-193 Aveiro, Portugal
{mao,miguelacmarinheiro,joaoactavares}@ua.pt
2 GOVCOPP, Aveiro, Portugal

Abstract. The evolution of technology, and mainly the evolution of the
Internet, has improved the way business is done. Nowadays, most services are
offered through a website or through an app, as it is much more convenient and
suitable for the customer. This business transformation made it possible to get a
faster and cheaper service, and companies had to adapt to the change, in order to
fulfill customers’ requirements. In this context, this paper relates to this digital
transformation, focusing on a case study about Netflix, a former DVD rental
company and currently an online streaming leader. We aimed to understand
Netflix’s behavior alongside this digital wave. Thus, we performed a survey,
which had 74 answers, mainly from Portugal, but also from Spain, Belgium,
Italy, Turkey, Georgia and Malaysia. Of the people who answered the survey,
90.1% were stream consumers, but only 59.1% had premium TV channels.
From those 90.1%, 58.3% also said that they watched streams between two and
four times per week, but the majority of premium TV channel subscribers
(63.8%) replied that they watch TV less than twice in a week. We see a trend in
which the traditional TV industry is in decline and streaming as a service has
increased in popularity. Consumer habits are changing, and people are getting
used to the digitalization era. Netflix is also confirmed in our survey as the
market leader of the entertainment distribution business, as stated in the
literature, and the biggest strength of this platform is its content.

Keywords: Netflix · Internet · Digital transformation · DVD · Online streaming

1 Introduction

The Internet's origin goes back to the 1960s, with ARPANET in the U.S.A. [1]. Since
then, the Internet has evolved and has become more accessible to everyone across the
world. What started as a military tool has evolved into what we have at home today.
The world has become more connected and globalized. More than that, the Internet has
become so indispensable to ordinary life that work, school and even entertainment
require it.

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 590–599, 2020.
https://doi.org/10.1007/978-3-030-45697-9_57

Therefore, companies had to adapt and evolve with the Internet. Online business
started a new era of change and evolution. Besides that, customers' needs also changed,
as accessibility and convenience led to a more demanding client, specifically in the
television market. Technology caused a digital transformation in the entertainment
distribution market, with online streaming revolutionizing the business format.
Streaming can be described as the ability to play a multimedia file before it has been
completely downloaded.
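This progressive-playback idea can be sketched in a few lines of Python (a hypothetical illustration only, not any platform's actual player): the consumer handles fixed-size chunks as soon as they arrive, instead of waiting for the whole file.

```python
from io import BytesIO

def stream(source, chunk_size=6):
    """Yield chunks of a media file as they become available,
    so playback can start before the download is complete."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # EOF: read() returns b"" when the source is exhausted
            break
        yield chunk

# A fake "media file": each chunk can be played as soon as it arrives,
# without the player ever holding the complete file in memory.
media = BytesIO(b"frame1frame2frame3")
played = list(stream(media))
print(played)  # [b'frame1', b'frame2', b'frame3']
```

In a real player the chunks would come over the network and feed a decode buffer, but the principle is the same: playback begins after the first chunk, not after the last.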
Summing up, the evolution and accessibility of the Internet, together with the change
in the audience's behavior, enabled the creation of a new business model – the
streaming platform. From the customer's point of view, such a platform gives access to
a variety of content that was previously harder to get, and customers are no longer
subject to television's schedule and content. At the same time, suppliers can cut costs in
physical inventory and physical stores and increase their network of customers.
Therefore, for both sides it is a win-win situation – a mutually beneficial way of doing
business.
In this context, Netflix earned a reputation and became the world leader [2]. This
article focuses on its adaptation to the new era. As all of the co-authors are Netflix
clients, this provided additional motivation for the case study. Netflix, a former DVD
rental company, recognized how to do business in order to keep up with the trends,
pioneering the digital transformation of the entertainment distribution market.
In the next section, this article presents a brief chronological overview of Netflix,
followed by a literature review. Primary data, gathered in a survey, is also analyzed and
discussed, in order to understand how current and potential customers see this new type
of business and to ascertain what the future of the entertainment market might look like.

2 Netflix

Software engineers Reed Hastings and Marc Randolph founded Netflix in 1997 as
a regular DVD rental business. According to Hastings, the competition was charging
high fees for late returns, and he saw an opportunity for differentiation: creating a
more customer-friendly model [3].
In April 1998, Netflix developed its first game changer: DVD rental by mail.
Customers could select the movie they wanted online and have it delivered to their
door. At that time, VHS dominated the market, and only 2% of the population owned
a DVD player. It was thus a risky strategy, but it clearly shows how innovative the
model was from the very beginning. A year later, Netflix changed its payment method
to a subscription model, under which customers could rent DVDs for a fixed fee
per month.
Figures 1 and 2 show how Netflix’s website has evolved.

Fig. 1. Netflix’s website in 2008 [4]

Fig. 2. Netflix’s website in 2019 [5]

In 2003, Netflix reached one million subscribers. The DVD rental store kept its
subscription model until 2007. The co-founders understood that the company was not
growing anymore, and with the digital transformation affecting almost every company,
it was time to develop a new plan to meet customers' demands.
"We never spent one minute trying to save the DVD business", says Ted Sarandos,
Netflix's head of content since early 2000 [3]. It was all about evolving and improving
the TV industry. Therefore, Netflix began offering the option of streaming licensed
movies, and even a couple of TV shows. This new option quickly gained popularity,
so improving the content library became a priority.

With that in mind, Netflix reached an agreement with Starz Entertainment in 2008.
In 2010, an agreement valued at one billion dollars was announced with Lions Gate
Entertainment, MGM and Paramount Pictures. In the same year, Netflix's app was
launched for iOS.
Although it had come to dominate the market, having good broadcasters and
producers was not enough. Based on IMDb ratings, number of views, customer
feedback and other parameters, Netflix developed an algorithm for a rating system that
could analyze customers' preferences, improving the recommendation model [6].
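As an illustration only – the paper does not detail the actual algorithm, so the weights, signal names and normalization below are hypothetical assumptions, not Netflix's implementation – a rating system that blends an external rating, view counts and customer feedback into one recommendation score might look like this:

```python
# Hypothetical sketch: weights and signals are illustrative assumptions.
WEIGHTS = {"external_rating": 0.5, "views": 0.3, "feedback": 0.2}

def blended_score(external_rating, views, feedback, max_views=1_000_000):
    """Blend three signals, each normalized to [0, 1], into one score."""
    signals = {
        "external_rating": external_rating / 10.0,  # an IMDb-style 0-10 rating
        "views": min(views / max_views, 1.0),       # capped view-count popularity
        "feedback": feedback,                       # share of positive feedback, 0-1
    }
    return sum(WEIGHTS[name] * value for name, value in signals.items())

def recommend(catalog, top_n=2):
    """Rank (title, rating, views, feedback) tuples by blended score, highest first."""
    return sorted(catalog, key=lambda t: blended_score(*t[1:]), reverse=True)[:top_n]

catalog = [
    ("Show A", 8.5, 900_000, 0.9),
    ("Show B", 6.0, 400_000, 0.5),
    ("Show C", 9.0, 100_000, 0.8),
]
print([name for name, *_ in recommend(catalog)])  # ['Show A', 'Show C']
```

Real recommenders are far more elaborate (collaborative filtering, per-user personalization); this sketch only shows the weighted-blend idea the paragraph describes.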
In 2013, based on that analysis of customer data, Netflix began producing its own
shows. House of Cards was the first of many originals, debuting on February 1 [7].
In 2016, this TV network was already available in 190 countries.
Now, in 2019, Netflix is a case study and an example for the competition, as its
combination of digitalization with content marketing has completely reinvented the cable
era. According to Sarandos, “Pay television didn’t have a distribution problem – it had
a packaging problem and a content problem. We saw that a lot of [cable customers]
were paying for sports they didn’t want and channels they didn’t watch. There’s got to
be much more equilibrium between consumer demand and pricing. Through the growth
of all these direct-to-consumer services, television will become better and better.” [3].
Figure 3 shows a business model canvas for Netflix.

Fig. 3. Netflix's Business Model Canvas [8]



3 Literature Review
3.1 Technological Development
The importance of innovation for competitiveness is well recognized. However, there is
less consensus about what enables an organization to innovate. Innovation networks are
a logical consequence of the increasing complexity of innovative products and services [9].
In this context, technological development has been increasing exponentially in
recent years, affecting the way industries and companies run their businesses in order
to deliver products and services the way customers want them.
Thus, the impact of digitalization on products and services cannot be underestimated.
Digital products are an interesting issue regarding digital value propositions. Many
products today are enabled by connectivity, so that these devices send data back to the
supplier [10].
Therefore, digitalization led to a new era and a new way of doing business, and it is
important to understand its impact on television and on the entertainment market.

3.2 The Entertainment Market


The television industry is no exception to technological development and its changes,
and companies must find ways to deliver their services to customers, aligned with
technological innovations, and aiming to obtain competitive advantages.
The past decade has seen remarkable advances in the Internet as well as its fast
diffusion among diverse consumers. This growth seems to have significantly changed
the consumption of existing entertainment goods, as the Internet provides consumers
with a wealth of information on entertainment goods and increasingly often, digital
versions of entertainment goods themselves. The direction of this change is ambiguous,
however, because more information from the Internet could increase or decrease
consumption. Moreover, the Internet might even become a competitor to existing
entertainment goods [11].
According to [12], the TV industry has evolved into a multi-sided market in recent
years, with distribution platforms increasingly occupying a central position in the
market. [12] also notes that distributors changed their business models, starting to play
a multi-sided role, liaising with third-party content providers, advertisers and viewers.
Analyzing the Netflix case, this was the game changer: the company took full
advantage of new technologies, particularly online streaming, to create a competitive
advantage, occupying a central position in the market and establishing relations with
three different types of entities: content providers, advertisers and viewers.

3.3 Netflix
The major motivation for this change driven by Netflix was the change in consumer
habits, led by the role that the Internet plays in people's lives nowadays. [13] used Pardo
and Johnson's points of view to describe the role that consumers had in the change of the
distribution market. Although these authors have different perspectives on the new
consumer trends, they both agree that technology has had a significant impact on
television consumption. According to [14] there are two types of emerging consumers:
“cord nevers”, who have never subscribed to traditional multichannel video pro-
gramming, opting instead for Internet streaming options, and “cord cutters”, consumers
who previously paid for cable or satellite television, but have decided to stop sub-
scribing. On the other hand, [15] looks to the consumers’ ability to keep up with
technological devices. He relates the digitalization of entertainment with the expansion
of the “Apple ecosystem”. “This iPod/iPhone/iPad generation epitomizes the new peer
group of users whose audiovisual experience is based on all sorts of media platforms
and whose profile to a large extent mirrors that of the cinema-going public and those
who play video games.” [15].
With the Internet’s availability to consumers and the common use of smart devices,
these new technologically driven users interact with entertainment in a more practical
and efficient way [13]. The author adds that “with the prevalence of cord cutters, cord
nevers, and a generation of Apple users, Johnson and Pardo view changes in distri-
butions as a response to growing demand for digital platforms for online television
viewing”.
In short, Netflix was always one step ahead in understanding all these market
changes and in providing the new technological generation with the services they want
to pay for. With a deep understanding of market needs and a better way of doing
business than its competitors, the company was able to climb several positions,
establishing itself at the top of this market.

4 Methodology

One of the purposes of this article was to identify how much a company benefits from
keeping pace with technological advances and understanding market changes, using
Netflix's case to do so. To that end, a survey was created – using Google Forms – as a
research methodology to collect quantitative data and gain in-depth information about
people's habits, preferences and opinions on the subject. The survey gathered data from a
convenience sample (an accessible group of individuals, which is readily available to the
researchers) [16]. Convenience samples are not ideal; however, they are suitable for
exploratory research and as a basis for further research. Convenience samples are also
useful for establishing links to existing related research. This type of sample is very
common in business and management research [16].
A timeline of three weeks was defined to collect the necessary data for the research,
and the form was shared mainly in social media (such as Facebook and Twitter), but
also in forum websites (such as Reddit – an international platform for sharing ideas and
content). Erasmus [Facebook and WhatsApp] groups were also targeted, so that the
form could reach a wider geographic area, as two of the authors had Erasmus contacts
from recent Erasmus experiences.
There was not a specific target audience, and it was deemed important to acquire
information from different age groups, professions and even nationalities. Thus, the

survey gathered 74 answers, mainly from Portugal, but also from Spain, Belgium, Italy,
Turkey, Georgia and Malaysia. As Netflix has customers in over 180 countries, it was
important to reach a wide demographic and geographic area.
Secondary data was also analyzed to reach the paper’s aim.

5 Discussion

An objective of the research was to get survey answers from different segments
(different ages, professions and nationalities, among others); however, the respondents
were mainly millennials, like two of the authors. The data collected is mainly
quantitative; only the last survey question was qualitative. A total of 72% of those who
answered the survey are between 18 and 25 years old, as this is the generation that is
evolving most with technological change.
Of the people who have answered the survey, 90.1% were stream consumers, but
only 59.1% had premium TV channels (paid channels, such as Sport TV, which are
paid for separately). From those 90.1%, 58.3% also said that they watched streams
between two and four times per week, but the majority of premium TV channel
subscribers (63.8%) replied that they watch TV less than twice a week. That is clearly
in line with the new era: consumer habits are changing, and people are getting
used to the digitalization era. A couple of years back, the percentages would probably
have been swapped. The streaming business is growing in popularity, as opposed to
the TV industry, which is declining.
When we say stream consumers, we are talking about multiple streaming
platforms: Netflix, HBO, Hulu, Amazon, among others. However, among all those
platforms, 77.2% are Netflix subscribers. Therefore, if this sample can be taken as a
general example, this survey mostly proves one thing: Netflix is the leader of the
entertainment distribution business, ahead of the competition by a wide margin.
This is also confirmed by the literature [2].
But why? Why is Netflix so superior to their competitors? What makes them the
very best?
According to the respondents, the biggest strength of this platform is content
(54.5%). Price, distribution and the possibility of streaming on multiple devices were
also mentioned, but apparently content is what sets Netflix apart from the competition,
as 81.8% rank Netflix Originals as the best producer among them all.
As for the role of digitalization in ordinary life, 77.3% of the respondents have
classified its impact with high/very high importance, showing the dependence on new
technologies for this group of people.

6 Conclusion and Future Research

The TV entertainment distribution market has changed alongside digitalization. Due to
the fast evolution of technology, companies had to study the market changes, predict
new opportunities and constantly adapt to new habits. This research has shown that
Netflix is the perfect example of this. The way the company studied its market, the

competition and consumers, and the way it used that knowledge to earn competitive
advantage, is beyond inspirational. Netflix has probably already reached its full
potential, which is truly remarkable.
With our Google Forms survey, it was possible to understand that the brand is
already stronger than the competition: people associate it with quality content and
distribution.
As shown before, 72% of those who answered the survey are between 18 and 25
years old, mostly students. For future research, this might be the critical segment to
analyze, as it is the generation to retain as customers.
Additionally, as content seems to be Netflix's biggest strength, the question arises
as to where improvements should be channeled: should the next big strategy focus on
the ongoing development of original content, or should other aspects (such as price and
distribution) be improved upon? What about the possibility of freelancers getting more
attention from Netflix, since Black Mirror and Love, Death & Robots became huge
hits?
Beyond the analysis of strategy, one of the proposed goals for this article was to try
to predict the next acquisition for the platform. Among the several answers to the
survey, some stood out, sometimes due to their repetition and sometimes for their
innovation. Thus, to the question "In your opinion, what is the next step for streaming
platforms?", a few answers merit particular highlight: the acquisition of rights to stream
live sports; streaming e-sports such as CS:GO, League of Legends and Fortnite; Virtual
Reality; or bundling the streaming services in a double package with Internet service.
In fact, for future research, these are themes that point to a whole new market, so new
competitors and markets must be studied in depth.
From the data collected, only 59.1% have premium TV channels. Of that
percentage, it is likely that some only have them so they can watch sports. With the
possible acquisition of sports streams by Netflix, that percentage would probably be
even lower, and it would still improve its biggest strength: content. More variety
means a wider target audience, and sports have a tremendous impact on European
entertainment. People could save on cable TV and get a better service with the
platform.
The same goes for the acquisition of e-sports streams. In a world where gaming sets
new trends, especially among children, competing against Twitch (which dominates
e-sports streaming) could open a new branch of customers for Netflix and further
strengthen its content.
Virtual Reality could also be the next big step in terms of content innovation.
The movie Black Mirror: Bandersnatch was immediately a huge hit, as it pioneered
interaction between the viewer and the movie itself: viewers can step into the story
and choose different endings. Virtual Reality would be the perfect development of
this idea. As a matter of fact, there are already a few Netflix experiments with VR,
and it should not take much longer for them to become regularly used [17].
Finally, hard as it may be, bundling the streaming services with Internet service
would be a game changer. Logically, one cannot be used without the other, so bundling
would improve accessibility. Some Internet providers already offer a free trial of
streaming platforms (Vodafone-HBO, for instance), so a deal between both could be
reached.

Although there are a lot of question marks as to where the next step will take
Netflix, Reed Hastings – Netflix's CEO – remains optimistic, and even enthusiastic.
When asked about the upcoming new competitors in streaming platforms, such as
Apple and Disney, he simply said: "these are amazing, large, well-funded companies
with very significant efforts, but you do your best job when you have great
competitors".

7 Final Words – Netflix and Rural Areas

The Netflix service, focused on herein, has been labeled an "inexpensive legal
streaming service" [18] which has, in fact, due to its low cost, lowered movie piracy.
Due to its low cost, Netflix should have additional appeal in rural areas. Rural areas
are, besides being more isolated, generally poorer, experiencing lower incomes [19].
However, for services such as Netflix to become popular, even in rural areas, fast
Internet connections are necessary, and these may not always be available in certain
regions. Additionally, rural areas are home to aging populations, relevant to the extent
that the elderly are less tech savvy [20, 21]. Connecting to a streaming service may,
therefore, be a problem. We thus suggest, for future research, that the effect and
popularity of Netflix in rural areas be studied. Netflix, and similar streaming services,
may help diminish the exodus from rural areas [19] and provide an important
connection among the local population, as well as with family and friends who have
moved away from such regions. Does the existence of low-cost streaming services,
such as Netflix, have a positive impact on the satisfaction and happiness of resident
rural populations?

References
1. Leiner, B., Cerf, V., Clark, D., Kahn, R., Kleinrock, L., Lynch, D., Postel, J., Roberts, L.G.,
Wolff, S.: Brief History of the Internet—Internet Society (2009)
2. Investopedia. https://www.investopedia.com/articles/personal-finance/121714/hulu-netflix-
and-amazon-instant-video-comparison.asp. Accessed 03 Dec 2019
3. Littleton, C., Roettgers, J.: How Netflix Went From DVD Distributor to Media Giant (2018).
https://variety.com/2018/digital/news/netflix-streaming-dvds-original-programming-1202910
483/. Accessed 31 Oct 2019
4. Business Insider. https://www.businessinsider.com/how-netflix-has-looked-over-the-years-
2016-4#in-2010-streaming-begins-to-be-more-than-an-add-on-and-gets-prominent-real-estate-
on-the-home-page-5. Accessed 03 Dec 2019
5. Netflix. https://www.netflix.com/browse. Accessed 03 Dec 2019
6. Oomen, M.: Netflix: How a DVD rental company changed the way we spend our free time
(2019). Business Models Inc. https://www.businessmodelsinc.com/exponential-business-
model/netflix/. Accessed 31 Oct 2019
7. Venkatraman, N.V.: Netflix: A Case of Transformation for the Digital Future (2017). https://
medium.com/@nvenkatraman/netflix-a-case-of-transformation-for-the-digital-future-4ef612c
8d8b. Accessed 31 Oct 2019
8. BMI - Business Models Inc. https://www.businessmodelsinc.com/exponential-business-model/netflix/. Accessed 03 Dec 2019
9. Calia, R.C., Guerrini, F.M., Moura, G.L.: Innovation networks: from technological
development to business model reconfiguration. Technovation 27(8), 426–432 (2007)
10. Ritter, T., Lund, C.: Digitization capability and the digitalization of business models in
business-to-business firms: past, present, and future. Ind. Mark. Manag. (November), 1–11
(2019)
11. Hong, S.H.: The recent growth of the internet and changes in household-level demand for
entertainment. Inf. Econ. Policy 19(3–4), 304–318 (2007)
12. Evens, T.: Clash of TV platforms: how broadcasters and distributors build platform
leadership. In: 25th European Regional Conference of the International Telecommunications
Society (ITS), Brussels, Belgium, 22–25 June 2014. ECONSTOR (2014)
13. Aliloupour, N.P.: Impact of technology on the entertainment distribution market: the effects
of Netflix and Hulu on cable revenue. Open access senior thesis. Bachelor of Arts.
Claremont Graduate University (2015)
14. Johnson, C.M.: Cutting the cord: leveling the playing field for virtual cable companies. Law
School Student Scholarship, Paper 497 (2014)
15. Pardo, A.: Digital hollywood: how internet and social media are changing the movie
business. In: Friedrichsen, M., Muhl-Benninhaus, W. (eds.) Handbook of Social Media
Management, pp. 329–348 (2013)
16. Bryman, A., Bell, E.: Business Research Methods, 4th edn. Oxford University Press, Oxford
(2015)
17. Alvarez, E.: Netflix is taking a wait-and-see approach to virtual reality (2018). Engadget.
https://www.engadget.com/2018/03/07/netflix-virtual-reality-not-a-priority/. Accessed 31
Oct 2019
18. Nhan, J., Bowen, K., Bartula, A.: A comparison of a public and private university of the
effects of low-cost streaming services and income on movie piracy. Technol. Soc. 60,
101213 (2020)
19. Comissão Europeia - Portugal – A PAC no seu país. https://ec.europa.eu/info/sites/info/files/
food-farming-fisheries/by_country/documents/cap-in-your-country-pt_pt.pdf. Accessed 20
Jan 2020
20. Gonçalves, R., Oliveira, M.A.: Interacting with technology in an ever more complex world:
designing for an all-inclusive society. In: Wagner, C.G. (ed.) Strategies and Technologies for
a Sustainable Future, pp. 257–268. World Future Society, Boston (2010)
21. Fontoura, A., Fonseca, F., Piñuel, M.D.M., Canelas, M.J., Gonçalves, R., Au-Yong-Oliveira,
M.: What is the effect of new technologies on people with ages between 45 and 75? In:
Rocha, Á., et al. (eds.) New Knowledge in Information Systems and Technologies,
WorldCist 2019, La Toja Island, Spain, 16–19 April. Advances in Intelligent Systems and
Computing (Book of the AISC Series), vol. 932, pp. 402–414. Springer (2019)
An Online Sales System to Be Managed
by People with Mental Illness

Alicia García-Holgado, Samuel Marcos-Pablos,
and Francisco J. García-Peñalvo

GRIAL Research Group, Computer Sciences Department, Research Institute
for Educational Sciences, University of Salamanca, Salamanca, Spain
{aliciagh,samuelmp,fgarcia}@usal.es

Abstract. The percentage of the population aged 65 and over has been increasing
during the last decades. It is one of the problems that the European health
system, and in particular the Spanish system, has to face. This increase is
linked to the rise in dependent people, who suffer progressive deterioration
of both their physical and mental capacities. In this context, technology plays a
key role in improving the quality of life, not only of older people but also of their
caregivers. A technological ecosystem to support patients with mental illness,
their caregivers, and the connection with their relatives was developed in previous
works. This solution is prepared to evolve according to the users’ and
organization’s needs. In this sense, the present work describes the inclusion of a
new software tool, an online sales platform that promotes active ageing, seeking
that it can be used and managed by older people who may have cognitive
impairment problems. Although there are many e-commerce platforms on the
market, they do not consider users with special needs. The objective has not been to
develop a software prototype from scratch, but to focus on aspects relating to
accessibility and usability to improve online stores and apply these improvements
to an existing solution, following the philosophy of Open Source software
development. This work aims to describe the definition process itself.

Keywords: Web accessibility · Technological ecosystem · E-commerce ·
Rural areas · Mental health · Heuristic · Cognitive impairment

1 Introduction

Today it is a fact that Spain, like the rest of the European Union (EU), is ageing.
According to data from the National Statistics Institute of Spain (INE), Spain registered
a new ageing historical maximum in 2018, continuing with the ascending trend of the
last decade. The percentage of the population aged 65 and over, which currently stands
at 19.2% of the total population, is expected to rise to 25.2% in 2033. In this sense,
and if current trends continue, the dependency ratio (the quotient, as a percentage, between
the population aged under 16 or over 64 and the population aged 16 to 64) would rise
from 54.2% today to 62.4% in 2033.
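For illustration, the dependency ratio defined above can be computed directly. This is a minimal sketch; the population figures used are rounded placeholders chosen to approximate the cited 54.2%, not official INE data.

```python
# Dependency ratio as defined above: dependents (under 16 or over 64)
# per 100 people of working age (16 to 64).
def dependency_ratio(under_16: float, over_64: float, aged_16_to_64: float) -> float:
    return 100 * (under_16 + over_64) / aged_16_to_64

# Illustrative figures in millions (placeholders, not INE data):
ratio = dependency_ratio(7.4, 9.1, 30.4)  # ~54.3, close to the cited 54.2%
```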
© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 600–611, 2020.
https://doi.org/10.1007/978-3-030-45697-9_58

Given this reality, active ageing policies in Spain have received special attention in
the last decade. Active ageing is a concept defined by the World Health Organization
(WHO) [1, 2] as the process of optimizing opportunities for health, participation and
safety in order to improve the quality of life as people age. In particular, in order to
promote active ageing in an environment in which the penetration of technology into
different areas of life is already a reality, solutions must be proposed that allow the
active participation of older citizens in the Digital Society. In this sense, different
initiatives have been promoted, like the 2011 European Agenda for Adult Learning
(EAAL), which defines the focus of European cooperation on adult education policies
for the period up to 2020, the Active Assisted Living Programme (AAL), or the
Interuniversity Programme of Experience running since 2002-2003 in the Autonomous
Region of Castile and León (Spain) [3]. The main goal is to promote ways for the
senior population to acquire new job skills within the Digital Society, promoting an
active lifestyle and avoiding social exclusion. However, an important aspect of older
people as they age is the progressive deterioration of both their physical and mental
capacities, which can make it difficult for them to use technological solutions [4].
With this in mind, a technological ecosystem [5] has been developed with two
fundamental objectives. First, improving the quality of life of (in)formal caregivers
through learning, ubiquitous access to information and support. Second, providing a set
of services for relatives and patients with a particular focus on those who live in rural
areas. Thus, the ecosystem integrates different software components with the aim of
improving the welfare work based on three pillars: teaching-learning, to provide the
necessary and specific training to (in)formal caregivers to provide care to the elderly;
social, with the aim of sharing experiences on the process of learning and welfare work,
also providing means to avoid the social exclusion of caregivers; and finally a dash-
board for ecosystem management and obtaining metrics that can be used for monitoring
and proposing new actions both at the welfare level and for the management of the
ecosystem itself. On the other hand, and given the inherent evolutionary approach of
technological ecosystems that must allow the incorporation of new components [5, 6],
an online sales platform has been developed that promotes active ageing, seeking that it
can be used and managed by older people who may have cognitive impairment
problems (as well as other people with other mental illness).
In this paper, the development of the prototype of an online store that is inclusive,
so that it takes into account all types of users and can also be managed by people with
different abilities, particularly people with severe and prolonged mental illness, is
presented. It has to be taken into account that, in the context of mental diseases, each
patient has a unique clinical picture [7], making it very difficult for two people to share the
same symptoms or the same reactions to similar situations; special care must therefore be
taken in terms of the user experience (accessibility, usability, etc.). Although there are
many e-commerce platforms on the market (Etsy, Shopify, Bigcartel, Amazon, etc.),
they focus on improving the usability of the system but do not consider users with
special needs [8]. People with severe and prolonged mental illness must be able to
manage sales or make purchases over the Internet. The aim has been to develop a
system that allows a simplified sales network adapted to both workers and customers.
To do this, the objective has not been to develop a software prototype from scratch, but
to focus on aspects relating to accessibility and usability to improve online stores and
apply these improvements to an existing solution, following the philosophy of Open
Source software development.

The rest of the paper is organized as follows. Section 2 provides an overview of the
ecosystem. Section 3 describes the online sales platform. Section 4 presents the results
of the heuristic evaluation of the interface for consumers. Finally, Sect. 5 summarizes
the main conclusions of this work.

2 Ecosystem Overview

Technological ecosystems are solutions to support knowledge management in different
contexts. Although similar concepts appear in the literature, such as software ecosystem [9]
or digital ecosystem [10, 11], these approaches are mainly based on a central platform
to connect or develop other tools, and the actors connected with the ecosystem are
developers or stakeholders involved in the development process. In contrast, the
technological ecosystem approach proposes a decentralized solution in which different
software tools, human resources and information flows compose the ecosystem.
Furthermore, the technological ecosystem is focused on supporting evolution in order to
adapt continuously to the needs of the organization.
In this context, a technological ecosystem for (in)formal caregivers was developed
with the support of the DEFINES project, “A Digital Ecosystem Framework for an
Interoperable NEtwork-based Society” (Ref. TIN2016-80172-R), funded by the
Spanish Ministry of Economy and Competitiveness; and the TE-CUIDA project,
“Technology ecosystem to support caregivers” (Ref. SA061P17), funded by the
Ministry of Education of the Regional Government of Castile and León.
The ecosystem aims to develop and enhance the competences of the caregivers,
both formal and informal, as well as mitigate the negative effects produced by care-
giving activities such as overload, depression, or anxiety. Caregivers prevent the
institutionalization of the dependent persons enabling them to stay at home and thus
reducing care costs, but this has an impact on the wellbeing of caregivers, reducing
their quality of life and increasing social isolation and family stress. Also, being an
online solution allows access to services ubiquitously, facilitating resources such as
psychoeducation, or communication with relatives to those who live in rural areas.
Any technological ecosystem is composed of three main elements [6]: software
tools to provide different services within the ecosystem; human resources to ensure the
evolution of the ecosystem and to be directly involved in the knowledge management
processes; and information flows to provide the interaction among tools, and between
tools and human resources. These elements are represented in an architectural pattern
and a metamodel that support the definition of technological ecosystems based on
previous experiences. Both proposals, the architectural pattern and the metamodel, are the
main results of a doctoral dissertation [12].
The definition of the technological ecosystem for caregivers is based on those
proposals. In this work, the architectural pattern is used to describe the main elements
of the ecosystem (Fig. 1). The ecosystem is organized in four layers - presentation,
services, static data management and infrastructure - and two input streams that
introduce the human factor as another element of the technological ecosystem.
The top layer, presentation, is focused on ensuring the usability of the software
components with a particular focus on user experience. Although usually this layer also
provides a uniform interface to all the components, in this proposal the branding
associated with the ecosystem has not yet been fully implemented.
The second layer provides the software components with the main user-level services.
Initially, the ecosystem consisted of three components, but it later incorporated a fourth
component, the online store. The first service is an online platform that provides a set of
private and safe areas for patients, relatives and caregivers, so they can stay in
contact regardless of where they live or other socioeconomic circumstances [13]. Walls are
managed by (in)formal caregivers and care managers, but the patients and their
relatives may also be granted access to the social network.

Fig. 1. The architecture of the technological ecosystem for (in)formal caregivers. Based on [14]

The second service is focused on psychoeducation [15, 16] for (in)formal caregivers
in order to provide them with training support to cover different knowledge
needs, but also information, advice, and guidance, as well as access to a community of
equals and experts. The third service is a dashboard to support decision-making
processes. The knowledge managed in the ecosystem comes from different sources,
such as the information stored in the different tools of the ecosystem, the implicit and
tacit knowledge of the users, and their interaction with the ecosystem. The
dashboard combines these sources through data visualization. Finally, the fourth service
is the online store described in this proposal, which simplifies the sales network adapted
to both workers and customers with special needs.

Regarding the static data management layer, it provides tools to centralize
information needed by other components of the ecosystem. This layer has a database to
store data associated with the patients: caregiving activities, provided treatments, etc.
The last layer is the infrastructure; it provides a set of services that are used by the
software components from other layers: in particular, the mail server, the user
management tool based on CAS (Central Authentication Service), and a tool to support data
analysis as a service for the dashboard.
Finally, the human factor is represented in the architecture through two input flows:
the business plan from a management point of view, and the training plan and medical
protocol from a methodological perspective. Furthermore, as sensitive medical data
may be generated within the ecosystem components, the human factor
should take into account the ethics and data protection necessary to guarantee safe data
governance [14].
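Summarizing the description above, the ecosystem's four layers, their components, and the two human-factor input flows can be captured in a simple mapping. This representation is an assumption made for clarity; the component names follow the text, but the structure itself is not the authors' metamodel.

```python
# Four-layer architecture of the ecosystem, as described in the text.
ECOSYSTEM = {
    "presentation": ["user experience / branding"],
    "services": [
        "private social network",    # patients, relatives and caregivers
        "psychoeducation platform",  # training for (in)formal caregivers
        "dashboard",                 # decision-making support
        "online store",              # the component added in this work
    ],
    "static data management": ["patient database"],
    "infrastructure": ["mail server", "CAS user management", "data analysis service"],
}

# The human factor enters through two input flows.
HUMAN_FACTOR_INPUTS = ["business plan", "training plan and medical protocol"]
```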

3 Online Sales System

The online sales platform was planned to be integrated with the activities of the Special
Employment Centre in Zamora (Spain) (in Spanish, Centro Especial de Empleo, CEE).
The origins of the CEE are in the needs detected by professionals and associations
dedicated to the rehabilitation and socio-labour reinsertion of people with disabilities
due to serious and prolonged mental illness. The principal activity of the CEE consists
of the creation of spaces that allow the labour integration of people with serious and
prolonged mental illness, such as the design, production and marketing of products and
services in which people with disabilities participate, promoting and encouraging their
training and employment. Among the activities of the CEE in Zamora is the cultivation
and marketing of organic fruit and vegetables in rural areas of Castile and León, the
development of craft products with various materials, as well as cleaning and catering
services.
The online sales platform aims to support the distribution of the products created in
the activities associated with the CEE, so that people with mental illness are
involved not only in the production phase but also in the sales phase. Furthermore, the
target audience is all types of users, but with particular emphasis on those who have
some mental illness or disability.
The online sales system has been developed within the scope of the technological
ecosystem presented in Sect. 2. For that purpose, an approach similar to the one
followed in [5] was adopted, in which the authors showed the importance of modelling the business
structure along with the software structure during the early stages of ecosystem
development. The main objective is not to follow a “business first” approach, but to
develop the business and software structures together, as they are complementary.
Taking the different business processes into account while developing the software
structure provides fundamental “constraints” or characteristics for the software
structure, such as the data taxonomy and ontology, the data architecture, and the data
security and lifecycle.

3.1 Business Structure Model

As stated before, the development of the business model of the online store started
with the modelling of the business structure. To do so, Business Process Model and
Notation (BPMN) version 2.0 has been employed. BPMN is a standard flow-chart
notation for business process modeling that describes the steps of a planned business
process from end to end. As such, a set of diagrams describing the different tasks
involved in the online store business processes has been obtained. They start from a
high-level process conception using BPMN choreography diagrams, which show the
interaction between participants (described as senders and receivers) and concentrate on the
message flow. From there, more detailed collaboration diagrams were developed that
focus on the different tasks of the business to be carried out by the different participants to
achieve a particular goal. Finally, lower-level business processes describing the different
tasks that must be performed by the software components can be obtained. Figures 2, 3
and 4 show example instances of the modelled business processes, including a
high-level choreography diagram for placing an order, the collaboration diagram
between the customer and the online sales system that describes the process of purchasing
a product, and the process diagram of the login task in the sales system.

Fig. 2. BPMN 2.0 High-level choreography diagram for placing an order.

Fig. 3. BPMN 2.0 Process diagram of the login task in the sales system.

Fig. 4. BPMN 2.0 Collaboration diagram between the customer and the online sales system that
describes the process of purchasing a product.
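For illustration, the purchase flow captured in these diagrams can be sketched as a minimal state machine. The states and events below are assumptions inferred from the diagram descriptions in the text, not the actual BPMN 2.0 models.

```python
# Purchase process as a simple state machine (states and events are
# assumed from the diagram descriptions; they do not reproduce the
# actual BPMN 2.0 models).
PURCHASE_FLOW = {
    "browsing": {"add_to_cart": "cart"},
    "cart": {"checkout": "login"},
    "login": {"login_ok": "payment", "login_failed": "login"},
    "payment": {"payment_confirmed": "order_placed"},
}

def step(state: str, event: str) -> str:
    """Advance the process; unknown events leave the state unchanged."""
    return PURCHASE_FLOW.get(state, {}).get(event, state)
```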

3.2 Software Architecture

The online store was intended to be integrated with the business management software,
which was already in use by the CEE. In particular, the ERP (Enterprise Resource
Planning) software used by the CEE was Navision [17]. This integration aims to
facilitate the management processes associated with the store, so people with prolonged
mental illness will be able to manage sales.
Based on the developed business process models, a study has been carried out of
the functional and non-functional requirements of the software prototype of the online
store, as well as the requirements and services necessary for its integration with the
accounting and warehouse management system managed by the ERP. The online store
is based on Spree Commerce, an Open source e-commerce framework developed in
Ruby on Rails, which was adapted to the identified needs through the development of
extensions. In particular, the integration of the online store with the ERP is done
through web services under the SOAP standard, following the specifications provided
by the ERP.
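As a rough illustration of this kind of SOAP-based integration, the sketch below builds a SOAP 1.1 request envelope with the Python standard library. The operation name (`GetStock`), namespace and field are hypothetical placeholders; the real messages follow the specifications provided by the ERP.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def soap_envelope(operation: str, namespace: str, fields: dict) -> bytes:
    """Serialize a minimal SOAP 1.1 request for one operation."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{namespace}}}{operation}")
    for name, value in fields.items():
        ET.SubElement(op, f"{{{namespace}}}{name}").text = str(value)
    return ET.tostring(envelope)

# Hypothetical stock query; "GetStock" and "ItemNo" are placeholders.
request = soap_envelope("GetStock", "urn:store-erp", {"ItemNo": "PROD-001"})
```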
The functional requirements included a list and description of the different contents
necessary for the development of the web services of the sales platform and their
integration with the ERP: user roles (administrator, manager, client, anonymous user);
data necessary for the client to register on the web (email, password, client key in
Navision, etc.); product information (name, photograph, base price, description, etc.);
order execution (payment method, shipping and collection method, customer number,
billing data, etc.); tracking (order date, link to tracking, etc.); and others. A description
of the different interfaces necessary for the development of the web services required to
integrate these data into the ERP was also developed. On the other hand, considerations
on non-functional requirements were also taken into account: integration with existing
solutions, usability and internationalization.
Likewise, a document has been developed that describes, from a technical point of
view, the different scenarios of the online sales system as well as its integration with the
ERP. These scenarios have been subdivided into the following sections: (1) description
of the scenario; (2) procedure (different steps that take place in the scenario); (3) UML
sequence diagram that implements the procedure; (4) data set needed to carry out each
of the ERP transactions required for each scenario.
The considered scenarios were:
• Product creation/modification: assign extra data to existing products and add them to
the store, or create new products (not yet existing in the ERP), as well as delete
products.
• Stock visualization: evaluate stock each time a customer visits a product page or
tries to make a purchase.
• Sales: record sales in the ERP each time a customer buys something through the
web.
• Sales cancellation: allow the customer to cancel a purchase (as long as the shipment
has not been made) and reflect the changes in the ERP.
• From draft sale to historical sale: sales are created as drafts so that the customer can
cancel them if desired. After the package has been shipped, these sales must be
converted to historical.
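The last scenario, moving a sale from draft to historical, can be sketched as follows. The class and field names are illustrative assumptions, not the actual ERP schema.

```python
from dataclasses import dataclass

@dataclass
class Sale:
    order_id: str
    status: str = "draft"  # drafts can still be cancelled by the customer

    def cancel(self) -> None:
        if self.status != "draft":
            raise ValueError("only draft sales can be cancelled")
        self.status = "cancelled"

    def mark_shipped(self) -> None:
        if self.status != "draft":
            raise ValueError("sale is no longer a draft")
        self.status = "historical"  # shipped sales become historical
```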
Figure 5 shows an example of the sequence diagram for adding a new product
within the product creation/modification scenario.
The next stage consisted of the visual design of the online store, focused on
providing a satisfying user experience with special emphasis on users with severe and
prolonged mental illness. To this end, and based on the results of the previous
activities, the different screens associated with the different scenarios and components
needed to meet the functional requirements have been defined, following the standards
of the World Wide Web Consortium (W3C). More specifically, the store design takes
into account the Web Content Accessibility Guidelines (WCAG) 2.0 standard for users
with cognitive issues [18], as well as the recommendations for the cognitive
accessibility of web content available at the Web Accessibility Initiative (WAI) [19].
Although not all of the WCAG 2.0 criteria were applied, the WAI recommends
meeting at least the WCAG 2.0 Level A and AA criteria, along with some Level AAA
criteria that are particularly important for people with cognitive difficulties.
Finally, the connectivity of the online store with the ERP has been deployed, from
which an alpha version of the sales platform has been obtained. The final prototype is
available at http://dueroland.grial.eu.

Fig. 5. UML sequence diagram for adding a product.

4 Heuristic Evaluation

Finally, the beta version of the platform has been developed, incorporating in its
design the usability and accessibility aspects previously identified, and its usability has
been studied in two ways: through heuristic tests carried out by usability experts, and
through tests with the platform workers, which are planned for the future.
The first part of the usability study, the heuristic evaluation, was focused on the
interface for clients. It was carried out by two experts, two men between 25 and 38
years old. Neither expert had used the online store previously. One expert was
involved in the projects that support the definition and development of the system, and
the other was directly involved in the development. Neither expert has
cognitive impairments or mental illness, but both have knowledge related to technology
applied to mental health. In addition to these characteristics, the criteria used to select the
experts were based on their professional profiles:

• E1: A full stack developer with six years of experience developing online platforms
for health and wellbeing sectors.
• E2: A researcher with more than ten years of experience in multimodal human-
computer interaction.
Each expert reviewed the online store. They identified the usability problems
associated with each heuristic proposed by Nielsen [20] and assigned a value from 1
(major usability problems) to 10 (no usability problems). Table 1 summarizes the
values for each heuristic rule. The average for each heuristic was calculated in order to
obtain a final value, which reflects where the most usability issues are concentrated.

Table 1. Assigned values to each heuristic by each expert

Heuristic rule E1 E2 Avg.
HR1: Visibility of system status 10 7 8.5
HR2: Match between system and the real world 10 10 10
HR3: User control and freedom 10 7 8.5
HR4: Consistency and standards 10 5 7.5
HR5: Error prevention 9 8 8.5
HR6: Recognition rather than recall 10 7 8.5
HR7: Flexibility and efficiency of use 10 7 8.5
HR8: Aesthetic and minimalist design 7 7 7
HR9: Help users recognize, diagnose, and recover from errors 8 7 7.5
HR10: Help and documentation 4 6 5
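The averages in Table 1 can be reproduced directly from the experts' scores, for instance:

```python
# Expert scores copied from Table 1 (1 = major problems, 10 = none).
scores = {
    "HR1": (10, 7), "HR2": (10, 10), "HR3": (10, 7), "HR4": (10, 5),
    "HR5": (9, 8), "HR6": (10, 7), "HR7": (10, 7), "HR8": (7, 7),
    "HR9": (8, 7), "HR10": (4, 6),
}
averages = {hr: sum(v) / len(v) for hr, v in scores.items()}
worst = min(averages, key=averages.get)  # heuristic with the most issues
```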

5 Discussion and Conclusions

Active ageing is one of the main objectives of current society. The proportion of the
population aged over 65 has increased during the last decades, and the figures will
continue to rise. This situation poses a challenge for health systems across the world.
In this context, active ageing is one of the main objectives of the World Health
Organization in order to improve the quality of life as people age. This approach is
combined with technology to provide solutions that allow the active participation of older
citizens in the Digital Society.
The current proposal provides a set of guidelines to develop an inclusive
online store, one that takes into account all types of users and can also be managed by
people with different abilities, particularly people with severe and prolonged mental
illness. The result is a prototype of an online sales platform in which
production and sales are carried out by people with mental illness.
It is important to emphasize the context in which the prototype of the online store
has been deployed. Although it is an online solution, accessible from any region in the
world, it is specially focused on promoting the distribution of products among the
different rural areas in Zamora, in the first instance, and subsequently at the regional
level, in Castile and León, and the national level, Spain. Currently, the prototype is
operative and contains products made in the workshops and activities of the CEE, with
home delivery service available for the Zamora region, including the entire
metropolitan area and rural areas of the province.
The tool was successfully integrated with the ERP to manage sales, and it was also
successfully incorporated into the technological ecosystem for (in)formal caregivers
thanks to the architecture proposed in Fig. 1.
Regarding the heuristic evaluation, it is important to take into account the bias of
this procedure. According to [21], the perception of evaluators using this method is
not consistent with users’ experience of a system. Despite this, the results
obtained are useful to improve the system and to prepare the next phase with real users.
The experts detected problems associated with most of the heuristic rules, although there
are significant differences between them. The most significant usability problem
is associated with HR10 (Help and documentation): there is no documentation or user
support available in the online store. In addition, both experts detected several problems in
HR8 (Aesthetic and minimalist design), most of them related to the lack of
products and of images associated with the available products.
Finally, the findings of this study have a number of important implications for
improving the development of e-commerce platforms adapted to users with different
abilities. On the other hand, future work is needed to complete the usability
study with qualitative techniques, such as focus groups with end users with different
abilities and mental illness.

Acknowledgments. This work has been partially funded by the Spanish Ministry of Economy
and Competitiveness through the DEFINES project (Ref. TIN2016-80172-R) and the
Ministry of Education of the Junta de Castilla y León (Spain) through the TE-CUIDA project
(Ref. SA061P17).

References
1. Kalache, A., Gatti, A.: Active ageing: a policy framework. Adv. Gerontol. Uspekhi
Gerontol. Akad. Nauk. Gerontol. Obs. 11, 7–18 (2003)
2. WHO: Active Ageing: A Policy Framework. World Health Organization, Geneva (2002)
3. Cámara, C.P., Eguizábal, A.J.: Quality of university programs for older people in Spain:
innovations, tendencies, and ethics in European higher education. Educ. Gerontol. 34, 328–
354 (2008)
4. Stompór, M., Grodzicki, T., Stompór, T., Wordliczek, J., Dubiel, M., Kurowska, I.:
Prevalence of chronic pain, particularly with neuropathic component, and its effect on overall
functioning of elderly patients. Med. Sci. Monit.: Int. Med. J. Exp. Clin. Res. 25, 2695–2701
(2019)
5. García-Holgado, A., Marcos-Pablos, S., García-Peñalvo, F.J.: A model to define an eHealth
technological ecosystem for caregivers. In: Rocha, Á., Adeli, H., Reis, L., Costanzo, S. (eds.)
New Knowledge in Information Systems and Technologies. WorldCIST 2019. Advances in
Intelligent Systems and Computing, vol. 932, pp. 422–432. Springer, Cham (2019)
6. García-Holgado, A., García-Peñalvo, F.J.: Architectural pattern to improve the definition and
implementation of eLearning ecosystems. Sci. Comput. Program. 129, 20–34 (2016)
7. Malla, A., Joober, R., Garcia, A.: “Mental illness is like any other medical illness”: a critical
examination of the statement and its impact on patient care and society. J. Psychiatry
Neurosci.: JPN 40, 147–150 (2015)
8. Gonçalves, R., Rocha, T., Martins, J., Branco, F., Au-Yong-Oliveira, M.: Evaluation of e-
commerce websites accessibility and usability: an e-commerce platform analysis with the
inclusion of blind users. Univ. Access Inf. Soc. 17, 567–583 (2018)
9. Manikas, K., Hansen, K.M.: Software ecosystems – a systematic literature review. J. Syst.
Softw. 86, 1294–1306 (2013)
10. Pillai, K., King, H., Ozansoy, C.: Hierarchy model to develop and simulate digital habitat
ecosystem architecture. In: 2012 IEEE Student Conference on Research and Development
(SCOReD). IEEE, USA (2012)
11. Ostadzadeh, S.S., Shams, F., Badie, K.: An architectural model framework to improve
digital ecosystems interoperability. In: Elleithy, K., Sobh, T. (eds.) New Trends in
Networking, Computing, E-learning, Systems Sciences, and Engineering. Lecture Notes in
Electrical Engineering, vol. 312, pp. 513–520. Springer, Cham (2015)
12. García-Holgado, A.: Análisis de integración de soluciones basadas en software como
servicio para la implantación de ecosistemas tecnológicos educativos. Programa de
Doctorado en Formación en la Sociedad del Conocimiento. University of Salamanca,
Salamanca, Spain (2018)
13. García-Peñalvo, F.J., Franco Martín, M., García-Holgado, A., Toribio Guzmán, J.M., Largo
Antón, J., Sánchez-Gómez, M.C.: Psychiatric patients tracking through a private social
network for relatives: development and pilot study. J. Med. Syst. 40 (2016). Article no. 172
14. Marcos-Pablos, S., García-Holgado, A., García-Peñalvo, F.J.: Modelling the business
structure of a digital health ecosystem. In: Conde-González, M.Á., Rodríguez Sedano, F.J.,
Fernández Llamas, C., García-Peñalvo, F.J. (eds.) Proceedings of the 7th International
Conference on Technological Ecosystems for Enhancing Multiculturality, TEEM 2019,
León, Spain, 16–18 October 2019, pp. 838–846. ACM, New York (2019)
15. Geldmacher, D.S., Kirson, N.Y., Birnbaum, H.G., Eapen, S., Kantor, E., Cummings, A.K.,
Joish, V.N.: Implications of early treatment among Medicaid patients with Alzheimer’s
disease. Alzheimer’s Dement. 10, 214–224 (2014)
16. Ostwald, S.K., Hepburn, K.W., Caron, W., Burns, T., Mantell, R.: Reducing caregiver
burden: a randomized psychoeducational intervention for caregivers of persons with
dementia. Gerontologist 39, 299–309 (1999)
17. Diffenderfer, P.M., El-Assal, S.: Microsoft Dynamics NAV: Jump Start to Optimization.
Vieweg+Teubner Verlag (2008)
18. W3C: Web Content Accessibility Guidelines (WCAG) 2.0 (2008)
19. Cognitive Accessibility at W3C. Web Accessibility Initiative (WAI). http://bit.ly/2QSj2sG
20. Nielsen, J.: Heuristic evaluation. In: Nielsen, J., Mack, R.L. (eds.) Usability Inspection
Methods, vol. 17, pp. 25–62. Wiley, Hoboken (1994)
21. Khajouei, R., Ameri, A., Jahani, Y.: Evaluating the agreement of users with usability
problems identified by heuristic evaluation. Int. J. Med. Inform. 117, 13–18 (2018)
Author Index

A
Abelha, António, 466, 476, 484, 503, 510
Abnane, Ibtissam, 15
Agredo-Delgado, Vanessa, 203
Aguiar, Joyce, 108
Akyar, Özgür Yaşar, 357, 367, 377, 397
Alami, Hassan, 36, 86
Aldhayan, Manal, 95
Ali, Raian, 95
Almeida, Ana, 54
Almourad, Mohamed Basel, 95
Alvarez, Gustavo, 137
Alves, Victor, 441, 452, 493
Amato, Cibelle, 387
Araújo, Miguel, 3
Arias, Susana, 137
Au-Yong-Oliveira, Manuel, 3, 590
Aydin, Mehmet N., 531

B
Bachiri, Mariam, 36
Badia, David, 570
Bădică, Amelia, 192
Bădică, Costin, 192
Balaban, Igor, 152
Barroso, João, 3
Bergande, Bianca, 142
Berrios Aguayo, Beatriz, 245
Bin Qushem, Umar, 357
Boitsev, Anton, 235
Brdar, Sanja, 544
Bylieva, Daria, 225

C
Cădar, Ionuț Dan, 307
Canaleta, Xavi, 570
Cardoso, Henrique Lopes, 108
Carneiro, João, 54
Carrillo-de-Gea, Juan Manuel, 25
Carvalho, Victor, 108
Caussin, Bernardo, 397
Cham, Sainabou, 95
Chamba, Franklin, 137
Chevereșan, Romulus, 429
Collazos, Cesar A., 203
Colmenero Ruiz, María Jesús Yolanda, 245
Costa Tavares, João A., 590
Costanzo, Sandra, 287
Costas Jauregui, Vladimir, 357, 367, 397
Crnojević, Vladimir, 544
Cunha, Carlos R., 579

D
De Weerdt, Jochen, 523
Demirhan, Gıyasettin, 367

E
Egorova, Olga, 235
El Asnaoui, Khalid, 44
Eliseo, Maria Amelia, 387, 397
Encinas, A. H., 295
Estrada, Rogelio, 120
Ezzat, Mahmoud, 65

© The Editor(s) (if applicable) and The Author(s), under exclusive license
to Springer Nature Switzerland AG 2020
Á. Rocha et al. (Eds.): WorldCIST 2020, AISC 1161, pp. 613–615, 2020.
https://doi.org/10.1007/978-3-030-45697-9

F
Fardoun, Habib M., 203
Faria, Brígida Mónica, 108
Fernandes, Catarina, 466
Fernandes, Filipe, 452
Fernandes, Gisela, 334
Fernandes, Joana, 579
Fernández-Alemán, José Luis, 25
Ferreira, Diana, 510
Flores, Marcelo, 367
Fonseca, David, 570
Fonseca, Luís, 3
Frazão, Rui, 3
Freitas, Francisco, 317

G
García-Berná, José Alberto, 25
García-Holgado, Alicia, 347, 600
García-Peñalvo, Francisco J., 409, 600
Gomes, João Pedro, 579
Gómez, Héctor, 137
Gonçalves, Helena, 108
Gonçalves, Joaquim, 108
Gonçalves, Ramiro, 557
Govedarica, Miro, 544
Grafeeva, Natalia, 235
Grujić, Nastasija, 544
Guimarães, Tiago, 476, 484, 503

H
Hak, Francini, 476, 484
Hakkoum, Hajar, 15
Hwang, Ting-Kai, 175

I
Idri, Ali, 15, 36, 44, 65, 86
Istrate, Cristiana, 429
Ivanović, Mirjana, 192

J
Jesus, Tiago, 493
Jin, Bih-Huang, 175
Jorge, Filipa, 557

K
Kharbouch, Manal, 86
Knihs, Everton, 347

L
Laato, Samuli, 215
Labrador, Emiliano, 570
Lizarraga, Carmen, 120
Lobatyuk, Victoria, 225
Luís, Ana R., 263

M
Machado, Joana, 452
Machado, José, 466, 510
Magalhães, Ricardo, 493
Marcos-Pablos, Samuel, 600
Marinheiro, Miguel, 590
Marques, Gonçalo, 76
Marreiros, Goreti, 54
Martín-Vaquero, Jesús, 295
Martin, Anne, 142
Martínez Nova, Alfonso, 295
Martinho, Diogo, 54
Martins, Constantino, 54
Martins, Valéria Farinazzo, 387, 397
McAlaney, John, 95
Meissner, Roy, 142
Mejía, Jezreel, 120
Mikhailova, Elena, 235
Miranda, Filipe, 452
Mitrović, Sandra, 523
Mon, Alicia, 203
Morais, Elisabete Paulo, 579
Moreira, Fernando, 203
Motz, Regina, 357, 397, 418
Mounir, Fouad, 253
Munoz, Darwin, 357
Murareţu, Ionuţ Dorinel, 192
Murtonen, Mari, 215

N
Nabil, Attari, 253
Nafil, Khalid, 253
Neves, José, 452
Nicolás, Joaquín, 25
Novović, Olivera, 544

O
Oliveira, Alexandra, 108
Ortega Vázquez, Carlos, 523
Ouhbi, Sofia, 25
Oyelere, Solomon Sunday, 357, 367, 387, 397

P
Paiva, Sandra, 271
Pantoja Vallejo, Antonio, 245
Peixoto, Rui, 317
Peraza, Juan, 120
Perdahci, Ziya N., 531
Petre, Ioana, 429
Pitarma, Rui, 76
Popescu, Daniela, 192
Portela, Carlos Filipe, 317
Portela, Filipe, 334, 466

Q
Queirós, Ricardo, 327
Queiruga-Dios, A., 295
Quiñonez, Yadira, 120
Quintal, Miguel, 503
Qureshi, Adil Masoud, 287

R
Rachad, Taoufik, 36
Rachad, Taoufiq, 86
Ramos, Ana, 441
Raquel, Lia, 271
Redman, Leanne, 86
Redman, Leanne M., 36
Reis, Luís Paulo, 108, 271
Ribeiro, Jorge, 452
Rodés, Virginia, 418
Romanov, Aleksei, 235
Rubtsova, Anna, 225
Ruiz, Pablo H., 203

S
Safak, I., 531
Sanchez, Gloria, 357
Santos, Manuel, 317, 476, 484
Santos, Manuel Filipe, 334, 466, 503
Scheianu, Andrei, 429
Segărceanu, Svetlana, 429
Silva, Eliana, 108
Şimşek, Burcu, 377
Silveira, Ismar Frango, 387, 397
Sobral, Sónia Rolland, 162, 182
Sousa, Regina, 510
Stelate, Youssef, 86
Strilețchi, Cosmin, 307
Suciu, George, 429
Suhonen, Jarkko, 397

T
Teixeira, Mário Sérgio, 557
Teófilo, Luís, 108
Therón, Roberto, 409
Tolpygin, Sergei, 225
Tomczyk, Łukasz, 357, 387, 397
Torreblanca González, José, 295

U
Ungureanu, Cristinel, 192

V
van Hillegersberg, J. (Jos), 531
vanden Broucke, Seppe, 523
Vázquez-Ingelmo, Andrea, 409
Vicente, Dinis, 452
Vicente, Henrique, 452
Vieira, Ana, 54
Villa, Guillem, 570
Villegas, Eva, 570
Volchek, Dmitriy, 235

W
Wang, Su-Chiu, 175

Z
Zaragoza, Rafel, 570
Zatarain, Oscar, 120
Zerouaoui, Hasnae, 44
Zlatović, Miran, 152
