Information, Volume 10, Issue 3 (March 2019) – 38 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the table of contents of newly released issues.
  • Although papers are published in both HTML and PDF forms, PDF is the official format. To view a paper in PDF format, click on the "PDF Full-text" link and open it with the free Adobe Reader.
20 pages, 489 KiB  
Article
Privacy-Preserving Secure Computation of Skyline Query in Distributed Multi-Party Databases
by Mahboob Qaosar, Asif Zaman, Md. Anisuzzaman Siddique, Annisa and Yasuhiko Morimoto
Information 2019, 10(3), 119; https://doi.org/10.3390/info10030119 - 25 Mar 2019
Cited by 14 | Viewed by 3951
Abstract
Selecting representative objects from a large-scale database is an essential task in understanding the database. The skyline query is one of the most popular methods for selecting representative objects: it retrieves the set of non-dominated objects. In this paper, we consider a distributed algorithm for computing the skyline that is efficient enough to handle “big data”. While the value of “big data” makes us want to use it, we must also take care of its privacy. In conventional distributed algorithms for computing a skyline query, each party must disclose the sensitive attribute values of the objects in its private database to other parties for comparison; the privacy of the objects is therefore not preserved. Such disclosures of sensitive information, tolerated in conventional distributed database systems, are not allowed in the modern privacy-aware computing environment. Recently, several privacy-preserving skyline computation frameworks have been introduced; however, most of them rely on computationally expensive secure comparison protocols over homomorphically encrypted data. In this work, we propose a novel and efficient approach for computing the skyline in a secure multi-party computing environment without disclosing the individual attribute values of the objects. We use a secure multi-party sorting protocol based on homomorphic encryption in the semi-honest adversary model to transform each attribute value of the objects without changing their order on each attribute. To compute the skyline, we then use the order of the objects on each attribute to evaluate the dominance relationships among the objects. The security analysis confirms that the proposed framework achieves multi-party skyline computation without leaking sensitive attribute values to others. Our experimental results also validate the effectiveness and scalability of the proposed privacy-preserving skyline computation framework. Full article
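For readers unfamiliar with skyline queries, the dominance relation this abstract relies on can be illustrated with a minimal, non-private sketch in Python. The hotel data and the minimize-both-dimensions convention are illustrative assumptions, not taken from the paper:

```python
def dominates(a, b):
    """a dominates b if a is no worse in every dimension and strictly
    better in at least one (smaller values are better here)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(objects):
    """Return the non-dominated objects (block-nested-loop style)."""
    return [obj for obj in objects
            if not any(dominates(other, obj)
                       for other in objects if other is not obj)]

# Hypothetical (distance, price) pairs: (5, 90) is dominated by (4, 80).
hotels = [(3, 100), (2, 150), (4, 80), (1, 200), (5, 90)]
print(skyline(hotels))  # -> [(3, 100), (2, 150), (4, 80), (1, 200)]
```

The paper's contribution is to evaluate this dominance relation across parties using only the per-attribute orders of securely transformed values, so that no party ever sees another's raw attribute values.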
21 pages, 4639 KiB  
Article
Tangled String for Multi-Timescale Explanation of Changes in Stock Market
by Yukio Ohsawa, Teruaki Hayashi and Takaaki Yoshino
Information 2019, 10(3), 118; https://doi.org/10.3390/info10030118 - 22 Mar 2019
Cited by 3 | Viewed by 5607
Abstract
This work addresses the question of explaining changes in the stock market on the desired timescales. The tangled string is a sequence visualization tool in which a sequence is likened to a string, and trends in the sequence to tangled pills and the wires bridging the pills in the string. Here, the tangled string is extended and applied to detecting the stocks that trigger changes and to explaining trend changes in the market. Eleven years of sequential data from the First Section of the Tokyo Stock Exchange, covering the top-10 stocks by weekly increase rate, are visualized using the tangled string. The change points obtained by the tangled string were found to coincide well with changes in the average prices of listed stocks, and changes in the price of each stock are visualized on the string. Thus, changes in stock prices, which vary across a mixture of different timescales, could be explained on the timescale corresponding to the analyst's interest. The tangled string was created using a data-driven innovation platform called Innovators Marketplace on Data Jackets and is extended here to satisfy data users, so this study also verifies the contribution of the data market to data-driven innovation. Full article
(This article belongs to the Special Issue MoDAT: Designing the Market of Data)
4 pages, 179 KiB  
Editorial
eHealth and Artificial Intelligence
by Donato Impedovo and Giuseppe Pirlo
Information 2019, 10(3), 117; https://doi.org/10.3390/info10030117 - 19 Mar 2019
Cited by 5 | Viewed by 5301
Abstract
Artificial intelligence is changing the healthcare industry from many perspectives: diagnosis, treatment, and follow-up. A wide range of techniques has been proposed in the literature. In this special issue, 13 selected and peer-reviewed original research articles contribute to the application of artificial intelligence (AI) approaches in various real-world problems. Papers refer to the following main areas of interest: feature selection, high dimensionality, and statistical approaches; heart and cardiovascular diseases; expert systems and e-health platforms. Full article
(This article belongs to the Special Issue eHealth and Artificial Intelligence)
13 pages, 2280 KiB  
Article
A Self-Learning Fault Diagnosis Strategy Based on Multi-Model Fusion
by Tianzhen Wang, Jingjing Dong, Tao Xie, Demba Diallo and Mohamed Benbouzid
Information 2019, 10(3), 116; https://doi.org/10.3390/info10030116 - 17 Mar 2019
Cited by 15 | Viewed by 4273
Abstract
This paper presents an approach to detect and classify faults in complex systems when only a small amount of data history is available. The methodology is based on model fusion for fault detection and classification. Moreover, the database is enriched with additional samples when they are correctly classified. For fault detection, kernel principal component analysis (KPCA), kernel independent component analysis (KICA), and support vector domain description (SVDD) were used and combined with a fusion operator. For classification, an extreme learning machine (ELM) was used with different activation functions combined through an average fusion function. The performance of the methodology was evaluated on a set of experimental vibration data collected from a test-to-failure bearing test rig. The results show the effectiveness of the proposed approach compared to conventional methods: fault detection was achieved with a false alarm rate of 2.29% and no missed alarms, and the data were classified with an accuracy of 99.17%. Full article
(This article belongs to the Special Issue Fault Diagnosis, Maintenance and Reliability)
21 pages, 3607 KiB  
Article
An Efficient Image Reconstruction Framework Using Total Variation Regularization with Lp-Quasinorm and Group Gradient Sparsity
by Fan Lin, Yingpin Chen, Lingzhi Wang, Yuqun Chen, Wei Zhu and Fei Yu
Information 2019, 10(3), 115; https://doi.org/10.3390/info10030115 - 16 Mar 2019
Cited by 3 | Viewed by 4083
Abstract
Total variation (TV) regularization-based methods are proven to be effective in removing random noise; however, their solutions usually exhibit staircase effects. This paper proposes a new image reconstruction method based on TV regularization with an Lp-quasinorm and group gradient sparsity. In this method, the group gradient sparsity regularization term retrieves the neighborhood information of the image gradient, while the Lp-quasinorm constraint characterizes the sparsity of the image gradient. The method can effectively deblur images and remove impulse noise while preserving image edge information and reducing the staircase effect. To improve the image recovery efficiency, the Fast Fourier Transform (FFT) is introduced to avoid large matrix multiplication operations. Moreover, an accelerated alternating direction method of multipliers (ADMM), which allows a fast restart of the optimization process, makes the method run faster. In numerical experiments on standard test images sourced from Emory University and the CVG-UGR (Computer Vision Group, University of Granada) image database, the advantage of the new method is verified by comparing it with existing advanced TV-based methods in terms of peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and running time. Full article
(This article belongs to the Section Information Processes)
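The paper's reconstruction model (Lp-quasinorm, group gradient sparsity, accelerated ADMM with FFT) is considerably more elaborate, but the basic total variation idea can be sketched for a 1-D signal with plain subgradient descent; the signal, λ, step size, and iteration count below are illustrative assumptions:

```python
def total_variation(u):
    # Sum of absolute differences between neighboring samples.
    return sum(abs(u[i + 1] - u[i]) for i in range(len(u) - 1))

def tv_denoise(f, lam=0.1, step=0.05, iters=400):
    """Subgradient descent on 0.5*||u - f||^2 + lam * TV(u) for a 1-D signal f."""
    sign = lambda x: (x > 0) - (x < 0)
    u = list(f)
    for _ in range(iters):
        grad = [u[i] - f[i] for i in range(len(u))]   # fidelity term
        for i in range(len(u) - 1):                   # TV subgradient
            s = sign(u[i + 1] - u[i])
            grad[i] -= lam * s
            grad[i + 1] += lam * s
        u = [u[i] - step * grad[i] for i in range(len(u))]
    return u

noisy = [0.0, 0.2, -0.1, 1.1, 0.9, 1.0]  # noisy step edge
smooth = tv_denoise(noisy)
```

The total variation of the result drops while the large step edge is largely preserved; the staircase effect mentioned in the abstract arises precisely because this penalty favors such piecewise-constant solutions.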
14 pages, 5511 KiB  
Article
Estimating Spatiotemporal Information from Behavioral Sensing Data of Wheelchair Users by Machine Learning Technologies
by Ikuko Eguchi Yairi, Hiroki Takahashi, Takumi Watanabe, Kouya Nagamine, Yusuke Fukushima, Yutaka Matsuo and Yusuke Iwasawa
Information 2019, 10(3), 114; https://doi.org/10.3390/info10030114 - 15 Mar 2019
Cited by 5 | Viewed by 3899
Abstract
The recent expansion of intelligent gadgets, such as smartphones and smartwatches, has familiarized humans with sensing their activities. We have been developing a road accessibility evaluation system inspired by such human sensing technologies. This paper introduces our methodology for estimating road accessibility from the three-axis acceleration data obtained by a smartphone attached to a wheelchair seat. Road accessibility covers both environmental factors, e.g., curbs and gaps, which directly influence wheelchair bodies, and human factors, e.g., wheelchair users' feelings of tiredness and strain. Our goal is to realize a system that provides road accessibility visualization services to users by online/offline pattern matching using impersonal models, while gradually learning to improve service accuracy using new data provided by users. As a first step, this paper evaluates features acquired by a deep convolutional neural network (DCNN) that learns the state of the road surface from the data using supervised machine learning techniques. The evaluation shows that these features capture differences in road surface condition in more detail than our manually attached labels and are effective as a means of quantitatively expressing the road surface condition. We also developed and evaluated a prototype system that estimates the type of ground surface, focusing on knowledge extraction and visualization. Full article
(This article belongs to the Special Issue MoDAT: Designing the Market of Data)
20 pages, 1432 KiB  
Article
An Artificial Neural Network Approach to Forecast the Environmental Impact of Data Centers
by Joao Ferreira, Gustavo Callou, Albert Josua, Dietmar Tutsch and Paulo Maciel
Information 2019, 10(3), 113; https://doi.org/10.3390/info10030113 - 14 Mar 2019
Cited by 18 | Viewed by 5718
Abstract
Due to the high demands of new technologies such as social networks, e-commerce, and cloud computing, more energy is being consumed in order to store all the data produced and to provide the required high availability. Over the years, this increase in energy consumption has brought about a rise in both environmental impacts and operational costs. Some companies have adopted the concept of a green data center, which relates electricity consumption and CO2 emissions to the utility power source adopted. In Brazil, almost 70% of electrical power is derived from clean generation sources, whereas in China 65% of generated electricity comes from coal; in addition, the price per kWh in the US is much lower than in the other countries surveyed. In the present work, we conducted an integrated evaluation of the costs and CO2 emissions of the electrical infrastructure of data centers, considering the different energy sources adopted by each country. We used a multi-layered artificial neural network that forecasts consumption over the following months based on the energy consumption history of the data center. All these features are supported by a tool whose applicability was demonstrated through a case study that computed the CO2 emissions and operational costs of a data center under the energy mix adopted in Brazil, China, Germany, and the US. China presented the highest CO2 emissions, with 41,445 tons per year in 2014, followed by the US and Germany, with 37,177 and 35,883 tons, respectively; Brazil, with 8459 tons, proved to be the cleanest. Additionally, this study estimated the operational costs assuming that the same data center consumes energy as if it were located in China, Germany, or Brazil. Considering the price of energy per kWh, the best choice in terms of operational costs is the US and the worst is China; considering both operational costs and CO2 emissions, Brazil would be the best option. Full article
(This article belongs to the Special Issue Fault Diagnosis, Maintenance and Reliability)
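The paper forecasts consumption with a multi-layered artificial neural network; as a much simpler stand-in that shows the same forecast-from-history loop, a first-order autoregressive fit can be sketched as follows (the consumption figures are invented for illustration):

```python
def fit_ar1(series):
    """Least-squares fit of x[t+1] ~ a * x[t] + b."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def forecast(series, months):
    """Roll the fitted model forward to predict the next `months` values."""
    a, b = fit_ar1(series)
    out, x = [], series[-1]
    for _ in range(months):
        x = a * x + b
        out.append(x)
    return out

kwh = [100.0, 102.0, 104.0, 106.0, 108.0]  # hypothetical monthly consumption
print(forecast(kwh, 2))  # -> [110.0, 112.0]
```

A neural network replaces the linear map `a * x + b` with a learned nonlinear function of several past months, but the rolling-forecast loop is the same.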
20 pages, 4178 KiB  
Article
Towards an Efficient Data Fragmentation, Allocation, and Clustering Approach in a Distributed Environment
by Hassan Abdalla and Abdel Monim Artoli
Information 2019, 10(3), 112; https://doi.org/10.3390/info10030112 - 12 Mar 2019
Cited by 15 | Viewed by 4447
Abstract
Data fragmentation and allocation have long proven to be efficient techniques for improving the performance of distributed database systems (DDBSs). A crucial feature of any successful DDBS design is an intrinsic emphasis on minimizing transmission costs (TC). This work, therefore, focuses on improving distribution performance through transmission cost minimization. To this end, data fragmentation and allocation techniques are utilized, and several data replication scenarios are investigated. Moreover, site clustering is leveraged with the aim of producing the minimum possible number of highly balanced clusters. In doing so, TC is shown to be immensely reduced, as depicted in the performance evaluation. DDBS performance is measured using a TC objective function. An inclusive evaluation was made in a simulated environment, and the compared results demonstrate the superiority and efficacy of the proposed approach in reducing TC. Full article
13 pages, 1405 KiB  
Article
Content-Aware Retargeted Image Quality Assessment
by Tingting Zhang, Ming Yu, Yingchun Guo and Yi Liu
Information 2019, 10(3), 111; https://doi.org/10.3390/info10030111 - 12 Mar 2019
Cited by 5 | Viewed by 3691
Abstract
To address the low correlation between existing image retargeting quality assessment methods and subjective judgments, a content-aware retargeted image quality assessment algorithm based on the structural similarity index is proposed. Specifically, a local structural similarity algorithm that can measure the similarity between differently sized versions of the same image is introduced. The Speeded-Up Robust Features (SURF) algorithm is used to extract the local structural similarity and the degree of image content loss. The salient area ratio is calculated by extracting the saliency region, and the retargeted image quality assessment function is obtained by linear fusion. On the CUHK image database and the MIT RetargetMe database, experiments comparing the proposed algorithm with four representative assessment algorithms and four other recent retargeted image quality assessment algorithms show that it correlates more highly with Mean Opinion Score (MOS) values and corresponds with the results of human subjective assessment. Full article
(This article belongs to the Section Information Processes)
16 pages, 289 KiB  
Article
Machine Learning Models for Error Detection in Metagenomics and Polyploid Sequencing Data
by Milko Krachunov, Maria Nisheva and Dimitar Vassilev
Information 2019, 10(3), 110; https://doi.org/10.3390/info10030110 - 11 Mar 2019
Cited by 5 | Viewed by 3757
Abstract
Metagenomics studies, as well as genomics studies of polyploid species such as wheat, deal with the analysis of high-variation data. Such data contain sequences from similar but distinct genetic chains, which presents an obstacle to analysis and research. In particular, the detection of instrumentation errors during the digitalization of the sequences may be hindered, as these errors can be indistinguishable from the real biological variation inside the digital data. This can prevent the determination of the correct sequences, while at the same time making variant studies significantly more difficult. This paper details a collection of machine learning (ML)-based models used to distinguish a real variant from an erroneous one. The focus is on using these models directly, but experiments are also done in combination with other predictors that isolate a pool of error candidates. Full article
(This article belongs to the Special Issue Artificial Intelligence—Methodology, Systems, and Applications)
13 pages, 253 KiB  
Article
An Experimental Comparison of Feature-Selection and Classification Methods for Microarray Datasets
by Nicole Dalia Cilia, Claudio De Stefano, Francesco Fontanella, Stefano Raimondo and Alessandra Scotto di Freca
Information 2019, 10(3), 109; https://doi.org/10.3390/info10030109 - 10 Mar 2019
Cited by 36 | Viewed by 4753
Abstract
In the last decade, there has been a growing scientific interest in the analysis of DNA microarray datasets, which have been widely used in basic and translational cancer research. The application fields include both the identification of oncological subjects, separating them from the healthy ones, and the classification of different types of cancer. Since DNA microarray experiments typically generate a very large number of features for a limited number of patients, the classification task is very complex and typically requires the application of a feature-selection process to reduce the complexity of the feature space and to identify a subset of distinctive features. In this framework, there are no standard state-of-the-art results generally accepted by the scientific community and, therefore, it is difficult to decide which approach to use for obtaining satisfactory results in the general case. Based on these considerations, the aim of the present work is to provide a large experimental comparison for evaluating the effect of the feature-selection process applied to different classification schemes. For comparison purposes, we considered both ranking-based feature-selection techniques and state-of-the-art feature-selection methods. The experiments provide a broad overview of the results obtainable on standard microarray datasets with different characteristics in terms of both the number of features and the number of patients. Full article
(This article belongs to the Special Issue eHealth and Artificial Intelligence)
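As an illustration of the ranking-based feature selection covered by the comparison above, the simplest variant scores each feature independently and keeps the top-k. The variance criterion and toy data here are illustrative assumptions; real microarray pipelines typically use class-aware scores such as ANOVA F-values or chi-squared statistics:

```python
def variance(xs):
    # Population variance of a list of numbers.
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def rank_features(samples, k):
    """samples: list of feature vectors (rows = patients, columns = genes).
    Return the indices of the top-k features ranked by variance."""
    n_features = len(samples[0])
    scores = [variance([row[j] for row in samples]) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k]

# Toy data: feature 1 is constant and carries no information.
data = [[1.0, 5.0, 0.1], [2.0, 5.0, 0.1], [9.0, 5.0, 0.2]]
print(rank_features(data, 2))  # -> [0, 2]
```

The classifier is then trained only on the selected columns, which is what reduces the feature space from thousands of genes to a tractable subset.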
34 pages, 1508 KiB  
Review
A Review on Energy Consumption Optimization Techniques in IoT Based Smart Building Environments
by Abdul Salam Shah, Haidawati Nasir, Muhammad Fayaz, Adidah Lajis and Asadullah Shah
Information 2019, 10(3), 108; https://doi.org/10.3390/info10030108 - 8 Mar 2019
Cited by 118 | Viewed by 13313
Abstract
In recent years, due to the unnecessary wastage of electrical energy in residential buildings, the requirement of energy optimization and user comfort has gained vital importance. In the literature, various techniques have been proposed addressing the energy optimization problem. The goal of each technique is to maintain a balance between user comfort and energy requirements, such that the user can achieve the desired comfort level with the minimum amount of energy consumption. Researchers have addressed the issue with the help of different optimization algorithms and variations in the parameters to reduce energy consumption. To the best of our knowledge, this problem is not solved yet due to its challenging nature. The gaps in the literature are due to advancements in technology, the drawbacks of optimization algorithms, and the introduction of new optimization algorithms. Further, many newly proposed optimization algorithms have produced better accuracy on the benchmark instances but have not been applied yet for the optimization of energy consumption in smart homes. In this paper, we have carried out a detailed literature review of the techniques used for the optimization of energy consumption and scheduling in smart homes. Detailed discussion has been carried out on different factors contributing towards thermal comfort, visual comfort, and air quality comfort. We have also reviewed the fog and edge computing techniques used in smart homes. Full article
(This article belongs to the Special Issue Smart Energy Grid Engineering)
12 pages, 2296 KiB  
Article
Matrix-Based Method for Inferring Elements in Data Attributes Using a Vector Space Model
by Teruaki Hayashi and Yukio Ohsawa
Information 2019, 10(3), 107; https://doi.org/10.3390/info10030107 - 8 Mar 2019
Cited by 3 | Viewed by 3714
Abstract
This article addresses the task of inferring elements of the attributes of data. Extracting data related to our interests is a challenging task. Although data on the web can be accessed through free-text queries, it is difficult to obtain results that accurately correspond to user intentions, because users might not express their objects of interest using the exact terms (variables, outlines of data, etc.) found in the data. In other words, users do not always have sufficient knowledge of the data to formulate an effective query. Hence, we propose a method that infers the type, format, and variable elements of the attributes of data when a natural-language summary of the data is provided as a free-text query. To evaluate the proposed method, we used the Data Jackets datasets, whose metadata are written in natural language. The experimental results indicate that our method outperforms string matching and word embedding. Applications based on this study can support users who wish to retrieve or acquire new data. Full article
(This article belongs to the Special Issue MoDAT: Designing the Market of Data)
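The paper's matrix-based method is evaluated against string matching and word embedding baselines; the underlying vector-space intuition of matching a free-text query against natural-language metadata can be sketched with bag-of-words cosine similarity (the example summaries and query are invented):

```python
import math
from collections import Counter

def vectorize(text):
    # Bag-of-words term-frequency vector.
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity of two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(c * c for c in u.values())) \
         * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

summaries = {
    "weather": "daily temperature and rainfall observations by city",
    "traffic": "hourly vehicle counts on urban road sections",
}
query = vectorize("rainfall and temperature data by city")
best = max(summaries, key=lambda name: cosine(query, vectorize(summaries[name])))
print(best)  # -> weather
```

The paper goes further by inferring the *attribute elements* (type, format, variables) of the matched data rather than just ranking whole summaries.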
17 pages, 8869 KiB  
Article
SDN-Based Intrusion Detection System for Early Detection and Mitigation of DDoS Attacks
by Pedro Manso, José Moura and Carlos Serrão
Information 2019, 10(3), 106; https://doi.org/10.3390/info10030106 - 8 Mar 2019
Cited by 103 | Viewed by 12102
Abstract
The current paper addresses relevant network security vulnerabilities introduced by network devices within the emerging paradigm of the Internet of Things (IoT), as well as the urgent need to mitigate the negative effects of some types of Distributed Denial of Service (DDoS) attacks that try to exploit those security weaknesses. We design and implement a Software-Defined Intrusion Detection System (IDS) that reactively impairs the attacks at their origin, ensuring the "normal operation" of the network infrastructure. Our proposal includes an IDS that automatically detects several DDoS attacks and, when an attack is detected, notifies a Software-Defined Networking (SDN) controller. The proposal also pushes convenient traffic-forwarding decisions from the SDN controller down to the network devices. The evaluation results suggest that our proposal detects several types of DDoS-based cyber-attacks in a timely manner, mitigates their negative impact on network performance, and ensures the correct delivery of normal traffic. Our work sheds light on the relevance of programming over an abstracted view of the network infrastructure to detect a botnet exploitation in time, mitigate malicious traffic at its source, and protect benign traffic. Full article
(This article belongs to the Special Issue Insider Attacks)
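The paper's IDS feeds detections into an SDN controller, which then installs blocking rules near the source. As a hedged illustration of the volumetric detection idea only (the IP addresses, window length, and threshold are assumptions, not the paper's configuration), a per-source rate check can be sketched as:

```python
from collections import Counter

def detect_ddos(packets, window, threshold):
    """packets: list of (timestamp, src_ip) tuples.
    Flag any source whose packet count within some `window`-second
    span (anchored at each observed packet) exceeds `threshold`."""
    flagged = set()
    for t0, _ in packets:
        counts = Counter(src for t, src in packets if t0 <= t < t0 + window)
        flagged.update(src for src, n in counts.items() if n > threshold)
    return flagged

packets = ([(0.1 * i, "10.0.0.9") for i in range(50)]   # burst from one bot
           + [(1.0, "10.0.0.2"), (3.0, "10.0.0.2")])    # normal client
print(detect_ddos(packets, window=5.0, threshold=10))   # -> {'10.0.0.9'}
```

In the SDN setting, the flagged source would be translated into a flow rule (e.g., drop or rate-limit) that the controller pushes to the switch closest to the attacker, which is what "impairing the attack at its origin" refers to.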
22 pages, 4073 KiB  
Article
Hybrid LSTM Neural Network for Short-Term Traffic Flow Prediction
by Yuelei Xiao and Yang Yin
Information 2019, 10(3), 105; https://doi.org/10.3390/info10030105 - 8 Mar 2019
Cited by 48 | Viewed by 6703
Abstract
Existing short-term traffic flow prediction models fail to provide precise predictions and to consider the impact of different traffic conditions on the prediction results in an actual traffic network. To solve these problems, a hybrid Long Short-Term Memory (LSTM) neural network is proposed, based on the LSTM model. The structure and parameters of the hybrid LSTM neural network are then optimized experimentally for different traffic conditions, and the final model is compared with other typical models. The prediction error of the hybrid LSTM model is found to be clearly lower than those of the other models, while its running time is only slightly longer than that of the plain LSTM model. Based on the hybrid LSTM model, the vehicle flows of each road section and intersection in the actual traffic network are further predicted. The results show that the maximum relative error between the actual and predicted vehicle flows is 1.03% for road sections and 1.18% for road intersections. Hence, the hybrid LSTM model comes closer to the accuracy and real-time requirements of short-term traffic flow prediction and is suitable for the different traffic conditions of an actual traffic network. Full article
(This article belongs to the Section Artificial Intelligence)
17 pages, 7305 KiB  
Article
Discrete Wavelet Packet Transform-Based Industrial Digital Wireless Communication Systems
by Safa Saadaoui, Mohamed Tabaa, Fabrice Monteiro, Mouhamad Chehaitly and Abbas Dandache
Information 2019, 10(3), 104; https://doi.org/10.3390/info10030104 - 7 Mar 2019
Cited by 13 | Viewed by 4587
Abstract
The industrial Internet of Things (IIoT), known as Industry 4.0, is the use of Internet of Things technologies, via Wireless Sensor Networks (WSNs), to enhance manufacturing and industrial processes. It incorporates machine learning and big data technologies to extend the machine-to-machine communication that has existed for years in the industrial world. It is therefore necessary to propose a robust and functional communication architecture based on WSNs inside factories in order to demonstrate the strong interest in the connectivity of things in the industrial environment. In such an environment, propagation differs from that in other conventional indoor media because of the large dimensions and the nature of the objects and obstacles inside. Thus, the industrial medium is modeled as a fading channel affected by impulsive and Gaussian noise. The objective of this paper is to improve the robustness and performance of a multi-user WSN architecture based on the Discrete Wavelet Transform in an industrial environment, using conventional channel coding and an optimal thresholding receiver. Full article
(This article belongs to the Section Information and Communications Technology)
21 pages, 6398 KiB  
Article
A Hybrid Algorithm for Forecasting Financial Time Series Data Based on DBSCAN and SVR
by Mengxing Huang, Qili Bao, Yu Zhang and Wenlong Feng
Information 2019, 10(3), 103; https://doi.org/10.3390/info10030103 - 7 Mar 2019
Cited by 16 | Viewed by 4687
Abstract
Financial prediction is an important research field in financial time series data mining. Clustering massive financial time series data has long been a problem: conventional clustering algorithms are impractical for time series data because they are essentially designed for static data, which results in poor clustering accuracy in several financial forecasting models. In this paper, a new hybrid algorithm is proposed based on Optimization of Initial Points and Variable-Parameter Density-Based Spatial Clustering of Applications with Noise (OVDBCSAN) and support vector regression (SVR). Through optimization of the initial points, appropriate values of ε and MinPts, the global parameters of DBSCAN, are selected for clustering datasets of different densities. The algorithm can find a large number of similar classes and then establish regression prediction models. It was tested extensively using real-world time series datasets from Ping An Bank, the Shanghai Stock Exchange, and the Shenzhen Stock Exchange to evaluate accuracy. The evaluation showed that our approach has major potential for clustering massive financial time series data, thereby improving the accuracy of the prediction of stock prices and financial indexes. Full article
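The two-stage idea described in this abstract, density-based clustering followed by a per-cluster regression model, can be sketched with off-the-shelf components. The sketch below uses plain DBSCAN plus SVR from scikit-learn, not the authors' OVDBCSAN variant, and the synthetic data and parameter values (`eps`, `min_samples`, `C`) are illustrative assumptions only.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic feature windows of a financial series, forming two density regimes.
low = rng.normal(0.0, 0.05, size=(40, 3))
high = rng.normal(1.0, 0.05, size=(40, 3))
X = np.vstack([low, high])
y = X.sum(axis=1) + rng.normal(0.0, 0.01, size=len(X))  # regression target

# Stage 1: density-based clustering (parameter values are illustrative).
labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)

# Stage 2: fit one SVR model per discovered cluster; noise (label -1) is skipped.
models = {}
for k in set(labels) - {-1}:
    mask = labels == k
    models[k] = SVR(kernel="rbf", C=10.0).fit(X[mask], y[mask])

print(sorted(models))  # the cluster ids that received a regressor
```

At prediction time, a new window would first be assigned to a cluster and then passed to that cluster's regressor.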
17 pages, 341 KiB  
Article
Related Stocks Selection with Data Collaboration Using Text Mining
by Masanori Hirano, Hiroki Sakaji, Shoko Kimura, Kiyoshi Izumi, Hiroyasu Matsushima, Shintaro Nagao and Atsuo Kato
Information 2019, 10(3), 102; https://doi.org/10.3390/info10030102 - 7 Mar 2019
Cited by 6 | Viewed by 4940
Abstract
We propose an extended scheme for selecting related stocks for themed mutual funds. The scheme was designed to support fund managers who are building themed mutual funds; in our preliminary experiments, building a themed mutual fund was found to be quite difficult. Our scheme is a natural language processing method based on words extracted according to their similarity to a theme using word2vec, together with our own similarity measure based on co-occurrence in company information. As company information data, we used investor relations material and official websites. We also conducted several other experiments within our scheme, including hyperparameter tuning. The scheme achieved a 172% higher F1 score and 21% higher accuracy than a standard method. Contrary to our preliminary experiments on data collaboration, our results also suggest that official websites may not be necessary for our scheme. Full article
(This article belongs to the Special Issue MoDAT: Designing the Market of Data)
8 pages, 167 KiB  
Article
“Indirect” Information: The Debate on Testimony in Social Epistemology and Its Role in the Game of “Giving and Asking for Reasons”
by Raffaela Giovagnoli
Information 2019, 10(3), 101; https://doi.org/10.3390/info10030101 - 7 Mar 2019
Cited by 2 | Viewed by 3475
Abstract
We will sketch the debate on testimony in social epistemology by reference to the contemporary debate on reductionism/anti-reductionism, communitarian epistemology, and inferentialism. Testimony is a fundamental source of the knowledge we share, and it is worth considering within a dialogical perspective, which requires the description of a formal structure entailing deontic statuses and deontic attitudes. In particular, we will argue for a social reformulation of the “space of reasons”, which establishes a fruitful relationship with the epistemological view of Wilfrid Sellars. Full article
18 pages, 2038 KiB  
Article
A Heuristic Elastic Particle Swarm Optimization Algorithm for Robot Path Planning
by Haiyan Wang and Zhiyu Zhou
Information 2019, 10(3), 99; https://doi.org/10.3390/info10030099 - 6 Mar 2019
Cited by 17 | Viewed by 4901
Abstract
Path planning, as the core of navigation control for mobile robots, has become a focus of research in the field of mobile robots. Various path planning algorithms have recently been proposed. In this paper, in view of the advantages and disadvantages of different path planning algorithms, a heuristic elastic particle swarm algorithm is proposed. Using the path planned by the A* algorithm in a large-scale grid for global guidance, the elastic particle swarm optimization algorithm uses a shrinking operation to determine the globally optimal path formed by locally optimal nodes, so that the particles can converge to it rapidly. Furthermore, the diversity of the particles in the iterative process is ensured by a rebound operation. Computer simulation and real experimental results show that the proposed algorithm not only overcomes the shortcoming of the A* algorithm, which cannot yield the shortest path, but also avoids the failure to converge to the globally optimal path owing to a lack of heuristic information. Additionally, the proposed algorithm maintains the simplicity and high efficiency of both algorithms. Full article
20 pages, 942 KiB  
Article
Detecting Emotions in English and Arabic Tweets
by Tariq Ahmad, Allan Ramsay and Hanady Ahmed
Information 2019, 10(3), 98; https://doi.org/10.3390/info10030098 - 6 Mar 2019
Cited by 8 | Viewed by 4533
Abstract
Assigning sentiment labels to documents is, at first sight, a standard multi-label classification task. Many approaches have been used for this task, and the current state-of-the-art solutions use deep neural networks (DNNs), so it seems likely that such standard machine learning algorithms will provide an effective approach. We describe an alternative approach that uses probabilities to construct a weighted lexicon of sentiment terms, then modifies the lexicon and calculates optimal thresholds for each class. We show that this approach outperforms DNNs and other standard algorithms. We believe that DNNs are not a panacea, and that paying attention to the nature of the data you are trying to learn from can be more important than trying ever more powerful general-purpose machine learning algorithms. Full article
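The mechanism described here, a weighted lexicon scored per class with per-class decision thresholds, can be sketched in a few lines. The toy lexicon, weights, and thresholds below are invented for illustration and are not the values learned in the paper.

```python
# Toy weighted sentiment lexicon: word -> {emotion class: weight}.
LEXICON = {
    "happy":   {"joy": 0.9},
    "great":   {"joy": 0.6},
    "furious": {"anger": 0.9},
    "hate":    {"anger": 0.7, "disgust": 0.4},
}

# Per-class decision thresholds (in the paper these are optimized per class).
THRESHOLDS = {"joy": 0.5, "anger": 0.5, "disgust": 0.5}

def emotions(tweet):
    """Sum lexicon weights per class and keep classes above their threshold."""
    scores = {c: 0.0 for c in THRESHOLDS}
    for word in tweet.lower().split():
        for cls, w in LEXICON.get(word, {}).items():
            scores[cls] += w
    return {c for c, s in scores.items() if s >= THRESHOLDS[c]}

print(emotions("so happy and great today"))  # {'joy'}
print(emotions("i hate this"))               # {'anger'}
```

Because each class has its own threshold, a weak cue ("hate" contributes 0.4 to disgust) can fall below one class's threshold while the same word pushes another class over its own.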
(This article belongs to the Special Issue Artificial Intelligence—Methodology, Systems, and Applications)
23 pages, 701 KiB  
Article
Glomerular Filtration Rate Estimation by a Novel Numerical Binning-Less Isotonic Statistical Bivariate Numerical Modeling Method
by Sebastian Nicolas Giles and Simone Fiori
Information 2019, 10(3), 100; https://doi.org/10.3390/info10030100 - 6 Mar 2019
Cited by 5 | Viewed by 4666
Abstract
Statistical bivariate numerical modeling is a method to infer an empirical relationship between unpaired sets of data based on the matching of statistical distributions. In the present paper, a novel, efficient numerical algorithm is proposed to perform bivariate numerical modeling. The algorithm is then applied to correlate glomerular filtration rate with serum creatinine concentration. Glomerular filtration rate is adopted in clinical nephrology as an indicator of kidney function and is relevant for assessing the progression of renal disease. As direct measurement of glomerular filtration rate is highly impractical, there is considerable interest in developing numerical algorithms to estimate it from parameters that are easier to obtain, such as demographic and ‘bedside’ assay data. Full article
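The idea of inferring a relationship from unpaired samples by distribution matching can be illustrated with plain quantile alignment: sort both samples and pair equal quantiles, which yields a monotone empirical mapping. This is a generic sketch of the principle, not the paper's binning-less isotonic algorithm, and the data below are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)

# Unpaired samples: e.g., creatinine values and GFR values from different
# cohorts (synthetic stand-ins here; units and distributions are invented).
x = rng.lognormal(mean=0.0, sigma=0.4, size=500)          # "creatinine"
y = 100.0 / rng.lognormal(mean=0.0, sigma=0.4, size=500)  # "GFR"

# Quantile matching: the q-th quantile of x maps to the q-th quantile of y.
# Since GFR falls as creatinine rises, match ascending x to descending y.
xs = np.sort(x)
ys = np.sort(y)[::-1]

def estimate(x_new):
    """Monotone (decreasing) empirical mapping via linear interpolation."""
    return np.interp(x_new, xs, ys)

# The fitted mapping is monotonically non-increasing in x.
grid = np.linspace(xs[0], xs[-1], 50)
vals = estimate(grid)
print(bool(np.all(np.diff(vals) <= 1e-9)))  # True
```

The monotonicity comes for free from the construction, which is the same property an isotonic fit enforces.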
(This article belongs to the Special Issue eHealth and Artificial Intelligence)
14 pages, 412 KiB  
Article
Word Sense Disambiguation Studio: A Flexible System for WSD Feature Extraction
by Gennady Agre, Daniel Petrov and Simona Keskinova
Information 2019, 10(3), 97; https://doi.org/10.3390/info10030097 - 5 Mar 2019
Cited by 1 | Viewed by 6962
Abstract
The paper presents a flexible system for extracting features and creating training and test examples for solving the all-words word sense disambiguation (WSD) task. The system allows integrating word and sense embeddings as part of an example description. It possesses two unique features distinguishing it from all similar WSD systems: the ability to construct a special compressed representation for word embeddings and the ability to construct training and test sets of examples with different data granularity. The first feature allows the generation of data sets of quite small dimensionality, which can be used for training highly accurate classifiers of different types. The second feature allows generating sets of examples that can be used for training classifiers specialized in disambiguating a concrete word, words belonging to the same part-of-speech (POS) category, or all open-class words. Intensive experimentation has shown that classifiers trained on examples created by the system outperform the standard baselines for measuring the behaviour of all-words WSD classifiers. Full article
(This article belongs to the Special Issue Artificial Intelligence—Methodology, Systems, and Applications)
26 pages, 5061 KiB  
Article
Hierarchical Clustering Approach for Selecting Representative Skylines
by Lkhagvadorj Battulga and Aziz Nasridinov
Information 2019, 10(3), 96; https://doi.org/10.3390/info10030096 - 5 Mar 2019
Viewed by 3741
Abstract
Recently, the skyline query has attracted interest in a wide range of applications, from recommendation systems to computer networks. The skyline query is useful for obtaining the dominant data points of a given dataset. On a low-dimensional dataset, the skyline query may return a small number of skyline points. However, as the dimensionality of the dataset increases, the number of skyline points also increases; in other words, depending on the data distribution and dimensionality, most of the data points may become skyline points. With the emergence of big data applications, where data distribution and dimensionality are a significant problem, obtaining representative skyline points among the resulting skyline points is necessary. Several methods have focused on extracting representative skyline points, with varying success. However, existing methods must re-compute the representatives whenever the global threshold changes. Moreover, in certain cases, the resulting representative skyline points may not satisfy a user with multiple preferences. Thus, in this paper, we propose a new representative skyline query processing method, called representative skyline cluster (RSC), which solves the problems of the existing methods. Our method utilizes hierarchical agglomerative clustering to find the exact representative skyline points, which enables us to reduce the re-computation time significantly. We show the superiority of our proposed method over the existing state-of-the-art methods with various types of experiments. Full article
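The dominance test underlying any skyline query is compact enough to sketch (here, smaller is better in every dimension); the hierarchical-clustering step that RSC adds on top is omitted, and the sample points are invented.

```python
def dominates(a, b):
    """a dominates b if a is <= b in every dimension and < in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(points):
    """Naive O(n^2) skyline: the points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Example: (price, distance) pairs where lower is better on both axes.
hotels = [(50, 8), (60, 5), (70, 3), (80, 4), (55, 9)]
print(skyline(hotels))  # [(50, 8), (60, 5), (70, 3)]
```

This naive scan also makes the dimensionality problem mentioned in the abstract visible: as dimensions grow, fewer points are dominated, so the returned set grows.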
21 pages, 5362 KiB  
Article
ByNowLife: A Novel Framework for OWL and Bayesian Network Integration
by Foni A. Setiawan, Eko K. Budiardjo and Wahyu C. Wibowo
Information 2019, 10(3), 95; https://doi.org/10.3390/info10030095 - 5 Mar 2019
Cited by 10 | Viewed by 3887
Abstract
An ontology-based system can currently logically reason through the Web Ontology Language Description Logic (OWL DL). To perform probabilistic reasoning, the system must use a separate knowledge base, separate processing, or third-party applications. Previous studies mainly focus on how to represent probabilistic information in ontologies and perform reasoning through them. These approaches are not suitable for systems that already have running ontologies and Bayesian network (BN) knowledge bases because users must rewrite the probabilistic information contained in a BN into an ontology. We present a framework called ByNowLife, which is a novel approach for integrating BN with OWL by providing an interface for retrieving probabilistic information through SPARQL queries. ByNowLife catalyzes the integration process by transforming logical information contained in an ontology into a BN and probabilistic information contained in a BN into an ontology. This produces a system with a complete knowledge base. Using ByNowLife, a system that already has separate ontologies and BN knowledge bases can integrate them into a single knowledge base and perform both logical and probabilistic reasoning through it. The integration not only facilitates the unity of reasoning but also has several other advantages, such as ontology enrichment and BN structural adjustment through structural and parameter learning. Full article
19 pages, 368 KiB  
Article
A Multilingual and Multidomain Study on Dialog Act Recognition Using Character-Level Tokenization
by Eugénio Ribeiro, Ricardo Ribeiro and David Martins de Matos
Information 2019, 10(3), 94; https://doi.org/10.3390/info10030094 - 3 Mar 2019
Cited by 11 | Viewed by 4071
Abstract
Automatic dialog act recognition is an important step for dialog systems since it reveals the intention behind the words uttered by its conversational partners. Although most approaches on the task use word-level tokenization, there is information at the sub-word level that is related to the function of the words and, consequently, their intention. Thus, in this study, we explored the use of character-level tokenization to capture that information. We explored the use of multiple character windows of different sizes to capture morphological aspects, such as affixes and lemmas, as well as inter-word information. Furthermore, we assessed the importance of punctuation and capitalization for the task. To broaden the conclusions of our study, we performed experiments on dialogs in three languages—English, Spanish, and German—which have different morphological characteristics. Furthermore, the dialogs cover multiple domains and are annotated with both domain-dependent and domain-independent dialog act labels. The achieved results not only show that the character-level approach leads to similar or better performance than the state-of-the-art word-level approaches on the task, but also that both approaches are able to capture complementary information. Thus, the best results are achieved by combining tokenization at both levels. Full article
(This article belongs to the Special Issue Artificial Intelligence—Methodology, Systems, and Applications)
13 pages, 409 KiB  
Article
Application of Machine Learning Models for Survival Prognosis in Breast Cancer Studies
by Iliyan Mihaylov, Maria Nisheva and Dimitar Vassilev
Information 2019, 10(3), 93; https://doi.org/10.3390/info10030093 - 3 Mar 2019
Cited by 33 | Viewed by 7520
Abstract
The application of machine learning models for prediction and prognosis of disease development has become an irrevocable part of cancer studies aimed at improving the subsequent therapy and management of patients. The main objective of the presented study is the application of machine learning models for accurate prediction of survival time in breast cancer on the basis of clinical data. The paper discusses an approach in which the main factor used to predict survival time is an originally developed tumor-integrated clinical feature, which combines tumor stage, tumor size, and age at diagnosis. Two datasets from corresponding breast cancer studies are united by applying a data integration approach based on horizontal and vertical integration, using proper document-oriented and graph databases, which show good performance and no data losses. Aside from data normalization and classification, the applied machine learning methods provide promising results in terms of accuracy of survival time prediction. The analysis of our experiments shows an advantage for linear Support Vector Regression, Lasso regression, Kernel Ridge regression, K-neighborhood regression, and Decision Tree regression: these models achieve the most accurate survival prognosis results. Cross-validation for accuracy demonstrates the best performance of the same models on the studied breast cancer data. In support of the proposed approach, a Python-based workflow has been developed, and plans for its further improvement are discussed at the end of the paper. Full article
(This article belongs to the Special Issue Artificial Intelligence—Methodology, Systems, and Applications)
25 pages, 6705 KiB  
Article
Analysis of SAP Log Data Based on Network Community Decomposition
by Martin Kopka and Miloš Kudělka
Information 2019, 10(3), 92; https://doi.org/10.3390/info10030092 - 1 Mar 2019
Cited by 3 | Viewed by 4577
Abstract
Information systems support and ensure the practical running of the most critical business processes. A record (log) of the process running in the information system exists or can be reconstructed. Computer methods of data mining can be used to analyze process data, utilizing machine learning techniques and complex network analysis. Such analysis is usually based on quantitative parameters of the running process; it is less common to analyze the behavior of the participants in the process from the process log. Here, we show how data and process mining methods can be used to analyze the running process, and how participant behavior can be analyzed from the process log using network (community or cluster) analyses on a complex network constructed from the SAP business process log. This approach constructs a complex network from the process log in a given context and then finds communities or patterns in this network. The communities or patterns found are analyzed using knowledge of the business process and the environment in which the process operates. The results demonstrate the possibility of uncovering not only quantitative but also qualitative relations (e.g., hidden behavior of participants) using the process log and specific knowledge of the business case. Full article
(This article belongs to the Special Issue Computational Social Science)
13 pages, 4872 KiB  
Article
An Implementation of Parallel Buck Converters for Common Load Sharing in DC Microgrid
by Sikander Ali, Tang Shengxue, Zhang Jianyu, Ahmad Ali and Arshad Nawaz
Information 2019, 10(3), 91; https://doi.org/10.3390/info10030091 - 1 Mar 2019
Cited by 9 | Viewed by 4386
Abstract
The increasing demand for clean, safe, and environmentally friendly renewable energy sources faces several challenges, such as system design and reliable operation. The DC microgrid (MG) is a promising system due to its higher efficiency and natural interface to renewable sources. In the hierarchical control of a DC microgrid, V-I droop control is usually deployed at the primary control level for common load sharing between converters. However, conventional droop control causes improper current sharing, voltage variations, and poor circulating current regulation due to the presence of droop and line resistance between converters. The aim of this paper is to present the primary-level control design of buck converters in current mode control according to the concepts of time constant and time delay, and a secondary control design for parallel operation in a distributed manner that combines low bandwidth communication (LBC), circulating current minimization techniques, and average voltage/current control. Moreover, different time delays are used for the two converters to verify the effects of communication delays on current sharing and voltage restoration. The simulation of 2 × 2.5 kW DC parallel buck converters is carried out in the PLECS environment (a software tool for high-speed simulation of power electronics) and shows excellent results in minimizing circulating currents, enhancing proportional current sharing, and restoring the grid voltage. Full article
11 pages, 1037 KiB  
Article
A Fast Optimization Algorithm of FEM/BEM Simulation for Periodic Surface Acoustic Wave Structures
by Honglang Li, Zixiao Lu, Yabing Ke, Yahui Tian and Wei Luo
Information 2019, 10(3), 90; https://doi.org/10.3390/info10030090 - 28 Feb 2019
Cited by 5 | Viewed by 3835
Abstract
The accurate analysis of periodic surface acoustic wave (SAW) structures by the combined finite element method and boundary element method (FEM/BEM) is important for SAW design, especially for the extraction of coupling-of-modes (COM) parameters. However, its time cost is very large. To accelerate SAW FEM/BEM analysis, some optimization algorithms for the FEM and BEM calculations have been reported, while optimization of the solution of the final FEM/BEM equations, which also involves a large amount of computation, has hardly been reported. In this paper, it was observed that the coefficient matrix of the final FEM/BEM equations for periodic SAW structures is similar to a Toeplitz matrix. A fast algorithm based on the Trench recursive algorithm for Toeplitz matrix inversion is proposed to speed up the solution of the final FEM/BEM equations. The results show that both the time and memory costs of FEM/BEM are further reduced. Full article
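The speed-up rests on the structure of the system: a Toeplitz matrix is fully determined by its first row and column, so specialized solvers (Levinson/Trench-type recursions, as used in the paper) avoid the cost of a generic dense solve. A minimal illustration using SciPy's structured Toeplitz solver, on a small made-up system rather than actual FEM/BEM matrices:

```python
import numpy as np
from scipy.linalg import toeplitz, solve_toeplitz

# A symmetric, diagonally dominant Toeplitz matrix defined by one column.
c = np.array([4.0, 1.0, 0.5, 0.25])   # first column (= first row here)
b = np.array([1.0, 2.0, 3.0, 4.0])

# Structured solve: only the defining column/row is passed, not the matrix.
x_fast = solve_toeplitz((c, c), b)

# Reference: build the full matrix and use a generic dense solver.
T = toeplitz(c)
x_ref = np.linalg.solve(T, b)

print(np.allclose(x_fast, x_ref))  # True
```

The structured solver never materializes the full n-by-n matrix, which is also where the memory saving reported in the abstract comes from.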