Data, Volume 4, Issue 1 (March 2019) – 46 articles

Cover Story: European cities and communities (and beyond) require a structured overview and a set of tools to achieve a sustainable transformation towards smarter cities and municipalities, thereby leveraging the enormous potential of the emerging data-driven economy. The authors propose the concept of an Urban Data Space (UDS), which facilitates an ecosystem for data exchange and added-value creation by utilizing the various types of data within a smart city or municipality. The Urban Data Space is described and analyzed in detail, relevant stakeholders are identified, and corresponding technical artifacts are introduced. The authors propose to set up Urban Data Spaces based on emerging standards from the area of ICT reference architectures for Smart Cities.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive tables of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link, and use the free Adobe Reader to open them.
6 pages, 1238 KiB  
Data Descriptor
Removal of Positive Elevation Bias of Digital Elevation Models for Sea-Level Rise Planning
by Elizabeth Burke Watson, LeeAnn Haaf, Kirk Raper and Erin Reilly
Cited by 3 | Viewed by 2928
Abstract
Digital elevation models (DEMs) based on LiDAR surveys provide critical information for predicting the vulnerability of coastal areas to sea-level rise. Due to the poor penetration of LiDAR pulses in marsh vegetation, bare-earth DEMs for coastal wetlands are often subject to positive elevation bias, and thus underestimate vulnerability. This data publication includes comprehensive elevation surveys from seven wetlands in coastal New Jersey, and an evaluation of the accuracy and positive elevation bias of each publicly available DEM. Resampling the DEMs at a coarser resolution, replacing cell values using the minimum value in a wider search window (4 m), removed this positive elevation bias with no loss of accuracy.
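The resampling step described in this abstract, replacing each cell with the minimum value found in a wider search window, can be sketched as follows. This is a minimal illustration in plain Python and not the authors' workflow: the function name `min_filter`, the toy grid, and the window radius expressed in cells are all assumptions (the abstract specifies only a 4 m window, and the conversion from metres to cells depends on the DEM resolution).

```python
def min_filter(dem, radius):
    """Replace each cell with the minimum elevation found in a square
    window of (2 * radius + 1) cells centred on it. Windows are clipped
    at the grid edges."""
    rows, cols = len(dem), len(dem[0])
    out = []
    for r in range(rows):
        row_out = []
        for c in range(cols):
            r0, r1 = max(0, r - radius), min(rows, r + radius + 1)
            c0, c1 = max(0, c - radius), min(cols, c + radius + 1)
            row_out.append(
                min(dem[i][j] for i in range(r0, r1) for j in range(c0, c1))
            )
        out.append(row_out)
    return out


# Hypothetical 3 x 3 grid of elevations in metres (not data from the paper).
dem = [
    [0.52, 0.61, 0.58],
    [0.49, 0.75, 0.66],
    [0.55, 0.63, 0.47],
]
smoothed = min_filter(dem, radius=1)
```

Taking the window minimum favours the lowest LiDAR returns, which are the ones most likely to have reached the marsh surface rather than the vegetation canopy.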

13 pages, 2895 KiB  
Article
T4SS Effector Protein Prediction with Deep Learning
by Koray Açıcı, Tunç Aşuroğlu, Çağatay Berke Erdaş and Hasan Oğul
Cited by 12 | Viewed by 4395
Abstract
Extensive research has been carried out on bacterial secretion systems, as they can pass effector proteins directly into the cytoplasm of host cells. The correct prediction of type IV protein effectors secreted by T4SS is important, since they are known to play a noteworthy role in various human pathogens. Studies on predicting T4SS effectors have relied on traditional machine learning algorithms. In this work, we used a deep learning architecture, i.e., a Convolutional Neural Network (CNN), to predict IVA and IVB effectors. Three feature extraction methods were used to represent each protein as an image, and these images were fed to the CNN as inputs in our proposed framework. Pseudo proteins were generated using the ADASYN algorithm to overcome the imbalanced dataset problem. We demonstrated that our framework predicted all IVA effectors correctly. In addition, a sensitivity of 94.2% for IVB effector prediction demonstrated our framework’s ability to discern the effectors in unidentified proteins.

15 pages, 2871 KiB  
Article
Diagnosis of Intermittently Faulty Units at System Level
by Viktor Mashkov, Jirí Fiser, Volodymyr Lytvynenko and Maria Voronenko
Cited by 2 | Viewed by 3091
Abstract
System-level diagnosis usually aims to identify only permanently faulty units. In this paper, we consider the case in which both permanently and intermittently faulty units can occur in the system. Identifying intermittently faulty units has specific features, which we examine here. We also propose a method that distinguishes among different types of intermittent faults, and we suggest a diagnosis procedure for each type.
(This article belongs to the Special Issue Data Stream Mining and Processing)

11 pages, 885 KiB  
Data Descriptor
The Importance of Measuring Students’ Opinions and Attitudes
by E. S. Sanz-Pérez
Cited by 1 | Viewed by 2655
Abstract
The data presented in this article relate to research carried out at the University Rey Juan Carlos in Spain. Chemical Engineering, taught as a subject across three Energy Engineering-based degree streams, was evaluated for two academic years. Student insight on course development, their own expectations and results, and the evaluation system was explored via a 33-item survey, which received 47 full responses. The present contribution provides the full responses obtained from students to the survey administered. The data received were studied using thorough statistical analyses to draw conclusions. The full set of data is made public here independently from the research article.

12 pages, 7193 KiB  
Article
Evaluation of Photogrammetry and Inclusion of Control Points: Significance for Infrastructure Monitoring
by Renee C. Oats, Rudiger Escobar-Wolf and Thomas Oommen
Cited by 7 | Viewed by 4231
Abstract
Structure from Motion (SfM)/Photogrammetry is a powerful mapping tool for extracting three-dimensional (3D) models from photographs. This method has been applied to a range of applications, including monitoring of infrastructure systems. This technique could potentially become a substitute, or at least a complement, for costlier approaches such as laser scanning for infrastructure monitoring. This study expands on previous investigations, which utilize photogrammetry point cloud data to measure failure mode behavior of a retaining wall model, emphasizing further robust spatial testing. In this study, two commonly used photogrammetry software packages were compared to assess the computing performance of the method and the significance of control points in this approach. The impact of control point selection, as part of the photogrammetric modeling processes, was also evaluated. Comparisons between the two software tools reveal similar performances in capturing quantitative changes of a retaining wall structure. Results also demonstrate that increasing the number of control points above a certain number does not necessarily increase 3D modeling accuracy; in some cases, their spatial distribution can be more critical. Furthermore, errors in model reproducibility, when compared with total station measurements, were found to be spatially correlated with the arrangement of control points.

11 pages, 1338 KiB  
Data Descriptor
A High-Resolution Global Gridded Historical Dataset of Climate Extreme Indices
by Malcolm N. Mistry
Cited by 37 | Viewed by 11584
Abstract
Climate extreme indices (CEIs) are important metrics that not only assist in the analysis of regional and global extremes in meteorological events, but also aid climate modellers and policymakers in the assessment of sectoral impacts. Global high-spatial-resolution CEI datasets derived from quality-controlled historical observations or reanalysis data products are scarce. This study introduces a new high-resolution global gridded dataset of CEIs based on sub-daily temperature and precipitation data from the Global Land Data Assimilation System (GLDAS). The dataset, called “CEI_0p25_1970_2016”, includes 71 annual (and in some cases monthly) CEIs at 0.25° × 0.25° gridded resolution, covering 47 years over the period 1970–2016. The data of individual indices are publicly available for download in the commonly used Network Common Data Form 4 (NetCDF4) format. Potential applications of CEI_0p25_1970_2016 presented here include the assessment of sectoral impacts (e.g., Agriculture, Health, Energy, and Hydrology), as well as the identification of hot spots (clusters) showing similar historical spatial patterns of high/low temperature and precipitation extremes. CEI_0p25_1970_2016 fills gaps in existing CEI datasets not only by encompassing more indices, but also by being the only comprehensive global gridded CEI data available at high spatial resolution.
(This article belongs to the Special Issue Overcoming Data Scarcity in Earth Science)

17 pages, 2719 KiB  
Article
LNSNet: Lightweight Navigable Space Segmentation for Autonomous Robots on Construction Sites
by Khashayar Asadi, Pengyu Chen, Kevin Han, Tianfu Wu and Edgar Lobaton
Cited by 8 | Viewed by 4912
Abstract
An autonomous robot that can monitor a construction site should be able to contextually detect its surrounding environment by recognizing objects and making decisions based on its observation. Pixel-wise semantic segmentation in real-time is vital to building an autonomous and mobile robot. However, the learning models’ size and high memory usage associated with real-time segmentation are the main challenges for mobile robotics systems that have limited computing resources. To overcome these challenges, this paper presents an efficient semantic segmentation method named LNSNet (lightweight navigable space segmentation network) that can run on embedded platforms to determine navigable space in real-time. The core of the model architecture is a new block based on separable convolution, which compresses the parameters of the existing residual block while maintaining accuracy and performance. LNSNet is faster and has fewer parameters and a smaller model size, while providing similar accuracy compared to existing models. A new pixel-level annotated dataset for real-time and mobile navigable space segmentation in construction environments has been constructed for the proposed method. The results demonstrate the effectiveness and efficiency that are necessary for the future development of autonomous robotics systems.
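To give a feel for why a separable-convolution block has fewer parameters, the sketch below compares weight counts for a standard convolution and a depthwise separable one. It assumes the common depthwise-plus-pointwise factorization and ignores biases; the abstract does not describe the actual LNSNet block, so the function names and the layer sizes (3 × 3 kernel, 64 input and output channels) are illustrative assumptions only.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution (no bias):
    one kernel x kernel filter per input channel, followed by a
    1 x 1 pointwise convolution that mixes channels."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
std = standard_conv_params(3, 64, 64)   # 36864 weights
sep = separable_conv_params(3, 64, 64)  # 576 + 4096 = 4672 weights
ratio = sep / std                       # roughly 0.13
```

For this hypothetical layer the separable form needs about one eighth of the weights, which is the kind of saving that makes such blocks attractive on embedded platforms.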

9 pages, 23376 KiB  
Data Descriptor
Autonomous “Figure-8” Flights of a Quadcopter: Experimental Datasets
by Srikanth Gururajan and Ye Bai
Cited by 3 | Viewed by 4412
Abstract
This article describes the data acquired from multiple flights of a custom-built quadcopter. The quadcopter was programmed to fly a pre-defined “Figure-8” flight path at a constant altitude. The data set includes flights with a varying number of waypoints (10 and 15 waypoints in each lobe of the “Figure-8”) and at two different velocities (1.5 and 2.5 m/s). The data also contain information on the output of the flight controller in terms of the Pulse Width Modulation (PWM) signals to each of the four Electronic Speed Controllers (ESC) driving the motors, the recorded outputs of the Inertial Measurement Unit (linear accelerations ax, ay, az and angular velocities p, q, r), and GPS data (Latitude, Longitude, altitude, Horizontal Dilution of Precision (HDOP) and Vertical Dilution of Precision (VDOP)). The data are included as Supplemental Material.
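A constant-altitude “Figure-8” waypoint list of the kind described above could be generated along the following lines. This is a sketch under stated assumptions: the abstract does not give the curve used, so a lemniscate of Gerono is assumed here, and the function name `figure8_waypoints`, the half-width, and the altitude are hypothetical values.

```python
import math


def figure8_waypoints(n_per_lobe, half_width, altitude):
    """Return (x, y, z) waypoints on a figure-8 at constant altitude.

    Assumed curve: x = a*sin(t), y = a*sin(t)*cos(t), t in [0, 2*pi).
    The first half of the parameter range traces one lobe and the
    second half the other, so each lobe receives n_per_lobe points,
    counting the crossing point at the start of each lobe.
    """
    total = 2 * n_per_lobe
    waypoints = []
    for i in range(total):
        t = 2 * math.pi * i / total
        x = half_width * math.sin(t)
        y = half_width * math.sin(t) * math.cos(t)
        waypoints.append((x, y, altitude))
    return waypoints


# Hypothetical geometry: 20 m half-width flown at 15 m altitude.
waypoints = figure8_waypoints(n_per_lobe=10, half_width=20.0, altitude=15.0)
```

Calling the function with `n_per_lobe=15` gives the denser of the two waypoint configurations mentioned in the abstract.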

3 pages, 176 KiB  
Data Descriptor
Growth Analysis and Nutrient Solution Management of a Soil-Less Tomato Crop in a Mediterranean Environment
by Angelo Signore, Francesco Serio and Pietro Santamaria
Cited by 1 | Viewed by 2820
Abstract
The data contained in this article are strictly related to our previous article titled “A Targeted Management of the Nutrient Solution in a Soilless Tomato Crop According to Plant Needs” (Signore, A. et al., 2016). The detailed datasets regard the amount of dry matter (Table 1), the nutrient solution consumption (Table 2) and the mineral composition of plant tissues (Tables 3–7) in a soil-less tomato crop. The information contained in this article is necessary since, unlike in the northern European countries, such data are generally missing for crops in the Mediterranean environment. By correlating the parameters reported above, we were able to manage the nutrient solution more precisely, providing the correct nutrient concentration in the nutrient solution as a function of (i) the volume of water absorbed, (ii) the growth rate and (iii) the nutrient concentration in the tomato plant. Finally, the more precise management of the nutrient solution allowed a smaller amount of water and nutrients to be discharged into the environment, improving the sustainability of the crop.
7 pages, 2581 KiB  
Data Descriptor
The Spatial and Temporal Distribution of Process Gases within the Biowaste Compost
by Sylwia Stegenta, Karolina Sobieraj, Grzegorz Pilarski, Jacek A. Koziel and Andrzej Białowiec
Cited by 12 | Viewed by 3855
Abstract
Composting is generally accepted as the sustainable recycling of biowaste into a useful and beneficial product for soil. However, composting processes can produce gases that are considered air pollutants. In this dataset, we summarized the spatial and temporal distribution of process gases (including rarely reported carbon monoxide, CO) generated inside full-scale composting piles. In total, 1375 cross-sections were made and presented in 230 figures. The research aimed to investigate the phenomenon of gas evolution during the composting of biowaste depending on the pile turning regime (no turning, turning once a week, and turning twice a week) and pile location (outdoors, and indoors in a composting hall). The analyzed biowaste (a mixture of tree leaves and branches, grass clippings, and sewage sludge) was composted in six piles with passive aeration including additional turning at a municipal composting plant. The chemical composition and temperature of process gases within each pile were analyzed weekly for ~49–56 days. The variations in the degree of pile aeration (O2 content), temperature, and the spatial distribution of CO, CO2 and NO concentration during the subsequent measurement cycles were summarized and visualized. The lowest O2 concentrations were associated with the central (core) part of the pile. Similarly, an increase in CO content in the pile core sections was found, which may indicate that CO is oxidized in the upper layer of composting piles. Higher CO and CO2 concentrations and temperature were also observed in the summer season, especially on the south side of piles located outdoors. The most varied results were for the NO concentrations that occurred in all conditions. The dataset was used by the composting plant operator for more sustainable management. Specifically, the dataset allowed us to make recommendations to minimize the environmental impact of composting operations and to lower the risk of worker exposure to CO.
The new procedure is as follows: turning of biowaste twice a week for the first two weeks, followed by turning once a week for the next two weeks. Turning is not necessary after four weeks of the process. The recommended surface-to-volume ratio of a compost pile should not exceed 2.5. Compost piles should be constructed with a surface-to-volume ratio of less than 2 in autumn and early spring when low ambient temperatures are common.

8 pages, 277 KiB  
Data Descriptor
Resazurin Assay Data for Mycobacterium tuberculosis Supporting a Model of the Growth Accelerated by a Stochastic Non-Homogeneity
by Eugene B. Postnikov, Andrey A. Khalin, Anastasia I. Lavrova and Olga A. Manicheva
Cited by 1 | Viewed by 3154
Abstract
Tuberculosis is one of the most widespread diseases worldwide and heavily affects society. Among popular modern laboratory tests for mycobacterial growth, the resazurin assay has certain advantages due to its effectiveness and relatively low cost. However, the high heterogeneity of the mycobacterial population affects the average growth rate. This fact must be taken into account in a quantitative interpretation of these tests’ output (fluorescence growth curves) related to the population growth of viable mycobacteria. Here, we report the spectrophotometric data obtained via the resazurin assay for the standard reference strain of Mycobacterium tuberculosis H37Rv for different initial dilutions and generation numbers of the culture, as well as their primary processing from the point of view of the stochastic multiplicative growth model. The obtained data, which indicate an accelerated (instead of linear) growth of the population density logarithm between the end of the lag phase and the saturation, provide evidence of the importance of the growth rates’ stochasticity. An analysis of the curve fits resulted in an estimation of the first two moments of the growth rates’ probability distributions, showing its relevance to vital processes for mycobacterial culture.
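One way to see why stochastic growth rates can produce accelerated growth of the logarithm is the following sketch. It assumes, purely for illustration, that each subpopulation grows exponentially at its own constant rate r drawn from a distribution with mean μ and variance σ²; the symbols N₀, r, μ and σ² are not taken from the abstract, and the authors' model may differ in detail.

```latex
\begin{align}
  N(t) &= N_0 \,\bigl\langle e^{r t} \bigr\rangle, \\
  \ln \frac{N(t)}{N_0} &= \mu t + \frac{\sigma^2}{2}\, t^2 + O(t^3).
\end{align}
```

The second line is the cumulant expansion of the average in the first line (exact, with no higher-order terms, when r is Gaussian). Under these assumptions a homogeneous population (σ² = 0) gives a straight line in the logarithm, while any spread in growth rates adds an upward-curving quadratic term, and fitting the linear and quadratic coefficients yields estimates of the first two moments.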

16 pages, 8603 KiB  
Data Descriptor
Urbanization in India: Population and Urban Classification Grids for 2011
by Deborah Balk, Mark R. Montgomery, Hasim Engin, Natalie Lin, Elizabeth Major and Bryan Jones
Cited by 33 | Viewed by 20355
Abstract
India is the world’s most populous country, yet also one of the least urban. It has long been known that India’s official estimates of urban percentages conflict with estimates derived from alternative conceptions of urbanization. To date, however, the detailed spatial and settlement boundary data needed to analyze and reconcile these differences have not been available. This paper presents gridded estimates of population at a resolution of 1 km along with two spatial renderings of urban areas: one based on the official tabulations of population and settlement types (i.e., statutory towns, outgrowths, and census towns) and the other on remotely sensed measures of built-up land derived from the Global Human Settlement Layer. We also cross-classified the census data and the remotely sensed data to construct a hybrid representation of the continuum of urban settlement. In their spatial detail, these materials go well beyond what has previously been available in the public domain, and thereby provide an empirical basis for comparison among competing conceptual models of urbanization.

17 pages, 2581 KiB  
Data Descriptor
Graphing Ecotoxicology: The MAGIC Graph for Linking Environmental Data on Chemicals
by Sascha Bub, Jakob Wolfram, Sebastian Stehle, Lara L. Petschick and Ralf Schulz
Cited by 6 | Viewed by 5413
Abstract
Assessing the impact of chemicals on the environment and addressing subsequent issues are two central challenges to their safe use. Environmental data are continuously expanding, requiring flexible, scalable, and extendable data management solutions that can harmonize multiple data sources with potentially differing nomenclatures or levels of specificity. Here, we present the methodological steps taken to construct a rule-based labeled property graph database, the “Meta-analysis of the Global Impact of Chemicals” (MAGIC) graph, for potential environmental impact chemicals (PEIC) and its subsequent application harmonizing multiple large-scale databases. The resulting data encompass 16,739 unique PEICs attributed to their corresponding chemical class, stereo-chemical information, valid synonyms, use types, unique identifiers (e.g., Chemical Abstracts Service Registry Number, CAS RN), and others. These data provide researchers with additional chemical information for a large number of PEICs and can also be publicly accessed using a web interface. Our analysis has shown that data harmonization can increase up to 98% when using the MAGIC graph approach compared to relational data systems for datasets with different nomenclatures. The graph database system and its data appear more suitable for large-scale analysis where traditional (i.e., relational) data systems are reaching conceptual limitations.
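The idea of harmonizing sources that use different names for the same chemical can be illustrated with a toy labeled property graph. This is a sketch only and not the MAGIC graph's schema: the class `PropertyGraph`, the node labels `Chemical` and `Name`, the edge type `SYNONYM_OF`, and all names and values are hypothetical placeholders.

```python
class PropertyGraph:
    """Tiny in-memory labeled property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}  # node_id -> (label, properties dict)
        self.edges = []  # (source_id, edge_type, target_id)

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = (label, props)

    def add_edge(self, source_id, edge_type, target_id):
        self.edges.append((source_id, edge_type, target_id))

    def resolve(self, name):
        """Return the id of the Chemical node that `name` is a synonym
        of, or None if the name is unknown."""
        key = name.strip().lower()
        for node_id, (label, props) in self.nodes.items():
            if label == "Name" and props.get("text") == key:
                for source_id, edge_type, target_id in self.edges:
                    if source_id == node_id and edge_type == "SYNONYM_OF":
                        return target_id
        return None


graph = PropertyGraph()
graph.add_node("chem:1", "Chemical", cas_rn="PLACEHOLDER")  # not a real CAS RN
for i, text in enumerate(["compound a", "cmpd-a"]):
    graph.add_node(f"name:{i}", "Name", text=text)
    graph.add_edge(f"name:{i}", "SYNONYM_OF", "chem:1")

# Two hypothetical sources that name the same chemical differently.
source_x = [("Compound A", 1.2)]
source_y = [("CMPD-A", 3.4)]
merged = {}
for name, value in source_x + source_y:
    merged.setdefault(graph.resolve(name), []).append(value)
```

Because both spellings resolve to the same `Chemical` node, the two records end up under one key, which is the kind of match that a join on raw name strings would miss.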

8 pages, 2166 KiB  
Data Descriptor
Genome Analysis of the Marine Bacterium Labrenzia sp. Strain 011, a Potential Protective Agent of Mollusks
by Jamshid Amiri Moghaddam, Antonio Dávila-Céspedes, Mohammad Alanjary, Jochen Blom, Gabriele M. König and Till F. Schäberle
Cited by 2 | Viewed by 4105
Abstract
The marine bacterium Labrenzia sp. strain 011 was isolated from the coastal sediment of Kronsgaard, Germany. The Labrenzia species are suggested to be protective agents of mollusks. Labrenzia sp. strain 011 produces specialized metabolites, which showed activity against a range of microorganisms, including strong inhibitory effects against Pseudoroseovarius crassostreae DSM 16950 (genus Roseovarius), the causative agent of oyster disease. The genome of Labrenzia sp. strain 011 was sequenced and assembled into 65 contigs, has a size of 5.1 Mbp, and a G+C content of 61.6%. A comparative genome analysis defined Labrenzia sp. strain 011 as a distinct new species within the genus Labrenzia, whereby 44% of the genome contributed to the Labrenzia core genome. The genomic data provided here are expected to contribute to a deeper understanding of the mollusk-protective role of Labrenzia spp.
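A G+C content such as the one reported above is the percentage of G and C bases among all called bases in the assembly. The sketch below shows that calculation over a list of contigs; the function names and the short example sequences are hypothetical and are not taken from the Labrenzia assembly.

```python
def gc_content(sequence):
    """Percentage of G and C among unambiguous bases (A, C, G, T).
    Ambiguity codes such as N are ignored."""
    sequence = sequence.upper()
    gc = sum(1 for base in sequence if base in "GC")
    called = sum(1 for base in sequence if base in "ACGT")
    if called == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / called


def assembly_gc_content(contigs):
    """G+C percentage over all contigs, weighted by contig length."""
    return gc_content("".join(contigs))


# Hypothetical toy contigs, far shorter than real ones.
contigs = ["GGCCAT", "ATGC"]
gc_percent = assembly_gc_content(contigs)  # 6 of 10 bases are G or C
```

Concatenating the contigs before counting weights each contig by its length, which differs from averaging per-contig percentages when contig sizes vary.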

21 pages, 605 KiB  
Article
Need for Standardization and Systematization of Test Data for Job-Shop Scheduling
by Edzard Weber, Anselm Tiefenbacher and Norbert Gronau
Cited by 3 | Viewed by 3374
Abstract
The development of new and better optimization and approximation methods for Job Shop Scheduling Problems (JSP) uses simulations to compare their performance. The test data required for this has an uncertain influence on the simulation results, because the feasible search space can be changed drastically by small variations of the initial problem model. Methods could benefit from this to varying degrees. This speaks in favor of defining standardized and reusable test data for JSP problem classes, which in turn requires a systematic describability of the test data in order to be able to compile problem-adequate data sets. This article examines, by means of a literature review, the test data used for comparing methods. It also shows how and why the differences in test data have to be taken into account. From this, corresponding challenges are derived which the management of test data must face in the context of JSP research.

17 pages, 2398 KiB  
Data Descriptor
Immunomics Datasets and Tools: To Identify Potential Epitope Segments for Designing Chimeric Vaccine Candidate to Cervix Papilloma
by Satyavani Kaliamurthi, Gurudeeban Selvaraj, Sathishkumar Chinnasamy, Qiankun Wang, Asma Sindhoo Nangraj, William C. Cho, Keren Gu and Dong-Qing Wei
Cited by 6 | Viewed by 3927
Abstract
Immunomics tools and databases play an important role in the designing of prophylactic or therapeutic vaccines against pathogenic bacteria and viruses. Therefore, we aimed to illustrate the different immunological databases and web servers used to design a chimeric vaccine candidate against human cervix papilloma. Initially, cellular immunity inducing major histocompatibility complex class I and II epitopes from the L2 protein of the papilloma 58 strain were predicted using the IEDB, NetMHC, and Tepi tools. Then, the overlapped segments from the above analysis were used to calculate efficiency on interferon-gamma and humoral immunity production. In addition, the allergenicity, antigenicity, cross-reactivity with human proteomes, and epitope conservancy of elite segments were determined. The chimeric vaccine candidate (SGD58) was constructed with two different overlapped peptide segments (23–36) and (29–42), adjuvants (flagellin and RS09), two Th epitopes, and amino acid linkers. The results of homology modeling demonstrated that SGD58 has 88.6% of residues in favored regions based on the Ramachandran plot. Protein–protein docking with Swarm Dock revealed that the complex of SGD58 with the receptor has a binding energy of −54.74 kcal/mol with more than 20 interacting residues. The docked complex is stable over 100 ns of molecular dynamics simulation. Further, the coding sequences of SGD58 also show elevated gene expression in E. coli. In conclusion, SGD58 may serve as a vaccine candidate against cervix papilloma. This study also provides insight into vaccine design against different pathogenic microbes.

18 pages, 5287 KiB  
Data Descriptor
The Historical Small Smart City Protocol (HISMACITY): Toward an Intelligent Tool Using Geo Big Data for the Sustainable Management of Minor Historical Assets
by Valentina Pica, Alessandro Cecili, Stefania Annicchiarico and Elena Volkova
Cited by 5 | Viewed by 3726
Abstract
This research reports the ongoing design of the HISMACITY (Historical Small Smart City) Protocol, a planning tool with a certification system. The tool is designed for small municipalities in Europe. Through the award-winning certification system, the Protocol supports the fulfillment of best practices. Such practices can enhance town attractiveness. The Protocol also counteracts excessive land use that results from urban growth, and reduces demographic decline in internal areas of each country. The research methodology is grounded on building a dynamic dataset using geo big data, local data, and mobile data via information communications technology (ICT), and real-time data through sensors. The tool aims to build algorithms to calculate indicators that measure quality standards of integrated interventions. The aim is to reach specific goals within defined priority areas of the Historical Small Smart City Protocol. Being highly adaptive, the framework follows urban responsive design principles based on weighted suitability models that can be calibrated by changing the input data and the weights of the linear combination formula. The results highlight varying framework data, including the tool’s development procedures and practicality.
(This article belongs to the Special Issue Big Data Challenges in Smart Cities)
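The weighted suitability models mentioned in the HISMACITY abstract above rest on a linear combination of indicators, which can be sketched as follows. This is an illustration under assumptions, not the Protocol's formula: the function name, the indicator names, the 0 to 1 scaling, and the weights are all hypothetical.

```python
def suitability_score(indicators, weights):
    """Weighted linear combination of indicator values, normalized by
    the sum of the weights so the score stays on the indicator scale."""
    if indicators.keys() != weights.keys():
        raise ValueError("indicators and weights must have the same keys")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(weights[k] * indicators[k] for k in indicators)
    return weighted / total_weight


# Hypothetical indicators scaled to [0, 1] and hypothetical weights.
indicators = {"accessibility": 0.75, "heritage_condition": 0.5, "land_use": 0.25}
weights = {"accessibility": 2.0, "heritage_condition": 1.0, "land_use": 1.0}
score = suitability_score(indicators, weights)  # (1.5 + 0.5 + 0.25) / 4
```

Calibration in this picture means changing either the indicator inputs or the weights and recomputing the score, with no change to the formula itself.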

15 pages, 4784 KiB  
Data Descriptor
Spatial Distribution of Wind Turbines, Photovoltaic Field Systems, Bioenergy, and River Hydro Power Plants in Germany
by Marcus Eichhorn, Mattes Scheftelowitz, Matthias Reichmuth, Christian Lorenz, Kyriakos Louca, Alexander Schiffler, Rita Keuneke, Martin Bauschmann, Jens Ponitka, David Manske and Daniela Thrän
Cited by 21 | Viewed by 9130
Abstract
The expansion of renewable energy technologies, accompanied by an increasingly decentralized supply structure, raises many research questions regarding the structure, dimension, and impacts of the electricity supply network. In this context, information on renewable energy plants, particularly their spatial distribution and key parameters—e.g., [...] Read more.
The expansion of renewable energy technologies, accompanied by an increasingly decentralized supply structure, raises many research questions regarding the structure, dimension, and impacts of the electricity supply network. In this context, information on renewable energy plants, particularly their spatial distribution and key parameters (e.g., installed capacity, total size, and required space), is increasingly important for public decision makers and different scientific domains, such as energy system analysis and impact assessment. The dataset described in this paper covers the spatial distribution, installed capacity, and commissioning year of wind turbines, photovoltaic field systems, and bio- and river hydro power plants in Germany. Collected from different online sources and authorities, the data have been thoroughly cross-checked, cleaned, and merged to generate validated and complete datasets. The paper concludes with notes on the practical use of the dataset in an environmental impact monitoring framework and other potential research or policy settings. Full article

18 pages, 4948 KiB  
Article
Vehicular Ad Hoc Network (VANET) Connectivity Analysis of a Highway Toll Plaza
by Saajid Hussain, Di Wu, Sheeba Memon and Naadiya Khuda Bux
Cited by 14 | Viewed by 4974
Abstract
The aim of this paper was to study issues of network connectivity in vehicular ad hoc networks (VANETs) to avoid traffic congestion at a toll plaza. An analytical model was developed for highway scenarios in which vehicles reduce their speed in response to traffic congestion instead of blocking the flow of traffic. In this model, nearby vehicles must be informed when traffic congestion occurs before reaching the toll plaza so they can reduce their speed in order to avoid traffic congestion. Once they have crossed the toll plaza they can travel on at their normal speed. The road was divided into two or three sub-segments to help analyze the performance of connectivity. The proposed analytical model considered various parameters that might affect the connectivity probability, including traveling speed, communication range of vehicles, vehicle arrival rate, and road length. The simulation results matched those of the analytical model, which indicates that the analytical model developed in this paper is effective. Full article

13 pages, 1349 KiB  
Data Descriptor
A Uniform In Vitro Efficacy Dataset to Guide Antimicrobial Peptide Design
by Deepesh Nagarajan, Tushar Nagarajan, Neha Nanajkar and Nagasuma Chandra
Cited by 7 | Viewed by 3902
Abstract
Antimicrobial peptides are ubiquitous molecules that form the innate immune system of organisms across all kingdoms of life. Despite their prevalence and early origins, they continue to remain potent natural antimicrobial agents. Antimicrobial peptides are therefore promising drug candidates in the face of overwhelming multi-drug resistance to conventional antibiotics. Over the past few decades, thousands of antimicrobial peptides have been characterized in vitro, and their efficacy data are now available in a multitude of public databases. Computational antimicrobial peptide design attempts typically use such data. However, utilizing heterogeneous data aggregated from different sources presents significant drawbacks. In this report, we present a uniform dataset containing 20 antimicrobial peptides assayed against 30 organisms of Gram-negative, Gram-positive, mycobacterial, and fungal origin. We also present circular dichroism spectra for all antimicrobial peptides. We draw simple inferences from these data, and we discuss what characteristics are essential for antimicrobial peptide efficacy. We expect our uniform dataset to be useful for future projects involving computational antimicrobial peptide design. Full article

10 pages, 3470 KiB  
Data Descriptor
A Dataset for Comparing Mirrored and Non-Mirrored Male Bust Images for Facial Recognition
by Collin Gros and Jeremy Straub
Cited by 1 | Viewed by 3526
Abstract
Facial recognition, as well as other types of human recognition, has found uses in identification, security, and learning about behavior, among others. Because of the high cost of data collection for training purposes, logistical challenges, and other impediments, mirroring images has frequently been used to increase the size of data sets. However, while these larger data sets have been shown to be beneficial, their level of benefit compared with the collection of similar data has not been assessed. This paper presents a data set collected and prepared for this and related research purposes. The data set includes both non-occluded and occluded data for mirroring assessment. Full article

26 pages, 2299 KiB  
Article
Innovating Metrics for Smarter, Responsive Cities
by H. Patricia McKenna
Cited by 5 | Viewed by 5806
Abstract
This paper explores the emerging and evolving landscape for metrics in smart cities in relation to big data challenges. Based on a review of the research literature, the problem of “synthetic quantitative indicators” along with concerns for “measuring urban realities” and “making metrics meaningful” are identified. In response, the purpose of this paper is to advance the need for innovating metrics for smarter, more interactive and responsive cities in addressing and mitigating algorithmic-related challenges on the one hand, and concerns associated with involving people more meaningfully on the other hand. As such, the constructs of awareness, learning, openness, and engagement are employed in this study. Using an exploratory case study approach, the research design for this work includes the use of multiple methods of data collection including survey and interviews. Employing a combination of content analysis for qualitative data and descriptive statistics for quantitative data, the main findings of this work support the need for rethinking and innovating metrics. As such, the main conclusion of this paper highlights the potential for developing new pathways and spaces for involving people more directly, knowingly, and meaningfully in addressing big and small data challenges for the innovating of urban metrics. Full article
(This article belongs to the Special Issue Big Data Challenges in Smart Cities)

4 pages, 331 KiB  
Data Descriptor
Dataset for Scheduling Strategies for Microgrids Coupled with Natural Gas Networks
by Muhammad Yousif, Qian Ai, Yang Gao, Waqas Ahmad Wattoo, Ran Hao and Ziqing Jiang
Cited by 4 | Viewed by 4346
Abstract
Datasets are important for researchers testing the functionality of their proposed microgrid dispatch strategies. This article presents a dataset to help researchers test their algorithms related to the dispatch problem of microgrids coupled with natural gas networks. This preliminary release of a microgrid dispatch dataset contains data related to microgrid components (such as solar PV, wind turbines, fuel cells, and batteries) and natural gas network elements connected with the microgrid (e.g., micro gas turbine). It also includes the data associated with the authors’ proposed scheduling strategy and its dispatch results. The provided dataset can be used to reproduce the authors’ proposed strategy. The presented dataset can further be used to compare other researchers’ proposed strategies. These comparisons will make a strategy’s features more evident. Full article

22 pages, 3472 KiB  
Article
Data Preprocessing for Evaluation of Recommendation Models in E-Commerce
by Namrata Chaudhary and Drimik Roy Chowdhury
Cited by 6 | Viewed by 6080
Abstract
E-commerce businesses employ recommender models to assist in identifying a personalized set of products for each visitor. To accurately assess the recommendations’ influence on customer clicks and buys, three target areas (customer behavior, data collection, and user interface) are explored for possible sources of erroneous data. Varied customer behavior misrepresents the recommendations’ true influence on a customer due to the presence of B2B interactions and outlier customers. Non-parametric statistical procedures for outlier removal are delineated and other strategies are investigated to account for the effect of a large percentage of new customers or high bounce rates. Subsequently, in data collection we identify probable misleading interactions in the raw data, propose a robust method of tracking unique visitors, and accurately attribute the buy influence for combo products. Lastly, under user-interface issues we discuss the possible problems caused by the recommendation widget’s positioning on the e-commerce website and the stringent conditions that should be imposed when utilizing data from the product listing page. This collective methodology results in an exact and valid estimation of the customer’s interactions influenced by the recommendation model in the context of standard industry metrics, such as click-through rates, buy-through rates, and conversion revenue. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)

9 pages, 3748 KiB  
Data Descriptor
Biogenic Volatiles Emitted from Four Cold-Hardy Grape Cultivars During Ripening
by Somchai Rice, Devin L. Maurer, Anne Fennell, Murlidhar Dharmadhikari and Jacek A. Koziel
Cited by 4 | Viewed by 4800
Abstract
In this research dataset, we summarize for the first time volatile organic compounds (VOCs) emitted in vivo from ripening wine grapes. We studied four cold-hardy cultivars grown in the Midwestern U.S.: St. Croix, Frontenac, Marquette, and La Crescent. These cultivars have gained popularity among local growers and winemakers, but still very little is known about their performance compared with long-established V. vinifera grapes. Volatiles were collected using two novel approaches: biogenic emissions from grape clusters on a vine and single grape berries. A third approach was headspace collection of volatiles from crushed grapes. Solid-phase microextraction (SPME) was used to collect volatiles. Vacuum-assisted SPME was used for single grape berries. Collected VOCs were separated and identified on a gas chromatograph mass spectrometer (GC-MS). More than 120 VOCs were identified using mass spectral libraries. The dataset provides evidence that detecting biogenic emissions from growing grapes is feasible. The dataset provides a record of temporal and spatial variability of VOCs, many of which could potentially impart aroma and flavor in the wine. The number of VOCs detected followed the order from single berry (the least) to crushed berry (the most). Thus, more information for potential use in harvesting in order to obtain the desired flavor is found in data from crushed grapes. Full article

3 pages, 155 KiB  
Editorial
Special Issue on Astrophysics & Geophysics: Research and Applications
by Vladimir A. Srećković and Aleksandra Nina
Viewed by 3023
Abstract
The earth’s layers and space are media permanently exposed to the influences of numerous perturbations characterized by time- and space-dependent intensity. For this reason, the detection of astrophysical and terrestrial events and their influences, as well as the development and application of various models, must be based on observational data. The aim of this Special Issue, “Astrophysics & Geophysics: Research and Applications” in Data, is to engage a wide community of scientists to reorganize and expand current knowledge in this field. This Special Issue contains five articles, which include a wide range of topics such as big data in astrophysics and geophysics, data processing, visualization and acquisition, Earth observational data, remote sensing, etc. We hope that the topic of this Special Issue of Data will be of continued interest and we look forward to seeing progress in this field. Full article
(This article belongs to the Special Issue Data in Astrophysics & Geophysics: Research and Applications)
19 pages, 2950 KiB  
Article
Comparison of Micro-Census Results for Magarya Ward, Wurno Local Government Area of Sokoto State, Nigeria, with Other Sources of Denominator Data
by Margherita E. Ghiselli, Idongesit Nta Wilson, Brian Kaplan, Ndadilnasiya Endie Waziri, Adamu Sule, Halimatu Bolatito Ayanleke, Faruk Namalam, Shehu Ahmad Tambuwal, Nuruddeen Aliyu, Umar Kadi, Omotayo Bolu, Nyampa Barau, Mohammed Yahaya, Gideon Ugbenyo, Ugochukwu Osigwe, Clara Oguji, Nnamdi Usifoh and Vincent Seaman
Cited by 3 | Viewed by 6631
Abstract
Routine immunization coverage in Nigeria is suboptimal. In the northwestern state of Sokoto, an independent population-based survey for 2016 found immunization coverage with the third dose of Pentavalent vaccine to be 3%, whereas administrative coverage in 2016 was reported to be 69%. One possibility driving this large discrepancy is that administrative coverage is calculated using an under-estimated target population. Official population projections from the 2006 Census are based on state-specific standard population growth rates. Immunization target population estimates from other sources have not been independently validated. We conducted a micro-census in Magarya ward, Wurno Local Government Area of Sokoto state to obtain an accurate count of the total population living in the ward, and to compare these results with other sources of denominator data. We developed a precise micro-plan using satellite imagery, and used the navigation tool EpiSample v1 in the field to guide teams to each building, without duplications or omissions. The particular characteristics of the selected ward underscore the importance of using standardized shape files to draw precise boundaries for enumeration micro-plans. While the use of this methodology did not resolve the discrepancy between independent and administrative vaccination coverage rates, a simplified application can better define the target population for routine immunization services and estimate the number of children still unprotected from vaccine-preventable diseases. Full article

11 pages, 988 KiB  
Article
Gaussian Mixture and Kernel Density-Based Hybrid Model for Volatility Behavior Extraction From Public Financial Data
by Smail Tigani, Hasna Chaibi and Rachid Saadane
Cited by 3 | Viewed by 4794
Abstract
This paper presents a hybrid clustering model for foreign exchange market volatility clustering. The proposed model is built using a Gaussian Mixture Model, and inference is performed using an Expectation Maximization algorithm. A mono-dimensional kernel density estimator is used to build a probability density based on all historical observations. This allows us to evaluate the probability of the behavior of each symbol of interest. The computational results show that the approach is able to pinpoint risky and safe hours to trade a given currency pair. Full article
(This article belongs to the Special Issue Data Analysis for Financial Markets)

11 pages, 4058 KiB  
Article
Towards Identifying Author Confidence in Biomedical Articles
by Mihaela Onofrei Plămadă, Diana Trandabăț and Daniela Gîfu
Cited by 2 | Viewed by 3238
Abstract
In an era where the volume of medical literature is increasing daily, researchers in the biomedical and clinical areas have joined efforts with language engineers to analyze the large amount of biomedical and molecular biology literature (such as PubMed), patient data, or health records. With such a huge amount of reports, evaluating their impact has long stopped being a trivial task. In this context, this paper intended to introduce a non-scientific factor that represents an important element in gaining acceptance of claims. We postulated that the confidence that an author has in expressing their work plays an important role in shaping the first impression that influences the reader’s perception of the paper. The results discussed in this paper were based on a series of experiments that were run using data from the open archives initiative (OAI) corpus, which provides interoperability standards to facilitate effective dissemination of the content. This method may be useful to the direct beneficiaries (i.e., authors who are engaged in medical or academic research), but also to researchers in the fields of biomedical text mining (BioNLP) and NLP, etc. Full article
(This article belongs to the Special Issue Curative Power of Medical Data)

29 pages, 3975 KiB  
Article
Statistical Modeling of Trivariate Static Systems: Isotonic Models
by Simone Fiori and Andrea Vitali
Viewed by 3629
Abstract
This paper presents an improved version of a statistical trivariate modeling algorithm introduced in a short Letter by the first author. This paper recalls the fundamental concepts behind the proposed algorithm, highlights its critical issues, and illustrates a number of improvements which lead to a functioning modeling algorithm. The present paper also illustrates the features of the improved statistical modeling algorithm through a comprehensive set of numerical experiments performed on four synthetic and five natural datasets. The obtained results confirm that the proposed algorithm is able to model the considered synthetic and natural datasets faithfully. Full article