Big Data Cogn. Comput., Volume 8, Issue 9 (September 2024) – 17 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the table of contents of newly released issues.
  • Papers are published in both HTML and PDF forms; PDF is the official format. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
19 pages, 7056 KiB  
Article
A Data-Centric Approach to Understanding the 2020 U.S. Presidential Election
by Satish Mahadevan Srinivasan and Yok-Fong Paat
Big Data Cogn. Comput. 2024, 8(9), 111; https://doi.org/10.3390/bdcc8090111 - 4 Sep 2024
Abstract
The application of analytics on Twitter feeds is a very popular field for research. A tweet with a 280-character limitation can reveal a wealth of information on how individuals express their sentiments and emotions within their network or community. Upon collecting, cleaning, and mining tweets from different individuals on a particular topic, we can capture not only the sentiments and emotions of an individual but also the sentiments and emotions expressed by a larger group. Using the well-known lexicon-based NRC classifier, we classified nearly seven million tweets across seven battleground states in the U.S. to understand the emotions and sentiments expressed by U.S. citizens toward the 2020 presidential candidates. We used the emotions and sentiments expressed within these tweets as proxies for their votes and predicted the swing direction of each battleground state. When compared to the actual outcome of the 2020 presidential election, our approach accurately predicted the swing directions of four battleground states (Arizona, Michigan, Texas, and North Carolina), revealing its potential for predicting future election outcomes. The week-by-week analysis of the tweets using the NRC classifier aligned well with the various political events that took place before the election, making it possible to understand the dynamics of the emotions and sentiments of the supporters in each camp. These research strategies and evidence-based insights may be translated into real-world settings and practical interventions to improve election outcomes. Full article
(This article belongs to the Special Issue Machine Learning in Data Mining for Knowledge Discovery)
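The lexicon-based classification idea in this abstract can be sketched in a few lines: each tweet's tokens are looked up in an emotion/sentiment lexicon, and the label counts are aggregated across a group of tweets. The tiny lexicon below is a purely illustrative stand-in, not the actual NRC Emotion Lexicon, and the aggregation rule is an assumption.

```python
# Minimal sketch of lexicon-based emotion/sentiment scoring. The toy
# lexicon here is illustrative only, NOT the real NRC Emotion Lexicon.
TOY_LEXICON = {
    "win": {"joy", "trust", "positive"},
    "fraud": {"anger", "fear", "negative"},
    "hope": {"joy", "anticipation", "positive"},
}

def score_tweet(text):
    """Count lexicon labels over the tokens of one tweet."""
    counts = {}
    for token in text.lower().split():
        for label in TOY_LEXICON.get(token, ()):
            counts[label] = counts.get(label, 0) + 1
    return counts

def dominant_sentiment(texts):
    """Aggregate many tweets and return the dominant sentiment label."""
    total = {}
    for t in texts:
        for label, n in score_tweet(t).items():
            total[label] = total.get(label, 0) + n
    pos, neg = total.get("positive", 0), total.get("negative", 0)
    return "positive" if pos >= neg else "negative"
```

Aggregating per-state counts this way corresponds roughly to the "emotions and sentiments as vote proxies" step the abstract describes.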
19 pages, 714 KiB  
Article
Combining Semantic Matching, Word Embeddings, Transformers, and LLMs for Enhanced Document Ranking: Application in Systematic Reviews
by Goran Mitrov, Boris Stanoev, Sonja Gievska, Georgina Mirceva and Eftim Zdravevski
Big Data Cogn. Comput. 2024, 8(9), 110; https://doi.org/10.3390/bdcc8090110 - 4 Sep 2024
Abstract
The rapid increase in scientific publications has made it challenging to keep up with the latest advancements. Conducting systematic reviews using traditional methods is both time-consuming and difficult. To address this, new review formats like rapid and scoping reviews have been introduced, reflecting an urgent need for efficient information retrieval. This challenge extends beyond academia to many organizations where numerous documents must be reviewed in relation to specific user queries. This paper focuses on improving document ranking to enhance the retrieval of relevant articles, thereby reducing the time and effort required by researchers. By applying a range of natural language processing (NLP) techniques, including rule-based matching, statistical text analysis, word embeddings, and transformer- and LLM-based approaches like Mistral LLM, we assess each article’s similarity to user-specific inputs and prioritize the articles according to relevance. We propose a novel methodology, Weighted Semantic Matching (WSM) + MiniLM, combining the strengths of the different methodologies. For validation, we employ global metrics such as precision at K, recall at K, average rank, median rank, and pairwise comparison metrics, including higher rank count, average rank difference, and median rank difference. Our proposed algorithm achieves optimal performance, with an average recall at 1000 of 95% and an average median rank of 185 for selected articles across the five datasets evaluated. These findings are promising for pinpointing the relevant articles and reducing the manual work. Full article
(This article belongs to the Special Issue Advances in Natural Language Processing and Text Mining)
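The global ranking metrics named in the abstract (precision at K, recall at K, average rank) have simple definitions; a minimal sketch, assuming documents are identified by arbitrary hashable IDs:

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked documents that are relevant."""
    top = ranked_ids[:k]
    return sum(1 for d in top if d in relevant_ids) / k

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents that appear in the top k."""
    top = ranked_ids[:k]
    return sum(1 for d in top if d in relevant_ids) / len(relevant_ids)

def average_rank(ranked_ids, relevant_ids):
    """Mean 1-based rank of the relevant documents (lower is better)."""
    ranks = [i + 1 for i, d in enumerate(ranked_ids) if d in relevant_ids]
    return sum(ranks) / len(ranks)
```

The paper's pairwise comparison metrics (higher rank count, rank differences) compare two such rankings of the same collection.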
29 pages, 702 KiB  
Article
Supervised Density-Based Metric Learning Based on Bhattacharya Distance for Imbalanced Data Classification Problems
by Atena Jalali Mojahed, Mohammad Hossein Moattar and Hamidreza Ghaffari
Big Data Cogn. Comput. 2024, 8(9), 109; https://doi.org/10.3390/bdcc8090109 - 4 Sep 2024
Abstract
Learning distance metrics and distinguishing between samples from different classes are among the most important topics in machine learning. This article proposes a new distance metric learning approach tailored for highly imbalanced datasets. Imbalanced datasets suffer from a lack of data in the minority class, and the differences in class density strongly affect the efficiency of the classification algorithms. Therefore, the density of the classes is considered the main basis of learning the new distance metric. It is possible that the data of one class are composed of several densities, that is, the class is a combination of several normal distributions with different means and variances. In this paper, considering that classes may be multimodal, the distribution of each class is assumed in the form of a mixture of multivariate Gaussian densities. A density-based clustering algorithm is used for determining the number of components followed by the estimation of the parameters of the Gaussian components using maximum a posteriori density estimation. Then, the Bhattacharya distance between the Gaussian mixtures of the classes is maximized using an iterative scheme. To reach a large between-class margin, the distance between the external components is increased while decreasing the distance between the internal components. The proposed method is evaluated on 15 imbalanced datasets using the k-nearest neighbor (KNN) classifier. The results of the experiments show that using the proposed method significantly improves the efficiency of the classifier in imbalance classification problems. Also, when the imbalance ratio is very high and it is not possible to correctly identify minority class samples, the proposed method still provides acceptable performance. Full article
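For reference, the Bhattacharyya distance between two multivariate Gaussian components, which the method maximizes between classes, has a standard closed form; a minimal NumPy sketch (the paper's iterative metric-learning scheme itself is not reproduced here):

```python
import numpy as np

def bhattacharyya_gaussians(mu1, cov1, mu2, cov2):
    """Bhattacharyya distance between two multivariate Gaussians."""
    mu1, mu2 = np.asarray(mu1, float), np.asarray(mu2, float)
    cov1, cov2 = np.asarray(cov1, float), np.asarray(cov2, float)
    cov = (cov1 + cov2) / 2.0            # average covariance
    diff = mu1 - mu2
    # Mahalanobis-like mean term plus a covariance-mismatch term.
    term_mean = diff @ np.linalg.solve(cov, diff) / 8.0
    term_cov = 0.5 * np.log(
        np.linalg.det(cov) / np.sqrt(np.linalg.det(cov1) * np.linalg.det(cov2))
    )
    return term_mean + term_cov
```

The distance is zero for identical components and grows with mean separation or covariance mismatch, which is why increasing it enlarges the between-class margin.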
24 pages, 7001 KiB  
Article
Appendicitis Diagnosis: Ensemble Machine Learning and Explainable Artificial Intelligence-Based Comprehensive Approach
by Mohammed Gollapalli, Atta Rahman, Sheriff A. Kudos, Mohammed S. Foula, Abdullah Mahmoud Alkhalifa, Hassan Mohammed Albisher, Mohammed Taha Al-Hariri and Nazeeruddin Mohammad
Big Data Cogn. Comput. 2024, 8(9), 108; https://doi.org/10.3390/bdcc8090108 - 4 Sep 2024
Abstract
Appendicitis is a condition wherein the appendix becomes inflamed, and it can be difficult to diagnose accurately. The type of appendicitis can also be hard to determine, leading to misdiagnosis and difficulty in managing the condition. To avoid complications and reduce mortality, early diagnosis and treatment are crucial. While Alvarado’s clinical scoring system is not sufficient on its own, ultrasound and computed tomography (CT) imaging are effective but have downsides such as operator dependency and radiation exposure. This study proposes the use of machine learning methods and a locally collected reliable dataset to enhance the identification of acute appendicitis while detecting the differences between complicated and non-complicated appendicitis. Machine learning can help reduce diagnostic errors and improve treatment decisions. This study conducted four different experiments using various ML algorithms, including K-nearest neighbors (KNN), decision trees (DT), bagging, and stacking. The experimental results showed that the stacking model had the highest training accuracy, test set accuracy, precision, and F1 score, which were 97.51%, 92.63%, 95.29%, and 92.04%, respectively. Feature importance and explainable AI (XAI) identified neutrophils, WBC_Count, Total_LOS, P_O_LOS, and Symptoms_Days as the principal features that significantly affected the performance of the model. Based on the outcomes and feedback from medical health professionals, the scheme is promising in terms of its effectiveness in diagnosing acute appendicitis. Full article
(This article belongs to the Special Issue Machine Learning Applications and Big Data Challenges)
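A stacking ensemble of the kind described (KNN and decision-tree base learners with a meta-learner on top) can be sketched with scikit-learn. The synthetic data and the logistic-regression meta-learner below are assumptions, since the clinical dataset and exact configuration are not public here:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the clinical dataset.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# KNN and decision-tree base learners; logistic-regression meta-learner
# trained on their cross-validated predictions.
stack = StackingClassifier(
    estimators=[
        ("knn", KNeighborsClassifier()),
        ("dt", DecisionTreeClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(),
)
stack.fit(X_tr, y_tr)
acc = stack.score(X_te, y_te)  # held-out accuracy, typically well above chance
```

The meta-learner sees each base model's predictions rather than raw features, which is what lets stacking outperform its individual members, as reported in the abstract.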
15 pages, 450 KiB  
Article
A Comparative Study of Sentiment Classification Models for Greek Reviews
by Panagiotis D. Michailidis
Big Data Cogn. Comput. 2024, 8(9), 107; https://doi.org/10.3390/bdcc8090107 - 4 Sep 2024
Abstract
In recent years, people have expressed their opinions and sentiments about products, services, and other issues on social media platforms and review websites. These sentiments are typically classified as either positive or negative based on their text content. Research interest in sentiment analysis for text reviews written in Greek is limited compared to that in English. Existing studies conducted for the Greek language have focused more on posts collected from social media platforms rather than on consumer reviews from e-commerce websites and have primarily used traditional machine learning (ML) methods, with little to no work utilizing advanced methods like neural networks, transfer learning, and large language models. This study addresses this gap by testing the hypothesis that modern methods for sentiment classification, including artificial neural networks (ANNs), transfer learning (TL), and large language models (LLMs), perform better than traditional ML models in analyzing a Greek consumer review dataset. Several classification methods, namely, ML, ANNs, TL, and LLMs, were evaluated and compared using performance metrics on a large collection of Greek product reviews. The empirical findings showed that the GreekBERT and GPT-4 models perform significantly better than traditional ML classifiers, with BERT achieving an accuracy of 96% and GPT-4 reaching 95%, while ANNs showed similar performance to ML models. This study confirms the hypothesis, with the BERT model achieving the highest classification accuracy. Full article
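A traditional ML baseline of the kind the study compares against can be sketched as a TF-IDF + logistic regression pipeline; the toy English reviews below are stand-ins for the Greek dataset:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy English reviews standing in for the Greek consumer review dataset.
reviews = [
    "great product, works perfectly",
    "terrible quality, broke fast",
    "excellent value and fast delivery",
    "awful support, very slow",
    "great support",
    "terrible value",
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = positive, 0 = negative

# Bag-of-words features weighted by TF-IDF, then a linear classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(reviews, labels)
prediction = model.predict(["great product, works perfectly"])[0]
```

The fine-tuned GreekBERT and GPT-4 approaches that outperformed such baselines in the study replace the TF-IDF features with contextual representations.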
18 pages, 2336 KiB  
Article
Performance and Board Diversity: A Practical AI Perspective
by Lee-Wen Yang, Thi Thanh Binh Nguyen and Wei-Ju Young
Big Data Cogn. Comput. 2024, 8(9), 106; https://doi.org/10.3390/bdcc8090106 - 4 Sep 2024
Abstract
The face of corporate governance is changing as new technologies in the scope of artificial intelligence and data analytics are used to make better future-oriented decisions on performance management. This study attempts to provide empirical results to analyze when the impact of diversity on the board of directors is most evident, through a multi-breaks model and artificial neural networks. The input data for the simulation include 853 electronic companies listed on the Taiwan Stock Exchange from 2000 to 2021. The empirical results show that a higher percentage of female board members improves the company’s performance, an effect that is evident only when the company is in good business condition. By integrating ANNs with multi-breakpoint regression, this study introduces a novel approach to management research, providing a detailed perspective on how board diversity impacts firm performance across different conditions. The ANN results show that using the number of business board members for predicting Return on Assets (ROA) yields the highest accuracy, with female board members following closely in predictive effectiveness. The presence of women on the board contributes positively to ROA, particularly when the company is experiencing favorable business conditions and high profitability. Our analysis also reveals that a higher percentage of male board members improves company performance, but this benefit is observed only in highly favorable and unfavorable business conditions. Conversely, a higher percentage of business members tends to affect performance negatively during periods of high profitability. The power of the board of directors and significant shareholders is positively correlated with performance, whereas CEO power positively impacts performance only when it is not extremely low. Independent board members generally do not have a significant effect on profits. Additionally, the company’s asset value positively influences performance primarily when the return on assets is high, and increased financial leverage is associated with reduced profitability. Full article
(This article belongs to the Special Issue Machine Learning Applications and Big Data Challenges)
18 pages, 723 KiB  
Article
Ethical AI in Financial Inclusion: The Role of Algorithmic Fairness on User Satisfaction and Recommendation
by Qin Yang and Young-Chan Lee
Big Data Cogn. Comput. 2024, 8(9), 105; https://doi.org/10.3390/bdcc8090105 - 3 Sep 2024
Abstract
This study investigates the impact of artificial intelligence (AI) on financial inclusion satisfaction and recommendation, with a focus on the ethical dimensions and perceived algorithmic fairness. Drawing upon organizational justice theory and the heuristic–systematic model, we examine how algorithm transparency, accountability, and legitimacy influence users’ perceptions of fairness and, subsequently, their satisfaction with and likelihood to recommend AI-driven financial inclusion services. Through a survey-based quantitative analysis of 675 users in China, our results reveal that perceived algorithmic fairness acts as a significant mediating factor between the ethical attributes of AI systems and the user responses. Specifically, higher levels of transparency, accountability, and legitimacy enhance users’ perceptions of fairness, which, in turn, significantly increases both their satisfaction with AI-facilitated financial inclusion services and their likelihood to recommend them. This research contributes to the literature on AI ethics by empirically demonstrating the critical role of transparent, accountable, and legitimate AI practices in fostering positive user outcomes. Moreover, it addresses a significant gap in the understanding of the ethical implications of AI in financial inclusion contexts, offering valuable insights for both researchers and practitioners in this rapidly evolving field. Full article
23 pages, 4464 KiB  
Article
A Hybrid Segmentation Algorithm for Rheumatoid Arthritis Diagnosis Using X-ray Images
by Govindan Rajesh, Nandagopal Malarvizhi and Man-Fai Leung
Big Data Cogn. Comput. 2024, 8(9), 104; https://doi.org/10.3390/bdcc8090104 - 2 Sep 2024
Abstract
Rheumatoid Arthritis (RA) is a chronic autoimmune illness that occurs in the joints, resulting in inflammation, pain, and stiffness. X-ray examination is one of the most common diagnostic procedures for RA, but manual X-ray image analysis has limitations because it is a time-consuming procedure and is prone to errors. The proposed algorithm aims to provide stable and accurate segmentation of carpal bones from hand bone images, which is vitally important for identifying rheumatoid arthritis. The algorithm comprises several stages, starting with Carpal bone Region of Interest (CROI) specification, dynamic thresholding, and Gray Level Co-occurrence Matrix (GLCM) application for texture analysis. To obtain clear edges, the image is first converted to greyscale, and thresholding is carried out to separate the hand from the background. The hand region is identified to obtain its contours, and the CROI is defined by the bounding box of the largest contour. The threshold value used within the CROI is made dynamic so that it can separate the carpal bones from the surrounding tissue. Then the GLCM texture analysis is carried out, counting co-occurrences of pixel neighbors with specific intensities and neighbor relations. The resulting feature matrix is then employed to extract features such as contrast and energy, which are later used to categorize the images of the affected carpal bone as inflamed or normal. The proposed technique is tested on a rheumatoid arthritis image dataset, and the results show its contribution to the diagnosis of the disease. The algorithm efficiently segments the carpal bones and extracts the signature parameters that are critical for correct classification of the inflammation in the images. Full article
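The GLCM step described above can be sketched directly in NumPy: co-occurrences of neighboring gray levels are counted for a fixed offset, normalized, and reduced to the contrast and energy features the abstract mentions. The horizontal (0, 1) offset and the number of gray levels are illustrative choices:

```python
import numpy as np

def glcm_features(img, levels=8):
    """GLCM for the horizontal (0, 1) offset, plus contrast and energy."""
    img = np.asarray(img)
    glcm = np.zeros((levels, levels), float)
    # Count co-occurrences of each pixel with its right-hand neighbor.
    for i, j in zip(img[:, :-1].ravel(), img[:, 1:].ravel()):
        glcm[i, j] += 1
    glcm /= glcm.sum()                       # normalize to a probability matrix
    idx_i, idx_j = np.indices(glcm.shape)
    contrast = float(((idx_i - idx_j) ** 2 * glcm).sum())
    energy = float((glcm ** 2).sum())
    return glcm, contrast, energy
```

A perfectly uniform region gives zero contrast and maximal energy, while inflamed, textured regions shift both features, which is what makes them usable for the inflamed-versus-normal classification.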
21 pages, 3639 KiB  
Article
AHEAD: A Novel Technique Combining Anti-Adversarial Hierarchical Ensemble Learning with Multi-Layer Multi-Anomaly Detection for Blockchain Systems
by Muhammad Kamran, Muhammad Maaz Rehan, Wasif Nisar and Muhammad Waqas Rehan
Big Data Cogn. Comput. 2024, 8(9), 103; https://doi.org/10.3390/bdcc8090103 - 2 Sep 2024
Abstract
Blockchain technology has impacted various sectors and is transforming them through its decentralized, immutable, transparent, smart contracts (automatically executing digital agreements) and traceable attributes. Due to the adoption of blockchain technology in versatile applications, millions of transactions take place globally. These transactions are no exception to adversarial attacks which include data tampering, double spending, data corruption, Sybil attacks, eclipse attacks, DDoS attacks, P2P network partitioning, delay attacks, selfish mining, bribery, fake transactions, fake wallets or phishing, false advertising, malicious smart contracts, and initial coin offering scams. These adversarial attacks result in operational, financial, and reputational losses. Although numerous studies have proposed different blockchain anomaly detection mechanisms, challenges persist. These include detecting anomalies in just a single layer instead of multiple layers, targeting a single anomaly instead of multiple, not encountering adversarial machine learning attacks (for example, poisoning, evasion, and model extraction attacks), and inadequate handling of complex transactional data. The proposed AHEAD model solves the above problems by providing the following: (i) data aggregation transformation to detect transactional and user anomalies at the data and network layers of the blockchain, respectively, (ii) a Three-Layer Hierarchical Ensemble Learning Model (HELM) incorporating stratified random sampling to add resilience against adversarial attacks, and (iii) an advanced preprocessing technique with hybrid feature selection to handle complex transactional data. The performance analysis of the proposed AHEAD model shows that it achieves higher anti-adversarial resistance and detects multiple anomalies at the data and network layers. A comparison of the proposed AHEAD model with other state-of-the-art models shows that it achieves 98.85% accuracy against anomaly detection on data and network layers targeting transaction and user anomalies, along with 95.97% accuracy against adversarial machine learning attacks, which surpassed other models. Full article
27 pages, 7680 KiB  
Article
Federated Learning with Multi-Method Adaptive Aggregation for Enhanced Defect Detection in Power Systems
by Linghao Zhang, Bing Bian, Linyu Luo, Siyang Li and Hongjun Wang
Big Data Cogn. Comput. 2024, 8(9), 102; https://doi.org/10.3390/bdcc8090102 - 2 Sep 2024
Abstract
The detection and identification of defects in transmission lines using computer vision techniques is essential for maintaining the safety and reliability of power supply systems. However, existing training methods for transmission line defect detection models predominantly rely on single-node training, potentially limiting the enhancement of detection accuracy. To tackle this issue, this paper proposes a server-side adaptive parameter aggregation algorithm based on multi-method fusion (SAPAA-MMF) and formulates the corresponding objective function. Within the federated learning framework proposed in this paper, each client executes distributed synchronous training in alignment with the fundamental process of federated learning. The hierarchical difference between the global model, aggregated using the improved joint mean algorithm, and the global model from the previous iteration is computed and utilized as the pseudo-gradient for the adaptive aggregation algorithm. This enables the adaptive aggregation to produce a new global model with improved performance. To evaluate the potential of SAPAA-MMF, comprehensive experiments were conducted on five datasets, involving comparisons with several algorithms. The experimental results are analyzed independently for both the server and client sides. The findings indicate that SAPAA-MMF outperforms existing federated learning algorithms on both the server and client sides. Full article
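The server-side aggregation idea, where the difference between the averaged client models and the previous global model acts as a pseudo-gradient for an adaptive server step, can be sketched as below. The Adam-style update rule and all hyperparameter names are illustrative assumptions, not the paper's exact SAPAA-MMF algorithm:

```python
import numpy as np

def server_round(global_w, client_ws, m, v, lr=0.1, b1=0.9, b2=0.99, eps=1e-8):
    """One aggregation round: FedAvg mean, then an adaptive server step."""
    avg_w = np.mean(client_ws, axis=0)      # plain federated averaging
    pseudo_grad = global_w - avg_w          # difference from previous global model
    m = b1 * m + (1 - b1) * pseudo_grad     # first-moment estimate
    v = b2 * v + (1 - b2) * pseudo_grad**2  # second-moment estimate
    new_w = global_w - lr * m / (np.sqrt(v) + eps)
    return new_w, m, v
```

Treating the aggregation difference as a gradient lets the server reuse adaptive optimizers, which is the general mechanism the abstract describes for producing an improved global model each iteration.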
27 pages, 2771 KiB  
Article
Contextual Intelligence: An AI Approach to Manufacturing Skills’ Forecasting
by Xolani Maphisa, Mpho Nkadimeng and Arnesh Telukdarie
Big Data Cogn. Comput. 2024, 8(9), 101; https://doi.org/10.3390/bdcc8090101 - 2 Sep 2024
Abstract
The manufacturing industry is skill-intensive and plays a pivotal role in South Africa’s economy, reflecting the nation’s progress and development. The advent of technology has initiated a transformative era within the manufacturing sector. Workforce skills are at the heart of ensuring the sustained growth of the industry. This study delves into the skill-related aspects of the occupational landscape of the South African manufacturing sector, with a particular focus on two important manufacturing sectors: the food and beverage manufacturing (FoodBev) sector and the chemical manufacturing (CHIETA) sector. Leveraging the forecasting prowess of Autoregressive Integrated Moving Average (ARIMA), this paper outlines a sectorial occupational forecasting modeling exercise to reveal which job roles are poised for expansion and which are expected to decline. The approach predicted future skills’ demand with 80% accuracy for 473 out of 713 (66%) occupations for FoodBev and 474 out of 522 (91%) for CHIETA. These insights are invaluable for industry stakeholders and educational institutions, providing guidance to support the sector’s growth in an era marked by technological advancement. Full article
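As a simplified illustration of the autoregressive core of ARIMA (the study's actual models also include differencing and moving-average terms), an AR(1) model can be fitted by least squares and rolled forward to forecast:

```python
import numpy as np

def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1], the AR core of ARIMA."""
    x_prev = np.asarray(series[:-1], float)
    x_next = np.asarray(series[1:], float)
    A = np.column_stack([np.ones_like(x_prev), x_prev])
    c, phi = np.linalg.lstsq(A, x_next, rcond=None)[0]
    return c, phi

def forecast(series, steps, c, phi):
    """Roll the fitted recurrence forward for the requested horizon."""
    out, x = [], series[-1]
    for _ in range(steps):
        x = c + phi * x
        out.append(x)
    return out
```

In the study, a model of this family is fitted per occupation time series, and the forecast direction indicates whether the role is poised for expansion or decline.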
22 pages, 9693 KiB  
Article
A Trusted Supervision Paradigm for Autonomous Driving Based on Multimodal Data Authentication
by Tianyi Shi, Ruixiao Wu, Chuantian Zhou, Siyang Zheng, Zhu Meng, Zhe Cui, Jin Huang, Changrui Ren and Zhicheng Zhao
Big Data Cogn. Comput. 2024, 8(9), 100; https://doi.org/10.3390/bdcc8090100 - 2 Sep 2024
Abstract
At the current stage of autonomous driving, monitoring the behavior of safety stewards (drivers) is crucial to establishing liability in the event of an accident. However, there is currently no method for the quantitative assessment of safety steward behavior that is trusted by multiple stakeholders. In recent years, deep-learning-based methods can automatically detect abnormal behaviors with surveillance video, and blockchain as a decentralized and tamper-resistant distributed ledger technology is very suitable as a tool for providing evidence when determining liability. In this paper, a trusted supervision paradigm for autonomous driving (TSPAD) based on multimodal data authentication is proposed. Specifically, this paradigm consists of a deep learning model for driving abnormal behavior detection based on key frames adaptive selection and a blockchain system for multimodal data on-chaining and certificate storage. First, the deep-learning-based detection model enables the quantification of abnormal driving behavior and the selection of key frames. Second, the key frame selection and image compression coding balance the trade-off between the amount of information and efficiency in multiparty data sharing. Third, the blockchain-based data encryption sharing strategy ensures supervision and mutual trust among the regulatory authority, the logistic platform, and the enterprise in the driving process. Full article
(This article belongs to the Special Issue Big Data Analytics and Edge Computing: Recent Trends and Future)
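The tamper-evidence that the blockchain layer provides for on-chained key frames and metadata can be illustrated with a toy hash chain; the record format is an assumption, as the abstract does not specify the on-chaining schema:

```python
import hashlib

def chain(records):
    """Link each record to the hash of everything before it."""
    prev, out = "0" * 64, []
    for rec in records:
        digest = hashlib.sha256((prev + rec).encode()).hexdigest()
        out.append(digest)
        prev = digest
    return out

def verify(records, hashes):
    """Recompute the chain; any altered record breaks every later hash."""
    return hashes == chain(records)
```

Because each hash commits to all previous records, modifying any stored key frame after the fact is detectable by all parties, which is the mutual-trust property the regulatory authority, platform, and enterprise rely on.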
36 pages, 4696 KiB  
Review
Review of Federated Learning and Machine Learning-Based Methods for Medical Image Analysis
by Netzahualcoyotl Hernandez-Cruz, Pramit Saha, Md Mostafa Kamal Sarker and J. Alison Noble
Big Data Cogn. Comput. 2024, 8(9), 99; https://doi.org/10.3390/bdcc8090099 - 28 Aug 2024
Abstract
Federated learning is an emerging technology that enables the decentralised training of machine learning-based methods for medical image analysis across multiple sites while ensuring privacy. This review paper thoroughly examines federated learning research applied to medical image analysis, outlining technical contributions. We followed the review methodology of Okoli and Schabram to produce a comprehensive summary and discussion of the literature in information systems. Searches were conducted at leading indexing platforms: PubMed, IEEE Xplore, Scopus, ACM, and Web of Science. We found a total of 433 papers and selected 118 of them for further examination. The findings highlighted research on applying federated learning to neural network methods in cardiology, dermatology, gastroenterology, neurology, oncology, respiratory medicine, and urology. The main challenges reported were the ability of machine learning models to adapt effectively to real-world datasets and privacy preservation. We outlined two strategies to address these challenges: handling non-independent and identically distributed (non-IID) data and privacy-enhancing methods. This review paper offers a reference overview for those already working in the field and an introduction to those new to the topic. Full article
25 pages, 755 KiB  
Article
Ontology Merging Using the Weak Unification of Concepts
by Norman Kuusik and Jüri Vain
Big Data Cogn. Comput. 2024, 8(9), 98; https://doi.org/10.3390/bdcc8090098 - 27 Aug 2024
Abstract
Knowledge representation and manipulation in knowledge-based systems typically rely on ontologies. The aim of this work is to provide a novel weak unification-based method and an automatic tool for OWL ontology merging to ensure well-coordinated task completion in the context of collaborative agents. We employ a technique based on integrating string and semantic matching with the additional consideration of structural heterogeneity of concepts. The tool is implemented in Prolog and makes use of its inherent unification mechanism. Experiments were run on an OAEI data set with a matching accuracy of 60% across 42 tests. Additionally, we ran the tool on several ontologies from the domain of robotics, producing a small, but generally accurate, set of matched concepts. These results clearly show a good capability of the method and the tool to match semantically similar concepts. The results also highlight the challenges related to the evaluation of ontology-merging algorithms without a definite ground truth. Full article
(This article belongs to the Special Issue Recent Advances in Big Data-Driven Prescriptive Analytics)
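The string-matching half of such a pipeline can be sketched as follows (a hypothetical Python stand-in for the paper's Prolog implementation; the normalisation, the 0.8 threshold, and the example labels are illustrative assumptions, and the semantic and structural matching components are omitted):

```python
from difflib import SequenceMatcher

def weak_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two concept labels as weakly unifiable when their
    normalised string similarity exceeds a threshold."""
    a = a.lower().replace("_", " ")
    b = b.lower().replace("_", " ")
    return SequenceMatcher(None, a, b).ratio() >= threshold

# Hypothetical concept labels from two robotics ontologies.
pairs = [("GraspAction", "grasp_action"), ("Robot", "Human")]
matches = [(x, y) for x, y in pairs if weak_match(x, y)]
print(matches)  # only the near-identical pair survives
```

A full merger would combine such string scores with semantic similarity (e.g., from a lexical resource) and checks on each concept's structural context before accepting a match.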

15 pages, 1856 KiB  
Article
DaSAM: Disease and Spatial Attention Module-Based Explainable Model for Brain Tumor Detection
by Sara Tehsin, Inzamam Mashood Nasir, Robertas Damaševičius and Rytis Maskeliūnas
Big Data Cogn. Comput. 2024, 8(9), 97; https://doi.org/10.3390/bdcc8090097 - 25 Aug 2024
Abstract
Brain tumors result from the irregular development of cells and are a major cause of adult deaths worldwide. Many deaths could be avoided through early brain tumor detection. Magnetic resonance imaging (MRI), the most common method of diagnosing brain tumors, may improve a patient's chance of survival through earlier diagnosis, and the improved visibility of malignancies in MRI makes therapy easier. Numerous deep learning models have been proposed over the last decade, including AlexNet, VGG, Inception, ResNet, and DenseNet. All of these general models are trained on the huge ImageNet dataset and have many parameters, which become irrelevant when the models are applied to a specific problem. This study uses a custom deep-learning model for the classification of brain MRIs. The proposed Disease and Spatial Attention Model (DaSAM) has two modules: (a) the Disease Attention Module (DAM), to distinguish between disease and non-disease regions of an image, and (b) the Spatial Attention Module (SAM), to extract important features. Experiments were conducted on two publicly available multi-class datasets, Figshare and Kaggle, on which the model achieves precision values of 99% and 96%, respectively. The proposed model was also tested using cross-dataset validation, achieving 85% accuracy when trained on the Figshare dataset and validated on the Kaggle dataset. The incorporation of the DAM and SAM modules enables feature mapping, which proved useful for highlighting important features during the model's decision-making process. Full article
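A generic spatial-attention gate of the kind SAM describes can be sketched as follows (a NumPy illustration of the common channel-pooling-plus-sigmoid recipe, not the paper's exact module; the learned convolution that normally combines the pooled maps is replaced here by a simple sum):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(features):
    """Gate an (H, W, C) feature map by a per-location attention score
    derived from channel-wise average and max pooling."""
    avg = features.mean(axis=-1)        # (H, W) average over channels
    mx = features.max(axis=-1)          # (H, W) max over channels
    score = sigmoid(avg + mx)           # stand-in for a learned 2D conv
    return features * score[..., None]  # re-weight every channel

feat = np.random.rand(8, 8, 4).astype(np.float32)  # toy feature map
out = spatial_attention(feat)
print(out.shape)  # (8, 8, 4)
```

Because the score lies in (0, 1), the gate suppresses low-evidence locations while passing salient ones through, which is what makes the resulting maps useful for explaining the model's decisions.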

15 pages, 552 KiB  
Article
An Efficient Algorithm for Sorting and Duplicate Elimination by Using Logarithmic Prime Numbers
by Wei-Chang Yeh and Majid Forghani-elahabad
Big Data Cogn. Comput. 2024, 8(9), 96; https://doi.org/10.3390/bdcc8090096 - 23 Aug 2024
Abstract
Data structures such as sets, lists, and arrays are fundamental in mathematics and computer science, playing a crucial role in numerous real-life applications. These structures represent a variety of entities, including solutions, conditions, and objectives. In scenarios involving large datasets, eliminating duplicate elements is essential to reduce complexity and enhance performance. This paper introduces a novel algorithm that uses logarithmic prime numbers to efficiently sort data structures and remove duplicates. We establish the algorithm's correctness mathematically and provide a thorough analysis of its time complexity. To demonstrate its practicality and effectiveness, we compare our method with existing algorithms, highlighting its superior speed and accuracy. An extensive experimental analysis across one thousand random test problems shows that our approach significantly outperforms two alternative techniques from the literature. By discussing the potential applications of the proposed algorithm in various domains, including computer science, engineering, and data management, we illustrate its adaptability through two practical examples in which our algorithm solves the problem more than 3 × 10⁴ and 7 × 10⁴ times faster than the existing algorithms in the literature. The results of these examples demonstrate that the superiority of our algorithm becomes increasingly pronounced with larger problem sizes. Full article
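One plausible reading of the prime-logarithm idea (an assumption, since the abstract does not spell out the algorithm) rests on the fundamental theorem of arithmetic: mapping distinct values to logarithms of distinct primes yields collision-free, compactly sized keys, and assigning primes in value order makes those keys monotone, so sorting by key both orders the data and discards repeats.

```python
from math import log

def first_primes(n):
    """Trial-division generator of the first n primes."""
    primes, cand = [], 2
    while len(primes) < n:
        if all(cand % p for p in primes):
            primes.append(cand)
        cand += 1
    return primes

def dedup_sort(items):
    """Sort and deduplicate by assigning each distinct value the log of a
    distinct prime, in value order, then sorting on that key. This is only
    an illustrative sketch of the idea, not the authors' algorithm."""
    distinct = sorted(set(items))
    key = dict(zip(distinct, (log(p) for p in first_primes(len(distinct)))))
    # The assignment above is monotone, so ordering by the log-prime key
    # reproduces the natural ordering of the values.
    return sorted(key, key=key.get)

print(dedup_sort([5, 3, 5, 1, 3]))  # [1, 3, 5]
```

Taking logarithms keeps the keys small even when many primes are needed, which matters once the distinct-element count grows into the millions.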

31 pages, 2308 KiB  
Review
Data Privacy and Security in Autonomous Connected Vehicles in Smart City Environment
by Tanweer Alam
Big Data Cogn. Comput. 2024, 8(9), 95; https://doi.org/10.3390/bdcc8090095 - 23 Aug 2024
Abstract
A self-driving vehicle can navigate autonomously in smart cities without the need for human intervention. The emergence of Autonomous Connected Vehicles (ACVs) brings a substantial risk to public and passenger safety due to the possibility of cyber-attacks, which encompass remote hacking, manipulation of sensor data, and potential disablement or accidents. The sensors collect data to facilitate the network's recognition of local landmarks, such as trees, curbs, pedestrians, signs, and traffic lights. ACVs gather vast amounts of data, encompassing the exact geographical coordinates of the vehicle, captured images, and signals received from various sensors. To create a fully autonomous system, it is imperative to intelligently integrate several technologies, such as sensors, communication, computation, machine learning (ML), and data analytics. The primary issues in ACVs involve data privacy and security when instantaneously exchanging substantial volumes of data. This study provides a literature review examining data security and privacy in ACVs and presents the Blockchain-enabled Federated Reinforcement Learning (BFRL) framework that can be used to protect them, discussing the integration of federated reinforcement learning (FRL) and Blockchain (BC) in the context of smart cities. Furthermore, the challenges and opportunities for future research on ACVs utilising BFRL frameworks are discussed. Full article
