Journal Description
Journal of Imaging is an international, multi/interdisciplinary, peer-reviewed, open access journal of imaging techniques published online monthly by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), PubMed, PMC, dblp, Inspec, Ei Compendex, and other databases.
- Journal Rank: CiteScore - Q1 (Computer Graphics and Computer-Aided Design)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 18.3 days after submission; accepted papers are published 3.3 days after acceptance (median values for papers published in this journal in the second half of 2024).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 2.7 (2023); 5-Year Impact Factor: 3.0 (2023)
Latest Articles
A Local Adversarial Attack with a Maximum Aggregated Region Sparseness Strategy for 3D Objects
J. Imaging 2025, 11(1), 25; https://doi.org/10.3390/jimaging11010025 - 13 Jan 2025
Abstract
The increasing reliance on deep neural network-based object detection models in various applications has raised significant security concerns due to their vulnerability to adversarial attacks. In physical 3D environments, existing adversarial attacks that target object detection (3D-AE) face significant challenges: to maximize attack effectiveness, they typically require large and dispersed camouflage modifications to objects, which makes the camouflage conspicuous and reduces its visual stealth in real-world scenarios. The core issue is how to use minimal and concentrated camouflage to maximize the attack effect. Addressing this, this paper proposes a local 3D attack method driven by a Maximum Aggregated Region Sparseness (MARS) strategy, which strategically concentrates the attack modifications in specific areas to enhance effectiveness while maintaining stealth. To maximize the aggregation of attack-camouflaged regions, an aggregation regularization term is designed to constrain the mask aggregation matrix based on the face-adjacency relationships. To minimize the attack camouflage regions, a sparseness regularization is designed to make the mask weights tend toward a U-shaped distribution and limit extreme values. Additionally, neural rendering is used to obtain gradient-propagating multi-angle augmented data and suppress the model's detection to locate universal critical decision regions from multiple angles, ensuring that the adversarial modifications remain effective across different viewpoints and conditions. We test the attack effectiveness of different region selection strategies. On the CARLA dataset, the average attack efficiency of attacking the YOLOv3 and v5 series networks reaches 1.724, an improvement of 0.986 (134%) over baseline methods. The experimental results demonstrate that our attack method achieves both stealth and aggressiveness from different viewpoints. Furthermore, we explore the transferability of the decision regions: our method can be effectively combined with different texture optimization methods, with the average precision decreasing by 0.488 and 0.662 across different networks, indicating strong attack effectiveness.
Full article
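The two penalties this abstract names can be sketched as follows. This is an illustrative formulation only, assuming a per-face mask and a binary face-adjacency matrix; the function name, the neighbour-mass form of the aggregation term, and the weighting factors are assumptions, not the authors' implementation.

```python
import torch

def mars_regularizers(mask, adjacency, lambda_agg=1.0, lambda_sparse=1.0):
    # mask:      (F,) per-face camouflage weights in [0, 1]
    # adjacency: (F, F) binary face-adjacency matrix of the 3D mesh
    # Aggregation term: reward weight on faces whose neighbours are also
    # selected, so the camouflaged region stays spatially contiguous.
    neighbour_mass = adjacency @ mask
    r_agg = -(mask * neighbour_mass).sum() / adjacency.sum()
    # Sparseness term: penalise in-between values so the weights drift
    # toward a U-shaped (near 0 or 1) distribution with few active faces.
    r_sparse = (mask * (1.0 - mask)).mean()
    return lambda_agg * r_agg + lambda_sparse * r_sparse

mask = torch.rand(500, requires_grad=True)
adjacency = (torch.rand(500, 500) > 0.99).float()
loss = mars_regularizers(mask, adjacency)  # added to the detection-suppression loss
```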
Open Access Article
LittleFaceNet: A Small-Sized Face Recognition Method Based on RetinaFace and AdaFace
by Zhengwei Ren, Xinyu Liu, Jing Xu, Yongsheng Zhang and Ming Fang
J. Imaging 2025, 11(1), 24; https://doi.org/10.3390/jimaging11010024 - 13 Jan 2025
Abstract
For surveillance video management in university laboratories, issues such as occlusion and low-resolution face capture often arise. Traditional face recognition algorithms are typically static and rely heavily on clear images, resulting in inaccurate recognition for low-resolution, small-sized faces. To address the challenges of occlusion and low-resolution person identification, this paper proposes a new face recognition framework by reconstructing RetinaFace-ResNet and combining it with Quality-Adaptive Margin (AdaFace). Currently, although there are many target detection algorithms, they all require a large amount of data for training; datasets for low-resolution face detection are scarce, leading to poor detection performance. This paper aims to solve RetinaFace's weak face recognition capability in low-resolution scenarios and its potential inaccuracies in face bounding box localization when faces are at extreme angles or partially occluded. To this end, Spatial Depth-wise Separable Convolutions are introduced. RetinaFace-ResNet is designed for face detection and localization, while AdaFace is employed to address low-resolution face recognition by using feature norm approximation to estimate image quality and applying an adaptive margin function. Additionally, a multi-object tracking algorithm is used to solve the problem of moving occlusion. Experimental results demonstrate significant improvements, achieving an accuracy of 96.12% on the WiderFace dataset and a recognition accuracy of 84.36% in practical laboratory applications.
Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
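The quality-adaptive margin mentioned here follows the published AdaFace formulation, in which the feature norm serves as a proxy for image quality. In the sketch below the norm statistics are fixed constants (the paper uses batch statistics), and all hyperparameter values are illustrative.

```python
import torch
import torch.nn.functional as F

def adaface_logits(emb, weight, labels, m=0.4, h=0.333, s=64.0,
                   norm_mean=20.0, norm_std=5.0):
    # Feature norm proxies image quality: low-norm (low-quality) faces
    # receive a smaller effective margin.
    z = ((emb.norm(dim=1) - norm_mean) / (norm_std / h)).clamp(-1, 1)
    cos = F.linear(F.normalize(emb), F.normalize(weight))
    theta = cos.clamp(-1 + 1e-7, 1 - 1e-7).acos()
    idx = torch.arange(len(labels))
    theta[idx, labels] = theta[idx, labels] - m * z          # angular margin g_angle
    logits = theta.cos()
    logits[idx, labels] = logits[idx, labels] - (m * z + m)  # additive margin g_add
    return s * logits  # feed to cross-entropy

emb = torch.randn(8, 512)
weight = torch.randn(1000, 512)   # one row per identity
labels = torch.randint(0, 1000, (8,))
out = adaface_logits(emb, weight, labels)
```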
Open Access Article
An Infrared and Visible Image Alignment Method Based on Gradient Distribution Properties and Scale-Invariant Features in Electric Power Scenes
by Lin Zhu, Yuxing Mao, Chunxu Chen and Lanjia Ning
J. Imaging 2025, 11(1), 23; https://doi.org/10.3390/jimaging11010023 - 13 Jan 2025
Abstract
In grid intelligent inspection systems, automatic registration of infrared and visible light images in power scenes is a crucial technology. Since there are obvious differences in key attributes between visible and infrared images, direct alignment often fails to achieve the expected results. To overcome the difficulty of aligning infrared and visible light images, an image alignment method is proposed in this paper. First, we use the Sobel operator to extract the edge information of the image pair. Second, the feature points in the edges are recognised by a curvature scale space (CSS) corner detector. Third, the Histogram of Oriented Gradients (HOG) is extracted as the gradient distribution characteristic of the feature points, which is normalised with the Scale Invariant Feature Transform (SIFT) algorithm to form feature descriptors. Finally, initial matching and accurate matching are achieved by the improved fast approximate nearest-neighbour matching method and adaptive thresholding, respectively. Experiments show that this method can robustly match the feature points of image pairs under rotation, scale, and viewpoint differences, and achieves excellent matching results.
Full article
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)
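The four-step pipeline can be approximated with OpenCV. Note that OpenCV has no built-in CSS corner detector, so Shi-Tomasi corners stand in for the curvature-based keypoints, and SIFT descriptors stand in for the paper's HOG-based, SIFT-normalised descriptor; all parameters are illustrative.

```python
import cv2
import numpy as np

def match_ir_visible(ir_img, vis_img):
    # Inputs: single-channel uint8 infrared and visible images.
    feats = []
    for img in (ir_img, vis_img):
        # 1) Edge information via the Sobel operator.
        gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
        edges = np.clip(cv2.magnitude(gx, gy), 0, 255).astype(np.uint8)
        # 2) Corner-like feature points on the edge map (CSS stand-in).
        pts = cv2.goodFeaturesToTrack(edges, maxCorners=500,
                                      qualityLevel=0.01, minDistance=5)
        kps = [cv2.KeyPoint(float(x), float(y), 16.0)
               for x, y in pts.reshape(-1, 2)]
        # 3) Gradient-distribution descriptors at those points.
        kps, desc = cv2.SIFT_create().compute(img, kps)
        feats.append(desc)
    # 4) Fast approximate nearest-neighbour matching (FLANN) + ratio test.
    flann = cv2.FlannBasedMatcher({"algorithm": 1, "trees": 5}, {"checks": 50})
    matches = flann.knnMatch(feats[0], feats[1], k=2)
    return [m for m, n in matches if m.distance < 0.7 * n.distance]
```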
Open Access Article
ZooCNN: A Zero-Order Optimized Convolutional Neural Network for Pneumonia Classification Using Chest Radiographs
by Saravana Kumar Ganesan, Parthasarathy Velusamy, Santhosh Rajendran, Ranjithkumar Sakthivel, Manikandan Bose and Baskaran Stephen Inbaraj
J. Imaging 2025, 11(1), 22; https://doi.org/10.3390/jimaging11010022 - 13 Jan 2025
Abstract
Pneumonia, a leading cause of mortality in children under five, is usually diagnosed through chest X-ray (CXR) images due to its efficiency and cost-effectiveness. However, the shortage of radiologists in the Least Developed Countries (LDCs) emphasizes the need for automated pneumonia diagnostic systems. This article presents a deep learning model, the Zero-Order Optimized Convolutional Neural Network (ZooCNN), a Zero-Order Optimization (Zoo)-based CNN model for classifying CXR images into three classes: Normal Lungs (NL), Bacterial Pneumonia (BP), and Viral Pneumonia (VP). The model utilizes the Adaptive Synthetic Sampling (ADASYN) approach to ensure class balance in the Kaggle CXR Images (Pneumonia) dataset. Conventional CNN models, though promising, face challenges such as overfitting and high computational costs. The use of ZooPlatform (ZooPT), a hyperparameter fine-tuning strategy, on a baseline CNN model fine-tunes the hyperparameters and provides a modified architecture, ZooCNN, with a 72% reduction in weights. The model was trained, tested, and validated on the Kaggle CXR Images (Pneumonia) dataset. The ZooCNN achieved an accuracy of 97.27%, a sensitivity of 97.00%, a specificity of 98.60%, and an F1 score of 97.03%. The results were compared with contemporary models to highlight the efficacy of the ZooCNN in pneumonia classification (PC), offering a potential tool to aid physicians in clinical settings.
Full article
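The ADASYN class-balancing step described above is available in the imbalanced-learn library; a minimal sketch on toy stand-in features (class counts and dimensions are illustrative):

```python
import numpy as np
from imblearn.over_sampling import ADASYN

# Toy feature vectors standing in for CXR image features
# (0 = NL, 1 = BP, 2 = VP), with deliberately imbalanced classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 64))
y = np.array([0] * 400 + [1] * 150 + [2] * 50)

# ADASYN synthesises new minority-class samples adaptively, generating
# more of them where the minority class is hardest to learn.
X_bal, y_bal = ADASYN(random_state=42).fit_resample(X, y)
print(np.bincount(y_bal))  # roughly balanced class counts
```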
Open Access Article
Typical and Local Diagnostic Reference Levels for Chest and Abdomen Radiography Examinations in Dubai Health Sector
by Entesar Z. Dalah, Maitha M. Al Zarooni, Faryal Y. Binismail, Hashim A. Beevi, Mohammed Siraj and Subrahmanian Pottybindu
J. Imaging 2025, 11(1), 21; https://doi.org/10.3390/jimaging11010021 - 13 Jan 2025
Abstract
Chest and abdomen radiographs are the most common radiographic examinations conducted in the Dubai Health sector, and both involve exposure of several radiosensitive organs. Diagnostic reference levels (DRLs) are accepted as an effective safety, optimization, and auditing tool in clinical practice. The present work aims to establish a comprehensive projection- and weight-based structured DRL system that allows one to confidently highlight healthcare centers in need of urgent action. The data of a total of 5474 adult males and non-pregnant females who underwent chest and abdomen radiography examinations in five different healthcare centers were collected and retrospectively analyzed. The typical DRL (TDRL) for each healthcare center was established and defined per projection (chest: posterior–anterior (PA), anterior–posterior (AP), and lateral (LAT); abdomen: erect and supine) for a weight band (60–80 kg) and for the whole data (no weight band). Local DRL (LDRL) values were established per projection for the whole data (no weight band) and the 60–80 kg population. Chest radiography data from 1755 (60–80 kg) images were used to build this comprehensive DRL system (PA: 1471, AP: 252, and LAT: 32). Similarly, 611 (60–80 kg) abdomen radiographs were used to establish a DRL system (erect: 286 and supine: 325). The LDRL values defined per chest and abdomen projection for the weight band group (60–80 kg) were as follows: chest, 0.51 (PA), 2.46 (AP), and 2.13 (LAT) dGy·cm²; abdomen, 8.08 (erect) and 5.95 (supine) dGy·cm². The LDRL defined per abdomen projection for the 60–80 kg weight band highlighted at least one healthcare center in need of optimization. Such a system is efficient, easy to use, and very effective clinically.
Full article
(This article belongs to the Special Issue Tools and Techniques for Improving Radiological Imaging Applications)
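Numerically, such a DRL system reduces to percentile statistics over dose-area product (DAP) distributions: typical values (TDRLs) are per-centre medians, and DRLs are conventionally set at the 75th percentile across centres (ICRP Report 135). The sketch below uses random stand-in data and assumed column names, not the study's dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({                      # random stand-in survey data
    "centre": rng.choice(list("ABCDE"), 1000),
    "projection": rng.choice(["chest PA", "chest AP", "abdomen erect"], 1000),
    "weight_kg": rng.uniform(50, 95, 1000),
    "dap_dgy_cm2": rng.lognormal(0.0, 0.6, 1000),
})

band = df[df["weight_kg"].between(60, 80)]          # 60-80 kg weight band
# Typical value per centre and projection: the median DAP.
tdrl = band.groupby(["centre", "projection"])["dap_dgy_cm2"].median()
# DRL per projection: 75th percentile of the centre medians.
ldrl = tdrl.groupby(level="projection").quantile(0.75)
print(ldrl)  # centres whose median exceeds this warrant an optimization review
```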
Open Access Article
Enhanced Image Retrieval Using Multiscale Deep Feature Fusion in Supervised Hashing
by Amina Belalia, Kamel Belloulata and Adil Redaoui
J. Imaging 2025, 11(1), 20; https://doi.org/10.3390/jimaging11010020 - 12 Jan 2025
Abstract
In recent years, deep-network-based hashing has gained prominence in image retrieval for its ability to generate compact and efficient binary representations. However, most existing methods predominantly focus on high-level semantic features extracted from the final layers of networks, often neglecting structural details that are crucial for capturing spatial relationships within images. Achieving a balance between preserving structural information and maximizing retrieval accuracy is the key to effective image hashing and retrieval. To address this challenge, we introduce Multiscale Deep Feature Fusion for Supervised Hashing (MDFF-SH), a novel approach that integrates multiscale feature fusion into the hashing process. The hallmark of MDFF-SH lies in its ability to combine low-level structural features with high-level semantic context, synthesizing robust and compact hash codes. By leveraging multiscale features from multiple convolutional layers, MDFF-SH ensures the preservation of fine-grained image details while maintaining global semantic integrity, achieving a harmonious balance that enhances retrieval precision and recall. Our approach demonstrated superior performance on benchmark datasets, achieving significant gains in Mean Average Precision (MAP) compared with state-of-the-art methods: 9.5% on CIFAR-10, 5% on NUS-WIDE, and 11.5% on MS-COCO. These results highlight the effectiveness of MDFF-SH in bridging structural and semantic information, setting a new standard for high-precision image retrieval through multiscale feature fusion.
Full article
(This article belongs to the Special Issue Recent Techniques in Image Feature Extraction)
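Multiscale fusion for hashing, in the spirit described above, can be sketched by pooling features from several convolutional stages and projecting the concatenation to hash bits. The backbone, bit width, and pooling choices below are assumptions, not the MDFF-SH architecture.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class MultiscaleHash(nn.Module):
    def __init__(self, n_bits=48):
        super().__init__()
        net = resnet18(weights=None)  # illustrative backbone
        self.stem = nn.Sequential(net.conv1, net.bn1, net.relu, net.maxpool)
        self.stages = nn.ModuleList([net.layer1, net.layer2, net.layer3, net.layer4])
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(64 + 128 + 256 + 512, n_bits)

    def forward(self, x):
        x = self.stem(x)
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(self.pool(x).flatten(1))  # low- and high-level cues
        h = self.fc(torch.cat(feats, dim=1))
        return torch.tanh(h)  # relaxed binary codes; sign() at retrieval time

codes = MultiscaleHash()(torch.randn(2, 3, 224, 224)).sign()
```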
Open Access Article
Efficient Generative-Adversarial U-Net for Multi-Organ Medical Image Segmentation
by Haoran Wang, Gengshen Wu and Yi Liu
J. Imaging 2025, 11(1), 19; https://doi.org/10.3390/jimaging11010019 - 12 Jan 2025
Abstract
Manual labeling of lesions in medical image analysis presents a significant challenge due to its labor-intensive and inefficient nature, which ultimately strains essential medical resources and impedes the advancement of computer-aided diagnosis. This paper introduces a novel medical image-segmentation framework named Efficient Generative-Adversarial U-Net (EGAUNet), designed to facilitate rapid and accurate multi-organ labeling. To enhance the model's capability to comprehend spatial information, we propose the Global Spatial-Channel Attention Mechanism (GSCA). This mechanism enables the model to concentrate more effectively on regions of interest. Additionally, we have integrated Efficient Mapping Convolutional Blocks (EMCB) into the feature-learning process, allowing for the extraction of multi-scale spatial information and the adjustment of feature map channels through optimized weight values. Moreover, the proposed framework progressively enhances its performance by utilizing a generative-adversarial learning strategy, which contributes to improvements in segmentation accuracy. Consequently, EGAUNet demonstrates exemplary segmentation performance on public multi-organ datasets while maintaining high efficiency. For instance, in evaluations on the CHAOS T2SPIR dataset, EGAUNet achieves higher performance on the Jaccard, Dice, and precision metrics in comparison to advanced networks such as Swin-Unet and TransUnet.
Full article
(This article belongs to the Special Issue Image Segmentation Techniques: Current Status and Future Directions (2nd Edition))
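The abstract does not detail GSCA's internals; as a generic stand-in, the sketch below combines squeeze-and-excite channel attention with a convolutional spatial mask, which is the usual shape of spatial-channel attention blocks. It is not the authors' module.

```python
import torch
import torch.nn as nn

class SpatialChannelAttention(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.channel = nn.Sequential(            # squeeze-and-excite style
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())
        self.spatial = nn.Sequential(            # one-channel spatial mask
            nn.Conv2d(channels, 1, kernel_size=7, padding=3), nn.Sigmoid())

    def forward(self, x):
        w = self.channel(x).unsqueeze(-1).unsqueeze(-1)
        return x * w * self.spatial(x)           # focus on regions of interest

y = SpatialChannelAttention(64)(torch.randn(1, 64, 32, 32))
```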
Open Access Article
Spectral Bidirectional Reflectance Distribution Function Simplification
by Shubham Chitnis, Aditya Sole and Sharat Chandran
J. Imaging 2025, 11(1), 18; https://doi.org/10.3390/jimaging11010018 - 11 Jan 2025
Abstract
Non-diffuse materials (e.g., metallic inks, varnishes, and paints) are widely used in real-world applications. Accurate spectral rendering relies on the bidirectional reflectance distribution function (BRDF). Current methods of capturing BRDFs have proven onerous, hindering quick turnaround from conception and design to production. We propose a multi-layer perceptron for compact spectral material representations, with 31 wavelengths, for four real-world packaging materials. Our neural-based approach reduces measurement requirements while maintaining significant saliency. Unlike tristimulus BRDF acquisition, this spectral approach has not, to our knowledge, been previously explored with neural networks. We demonstrate compelling results for diffuse, glossy, and goniochromatic materials.
Full article
(This article belongs to the Special Issue Imaging Technologies for Understanding Material Appearance)
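A compact spectral-BRDF regressor of the kind described, mapping incident and outgoing directions to reflectance at 31 wavelengths, might look like the sketch below; layer sizes and activations are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class SpectralBRDF(nn.Module):
    def __init__(self, hidden=128, n_wavelengths=31):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),   # (wi, wo) direction vectors
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_wavelengths),
            nn.Softplus())                     # reflectance is non-negative

    def forward(self, wi, wo):
        return self.net(torch.cat([wi, wo], dim=-1))

refl = SpectralBRDF()(torch.randn(4, 3), torch.randn(4, 3))  # (4, 31) spectra
```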
Open Access Article
Supervised and Self-Supervised Learning for Assembly Line Action Recognition
by Christopher Indris, Fady Ibrahim, Hatem Ibrahem, Götz Bramesfeld, Jie Huo, Hafiz Mughees Ahmad, Syed Khizer Hayat and Guanghui Wang
J. Imaging 2025, 11(1), 17; https://doi.org/10.3390/jimaging11010017 - 10 Jan 2025
Abstract
The safety and efficiency of assembly lines are critical to manufacturing, but human supervisors cannot oversee all activities simultaneously. This study addresses this challenge by performing a comparative study to construct an initial real-time, semi-supervised temporal action recognition setup for monitoring worker actions on assembly lines. Various feature extractors and localization models were benchmarked using a new assembly dataset, with the I3D model achieving an average mAP@IoU=0.1:0.7 of 85% without optical flow or fine-tuning. The comparative study was extended to self-supervised learning via a modified SPOT model, which achieved a mAP@IoU=0.1:0.7 of 65% with just 10% of the data labeled using extractor architectures from the fully-supervised portion. Milestones include high scores for both fully and semi-supervised learning on this dataset and improved SPOT performance on ANet1.3. This study identified the particularities of the problem, which were leveraged and referenced to explain the results observed in semi-supervised scenarios. The findings highlight the potential for developing a scalable solution in the future, providing labour efficiency and safety compliance for manufacturers.
Full article
(This article belongs to the Special Issue Advancing Action Recognition: Novel Approaches, Techniques and Applications)
Open Access Communication
Unmasking the Area Postrema on MRI: Utility of 3D FLAIR, 3D-T2, and 3D-DIR Sequences in a Case–Control Study
by Javier Lara-García, Jessica Romo-Martínez, Jonathan Javier De-La-Cruz-Cisneros, Marco Antonio Olvera-Olvera and Luis Jesús Márquez-Bejarano
J. Imaging 2025, 11(1), 16; https://doi.org/10.3390/jimaging11010016 - 10 Jan 2025
Abstract
The area postrema (AP) is a key circumventricular organ involved in the regulation of autonomic functions. Accurate identification of the AP via MRI is essential in neuroimaging, but it is challenging. This study evaluated 3D FSE Cube T2WI, 3D FSE Cube FLAIR, and 3D DIR sequences to improve AP detection in patients with and without multiple sclerosis (MS). A case–control study included 35 patients with MS and 35 with other non-demyelinating central nervous system diseases (ND-CNSD). MRI images were acquired employing 3D DIR, 3D FSE Cube FLAIR, and 3D FSE Cube T2WI sequences. The evaluation of the AP was conducted using a 3-point scale. Statistical analysis was performed with the chi-square test used to assess group homogeneity and differences between sequences. No significant differences were found in the visualization of the AP between the MS and ND-CNSD groups across the sequences or planes. The AP was not visible in 27.6% of the 3D FSE Cube T2WI sequences, while it was visualized in 99% of the 3D FSE Cube FLAIR sequences and 100% of the 3D DIR sequences. The 3D DIR sequence showed superior performance in identifying the AP.
Full article
(This article belongs to the Section Medical Imaging)
Open Access Article
Skin Lesion Classification Through Test Time Augmentation and Explainable Artificial Intelligence
by Loris Cino, Cosimo Distante, Alessandro Martella and Pier Luigi Mazzeo
J. Imaging 2025, 11(1), 15; https://doi.org/10.3390/jimaging11010015 - 9 Jan 2025
Abstract
Despite significant advancements in the automatic classification of skin lesions using artificial intelligence (AI) algorithms, skepticism among physicians persists. This reluctance is primarily due to the lack of transparency and explainability inherent in these models, which hinders their widespread acceptance in clinical settings. The primary objective of this study is to develop a highly accurate AI-based algorithm for skin lesion classification that also provides visual explanations to foster trust and confidence in these novel diagnostic tools. By improving transparency, the study seeks to contribute to earlier and more reliable diagnoses. Additionally, the research investigates the impact of Test Time Augmentation (TTA) on the performance of six Convolutional Neural Network (CNN) architectures, which include models from the EfficientNet, ResNet (Residual Network), and ResNeXt (an enhanced variant of ResNet) families. To improve the interpretability of the models’ decision-making processes, techniques such as t-distributed Stochastic Neighbor Embedding (t-SNE) and Gradient-weighted Class Activation Mapping (Grad-CAM) are employed. t-SNE is utilized to visualize the high-dimensional latent features of the CNNs in a two-dimensional space, providing insights into how the models group different skin lesion classes. Grad-CAM is used to generate heatmaps that highlight the regions of input images that influence the model’s predictions. Our findings reveal that Test Time Augmentation enhances the balanced multi-class accuracy of CNN models by up to 0.3%, achieving a balanced accuracy rate of 97.58% on the International Skin Imaging Collaboration (ISIC 2019) dataset. This performance is comparable to, or marginally better than, more complex approaches such as Vision Transformers (ViTs), demonstrating the efficacy of our methodology.
Full article
(This article belongs to the Special Issue Computer Vision and Deep Learning: Trends and Applications (2nd Edition))
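Test Time Augmentation as described here amounts to averaging soft predictions over augmented views of each image. A minimal sketch follows; the tiny CNN is a stand-in for the paper's EfficientNet/ResNet/ResNeXt models, and the particular set of augmentations is an assumption.

```python
import torch
import torch.nn as nn

# Stand-in classifier (e.g., 8 diagnostic classes as in ISIC 2019).
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(), nn.Linear(8, 8))

@torch.no_grad()
def tta_predict(batch):
    views = [batch,
             torch.flip(batch, dims=[-1]),            # horizontal flip
             torch.flip(batch, dims=[-2]),            # vertical flip
             torch.rot90(batch, k=1, dims=[-2, -1])]  # 90-degree rotation
    probs = torch.stack([model(v).softmax(dim=1) for v in views])
    return probs.mean(dim=0)                          # averaged probabilities

print(tta_predict(torch.randn(2, 3, 64, 64)).shape)   # (2, 8)
```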
Open Access Article
Face Boundary Formulation for Harmonic Models: Face Image Resembling
by Hung-Tsai Huang, Zi-Cai Li, Yimin Wei and Ching Yee Suen
J. Imaging 2025, 11(1), 14; https://doi.org/10.3390/jimaging11010014 - 8 Jan 2025
Abstract
This paper is devoted to numerical algorithms based on harmonic transformations with two goals: (1) face boundary formulation by blending techniques based on the known characteristic nodes and (2) some challenging examples of face resembling. The formulation of the face boundary is imperative for face recognition, transformation, and combination. Mapping between the source and target face boundaries with constituent pixels is explored by two approaches: cubic spline interpolation and ordinary differential equation (ODE) using Hermite interpolation. The ODE approach is more flexible and suitable for handling different boundary conditions, such as the clamped and simple support conditions. The intrinsic relations between the cubic spline and ODE methods are explored for different face boundaries, and their combinations are developed. Face combination and resembling are performed by employing blending curves for generating the face boundary, and face images are converted by numerical methods for harmonic models, such as the finite difference method (FDM), the finite element method (FEM), and the finite volume method (FVM), and by the splitting–integrating method (SIM) for the resampling of constituent pixels. For the second goal, the age effects of facial appearance are explored, and we discover that face images of different ages can be produced by integrating the photos and images of the old and the young. We then target the following challenging task: based on the photos and images of parents and their children, can we obtain an integrated image that resembles his/her current image as closely as possible? Amazing examples of face combination and resembling are reported in this paper to give a positive answer. Furthermore, an optimal combination of face images of parents and their children in the least-squares sense is introduced to greatly facilitate face resembling. Face combination and resembling may also be used for plastic surgery, finding missing children, and identifying criminals. The boundary and numerical techniques of face images in this paper can be used not only for pattern recognition but also for face morphing, morphing attack detection (MAD), and computer animation systems such as Sora, to greatly enhance further developments in AI.
Full article
(This article belongs to the Special Issue Techniques and Applications in Face Image Analysis)
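The cubic-spline boundary mapping with clamped end conditions can be sketched with SciPy; the landmark coordinates below are made up for illustration.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Known characteristic nodes on a face contour, interpolated to produce
# a dense, smooth boundary (coordinates are illustrative).
t_nodes = np.linspace(0.0, 1.0, 8)                 # parameter at landmarks
x_nodes = np.array([0, 2, 5, 7, 8, 6, 3, 0], float)
y_nodes = np.array([0, 3, 4, 3, 0, -3, -4, 0], float)

# Clamped end conditions (zero slope at both ends), one of the boundary
# conditions the ODE/Hermite formulation also handles.
sx = CubicSpline(t_nodes, x_nodes, bc_type="clamped")
sy = CubicSpline(t_nodes, y_nodes, bc_type="clamped")

t = np.linspace(0.0, 1.0, 200)
boundary = np.stack([sx(t), sy(t)], axis=1)        # dense boundary pixels
```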
Open Access Article
Combined Input Deep Learning Pipeline for Embryo Selection for In Vitro Fertilization Using Light Microscopic Images and Additional Features
by Krittapat Onthuam, Norrawee Charnpinyo, Kornrapee Suthicharoenpanich, Supphaset Engphaiboon, Punnarai Siricharoen, Ronnapee Chaichaowarat and Chanakarn Suebthawinkul
J. Imaging 2025, 11(1), 13; https://doi.org/10.3390/jimaging11010013 - 7 Jan 2025
Abstract
The current process of embryo selection in in vitro fertilization is based on morphological criteria; embryos are manually evaluated by embryologists under subjective assessment. In this study, a deep learning-based pipeline was developed to classify the viability of embryos using combined inputs, including microscopic images of embryos and additional features, such as patient age and developed pseudo-features, including a continuous interpretation of Istanbul grading scores obtained by predicting the embryo stage, inner cell mass, and trophectoderm. For viability prediction, convolution-based transfer learning models were employed, multiple pretrained models were compared, and image preprocessing techniques and hyperparameter optimization via Optuna were utilized. In addition, a custom weight was trained using a self-supervised learning framework known as the Simple Framework for Contrastive Learning of Visual Representations (SimCLR), in cooperation with images generated using generative adversarial networks (GANs). The best model was developed from the EfficientNet-B0 model using preprocessed images combined with pseudo-features generated using separate EfficientNet-B0 models, with hyperparameters tuned by Optuna. The designed model's F1 score, accuracy, sensitivity, and area under the curve (AUC) were 65.02%, 69.04%, 56.76%, and 66.98%, respectively. This study also showed an advantage in accuracy and a similar AUC when compared with a recent ensemble method.
Full article
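The Optuna step mentioned above follows the library's standard study/objective pattern; the search space and the dummy train_and_evaluate below are illustrative stand-ins, not the authors' setup.

```python
import optuna

def train_and_evaluate(params):
    # Placeholder for the real EfficientNet-B0 training run; returns a
    # dummy validation score so the sketch executes end to end.
    return 1.0 / (1.0 + abs(params["lr"] - 1e-3))

def objective(trial):
    params = {
        "lr": trial.suggest_float("lr", 1e-5, 1e-2, log=True),
        "dropout": trial.suggest_float("dropout", 0.0, 0.5),
        "batch_size": trial.suggest_categorical("batch_size", [16, 32, 64]),
    }
    return train_and_evaluate(params)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```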
Open Access Systematic Review
Hybrid Quality-Based Recommender Systems: A Systematic Literature Review
by Bihi Sabiri, Amal Khtira, Bouchra El Asri and Maryem Rhanoui
J. Imaging 2025, 11(1), 12; https://doi.org/10.3390/jimaging11010012 - 7 Jan 2025
Abstract
As technology develops, consumer behavior and how people search for what they want are constantly evolving. Online shopping has fundamentally changed the e-commerce industry. Although there are more products available than ever before, only a small portion of them are noticed; as a result, a few items gain disproportionate attention. Recommender systems can help to increase the visibility of lesser-known products. Major technology businesses have adopted these technologies as essential offerings, resulting in better user experiences and more sales. As a result, recommender systems have achieved considerable economic, social, and global advancements. As these systems are a major research focus, companies are improving their algorithms with hybrid techniques that combine multiple recommendation methodologies. This review provides a thorough examination of several hybrid models by combining ideas from the current research and emphasizing their practical uses, strengths, and limits. The review identifies particular problems and opportunities for designing and implementing hybrid recommender systems by focusing on the unique aspects of big data, notably volume, velocity, and variety. Adhering to the Cochrane Handbook and the principles developed by Kitchenham and Charters guarantees that the assessment process is transparent and high in quality. The aim is to conduct a systematic review of several recent developments in the area of hybrid recommender systems. The study covers the state of the art of the relevant research over the last four years across four knowledge bases (ACM, Google Scholar, Scopus, and Springer), as well as all Web of Science articles regardless of their date of publication. This study employs ASReview, an open-source application that uses active learning to help academics filter literature efficiently. This study aims to assess the progress achieved in the field of hybrid recommender systems to identify frequently used recommender approaches, explore the technical context, highlight gaps in the existing research, and position our future research in relation to the current studies.
Full article
(This article belongs to the Section Document Analysis and Processing)
Open Access Article
Application of Generative Artificial Intelligence Models for Accurate Prescription Label Identification and Information Retrieval for the Elderly in Northern East of Thailand
by Parinya Thetbanthad, Benjaporn Sathanarugsawait and Prasong Praneetpolgrang
J. Imaging 2025, 11(1), 11; https://doi.org/10.3390/jimaging11010011 - 6 Jan 2025
Abstract
This study introduces a novel AI-driven approach to support elderly patients in Thailand with medication management, focusing on accurate drug label interpretation. Two model architectures were explored: a Two-Stage Optical Character Recognition (OCR) and Large Language Model (LLM) pipeline combining EasyOCR with Qwen2-72b-instruct and a Uni-Stage Visual Question Answering (VQA) model using Qwen2-72b-VL. Both models operated in a zero-shot capacity, utilizing Retrieval-Augmented Generation (RAG) with DrugBank references to ensure contextual relevance and accuracy. Performance was evaluated on a dataset of 100 diverse prescription labels from Thai healthcare facilities, using RAG Assessment (RAGAs) metrics to assess Context Recall, Factual Correctness, Faithfulness, and Semantic Similarity. The Two-Stage model achieved high accuracy (94%) and strong RAGAs scores, particularly in Context Recall (0.88) and Semantic Similarity (0.91), making it well-suited for complex medication instructions. In contrast, the Uni-Stage model delivered faster response times, making it practical for high-volume environments such as pharmacies. This study demonstrates the potential of zero-shot AI models in addressing medication management challenges for the elderly by providing clear, accurate, and contextually relevant label interpretations. The findings underscore the adaptability of AI in healthcare, balancing accuracy and efficiency to meet various real-world needs.
Full article
(This article belongs to the Section AI in Imaging)
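The Two-Stage pipeline can be sketched as OCR followed by an LLM call. Below, EasyOCR is the real library the paper names, while ask_llm, the image path, and the prompt are hypothetical stand-ins for the paper's Qwen2-72b-instruct call with DrugBank RAG context.

```python
import easyocr

def ask_llm(prompt: str) -> str:
    # Hypothetical stand-in: the paper uses a zero-shot Qwen2-72b-instruct
    # call augmented with DrugBank passages (RAG); wire in any chat API here.
    return "LLM response placeholder"

reader = easyocr.Reader(["th", "en"])          # Thai + English label text
lines = reader.readtext("prescription_label.jpg", detail=0)  # path illustrative

prompt = ("Using the DrugBank context provided, explain this prescription "
          "label for an elderly patient:\n" + "\n".join(lines))
print(ask_llm(prompt))
```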
Open Access Article
A New Deep Learning-Based Method for Automated Identification of Thoracic Lymph Node Stations in Endobronchial Ultrasound (EBUS): A Proof-of-Concept Study
by Øyvind Ervik, Mia Rødde, Erlend Fagertun Hofstad, Ingrid Tveten, Thomas Langø, Håkon O. Leira, Tore Amundsen and Hanne Sorger
J. Imaging 2025, 11(1), 10; https://doi.org/10.3390/jimaging11010010 - 5 Jan 2025
Abstract
Endobronchial ultrasound-guided transbronchial needle aspiration (EBUS-TBNA) is a cornerstone in minimally invasive thoracic lymph node sampling. In lung cancer staging, precise assessment of lymph node position is crucial for clinical decision-making. This study aimed to demonstrate a new deep learning method to classify thoracic lymph nodes based on their anatomical location using EBUS images. Bronchoscopists labeled lymph node stations in real time according to the Mountain–Dressler nomenclature. EBUS images were then used to train and test a deep neural network (DNN) model, with intraoperative labels as ground truth. In total, 28,134 EBUS images were acquired from 56 patients. The model achieved an overall classification accuracy of 59.5 ± 5.2%. The highest precision, sensitivity, and F1 score were observed in station 4L: 77.6 ± 13.1%, 77.6 ± 15.4%, and 77.6 ± 15.4%, respectively. The lowest precision, sensitivity, and F1 score were observed in station 10L. The average processing and prediction time for a sequence of ten images was 0.65 ± 0.04 s, demonstrating the feasibility of real-time applications. In conclusion, the new DNN-based model could be used to classify lymph node stations from EBUS images. The method's performance was promising, with potential for clinical use.
Full article
(This article belongs to the Special Issue Advances in Medical Imaging and Machine Learning)
Open Access Article
Visual Impairment Spatial Awareness System for Indoor Navigation and Daily Activities
by Xinrui Yu and Jafar Saniie
J. Imaging 2025, 11(1), 9; https://doi.org/10.3390/jimaging11010009 - 4 Jan 2025
Abstract
The integration of artificial intelligence into daily life significantly enhances the autonomy and quality of life of visually impaired individuals. This paper introduces the Visual Impairment Spatial Awareness (VISA) system, designed to holistically assist visually impaired users in indoor activities through a structured, multi-level approach. At the foundational level, the system employs augmented reality (AR) markers for indoor positioning, neural networks for advanced object detection and tracking, and depth information for precise object localization. At the intermediate level, it integrates data from these technologies to aid in complex navigational tasks such as obstacle avoidance and pathfinding. The advanced level synthesizes these capabilities to enhance spatial awareness, enabling users to navigate complex environments and locate specific items. The VISA system exhibits an efficient human–machine interface (HMI), incorporating text-to-speech and speech-to-text technologies for natural and intuitive communication. Evaluations in simulated real-world environments demonstrate that the system allows users to interact naturally and with minimal effort. Our experimental results confirm that the VISA system efficiently assists visually impaired users in indoor navigation, object detection and localization, and label and text recognition, thereby significantly enhancing their spatial awareness and independence.
Full article
(This article belongs to the Special Issue Image and Video Processing for Blind and Visually Impaired)
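The AR-marker positioning step at the system's foundational level can be illustrated with OpenCV's ArUco module as a stand-in; the VISA system's actual marker set and pose pipeline are not specified in the abstract, and the image path and dictionary choice below are assumptions.

```python
import cv2

frame = cv2.imread("corridor.jpg")                     # path illustrative
assert frame is not None, "load a real camera frame here"

detector = cv2.aruco.ArucoDetector(
    cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50))
corners, ids, _ = detector.detectMarkers(frame)

if ids is not None:
    # Each detected marker ID maps to a known indoor location, giving the
    # user's approximate position without GPS.
    print("Visible markers:", ids.ravel().tolist())
```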
Open Access Article
Enhanced CATBraTS for Brain Tumour Semantic Segmentation
by Rim El Badaoui, Ester Bonmati Coll, Alexandra Psarrou, Hykoush A. Asaturyan and Barbara Villarini
J. Imaging 2025, 11(1), 8; https://doi.org/10.3390/jimaging11010008 - 3 Jan 2025
Abstract
The early and precise identification of a brain tumour is imperative for enhancing a patient’s life expectancy; this can be facilitated by quick and efficient tumour segmentation in medical imaging. Automatic brain tumour segmentation tools in computer vision have integrated powerful deep learning architectures to enable accurate tumour boundary delineation. Our study aims to demonstrate improved segmentation accuracy and higher statistical stability, using datasets obtained from diverse imaging acquisition parameters. This paper introduces a novel, fully automated model called Enhanced Channel Attention Transformer (E-CATBraTS) for Brain Tumour Semantic Segmentation; this model builds upon 3D CATBraTS, a vision transformer employed in magnetic resonance imaging (MRI) brain tumour segmentation tasks. E-CATBraTS integrates convolutional neural networks and Swin Transformer, incorporating channel shuffling and attention mechanisms to effectively segment brain tumours in multi-modal MRI. The model was evaluated on four datasets containing 3137 brain MRI scans. Through the adoption of E-CATBraTS, the accuracy of the results improved significantly on two datasets, outperforming the current state-of-the-art models by a mean DSC of 2.6% while maintaining a high accuracy that is comparable to the top-performing models on the other datasets. The results demonstrate that E-CATBraTS achieves both high segmentation accuracy and elevated generalisation abilities, ensuring the model is robust to dataset variation.
Full article
(This article belongs to the Special Issue Advances in Medical Imaging and Machine Learning)
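Channel shuffling, one of the mechanisms E-CATBraTS incorporates, is a standard operation popularised by ShuffleNet; the generic sketch below is that standard version, not necessarily the authors' exact module.

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    # Reshape channels into (groups, c // groups), transpose, and flatten
    # back so information mixes across channel groups.
    b, c, h, w = x.shape
    return (x.view(b, groups, c // groups, h, w)
             .transpose(1, 2)
             .reshape(b, c, h, w))

out = channel_shuffle(torch.randn(1, 8, 4, 4), groups=2)
```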
Open Access Article
Fitting Geometric Shapes to Fuzzy Point Cloud Data
by Vincent B. Verhoeven, Pasi Raumonen and Markku Åkerblom
J. Imaging 2025, 11(1), 7; https://doi.org/10.3390/jimaging11010007 - 3 Jan 2025
Abstract
This article describes procedures and considerations for reconstructing geometry from data and its uncertainty. The data are treated as a continuous fuzzy point cloud, instead of a discrete point cloud. Shape fitting is commonly performed by minimizing the discrete Euclidean distance; however, we propose the novel approach of using the expected Mahalanobis distance. The primary benefit is that it takes both the different magnitude and orientation of uncertainty of each data point into account. We illustrate the approach with laser scanning data of a cylinder and compare its performance with that of the conventional least squares method with and without random sample consensus (RANSAC). Our proposed method fits the geometry more accurately, albeit generally with greater uncertainty, and shows promise for geometry reconstruction with laser-scanned data.
Full article
(This article belongs to the Special Issue Geometry Reconstruction from Images (2nd Edition))
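The core idea of scoring a fit by Mahalanobis rather than Euclidean distance can be sketched as follows. For brevity this toy example fits a plane with per-point covariances, whereas the paper fits cylinders and uses the expected distance over the fuzzy points; all data here are synthetic.

```python
import numpy as np

def mahalanobis_cost(points, covs, plane_n, plane_d):
    # Sum of squared Mahalanobis distances from noisy points to their
    # closest points on the plane {x : n.x + d = 0}.
    n = plane_n / np.linalg.norm(plane_n)
    cost = 0.0
    for p, S in zip(points, covs):
        closest = p - (p @ n + plane_d) * n      # Euclidean foot point
        r = p - closest
        cost += r @ np.linalg.solve(S, r)        # weighted by uncertainty
    return cost

pts = np.random.randn(100, 3) * 0.05 + [0.0, 0.0, 1.0]
covs = np.tile(np.eye(3) * 0.05**2, (100, 1, 1))  # per-point covariance
print(mahalanobis_cost(pts, covs, np.array([0.0, 0.0, 1.0]), -1.0))
```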
Open Access Article
Exploring Multi-Pathology Brain Segmentation: From Volume-Based to Component-Based Deep Learning Analysis
by Ioannis Stathopoulos, Roman Stoklasa, Maria Anthi Kouri, Georgios Velonakis, Efstratios Karavasilis, Efstathios Efstathopoulos and Luigi Serio
J. Imaging 2025, 11(1), 6; https://doi.org/10.3390/jimaging11010006 - 31 Dec 2024
Abstract
Detection and segmentation of brain abnormalities using Magnetic Resonance Imaging (MRI) is an important task for which the role of AI algorithms as supporting tools is now well established at both the research and clinical-production level. While the performance of state-of-the-art models keeps increasing, in many cases reaching the accuracy levels of radiologists and other experts, much research is still needed on in-depth and transparent evaluation of correct results and failures, especially in relation to important aspects of radiological practice: abnormality position, intensity level, and volume. In this work, we focus on the analysis of the segmentation results of a pre-trained U-net model trained and validated on brain MRI examinations containing four different pathologies: Tumors, Strokes, Multiple Sclerosis (MS), and White Matter Hyperintensities (WMH). We present the segmentation results both for the whole abnormal volume and for each abnormal component inside the examinations of the validation set. In the first case, a Dice score coefficient (DSC), sensitivity, and precision of 0.76, 0.78, and 0.82, respectively, were found, while in the second case the model correctly detected and segmented (True Positives) 48.8% (DSC ≥ 0.5) of abnormal components, partially detected 27.1% (0.05 < DSC < 0.5), and missed (False Negatives) 24.1%, while producing 25.1% False Positives. Finally, we present an extended analysis of the True Positives, False Negatives, and False Positives versus their position inside the brain, their intensity at three MRI modalities (FLAIR, T2, and T1ce), and their volume.
Full article
(This article belongs to the Special Issue Image Segmentation Techniques: Current Status and Future Directions (2nd Edition))
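For reference, the component-wise bookkeeping above rests on the standard Dice similarity coefficient; a minimal sketch, with the abstract's thresholds noted in the comments:

```python
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    # DSC = 2|P ∩ G| / (|P| + |G|) on binary masks.
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom else 1.0

# Component thresholds from the abstract:
# DSC >= 0.5 -> True Positive, 0.05 < DSC < 0.5 -> partial, else missed.
```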
Topics
Topic in Future Internet, Information, J. Imaging, Mathematics, Symmetry
Research on Deep Neural Networks for Video Motion Recognition
Topic Editors: Hamad Naeem, Hong Su, Amjad Alsirhani, Muhammad Shoaib Bhutta. Deadline: 31 January 2025
Topic in Applied Sciences, Computers, Electronics, Information, J. Imaging
Visual Computing and Understanding: New Developments and Trends
Topic Editors: Wei Zhou, Guanghui Yue, Wenhan Yang. Deadline: 30 March 2025
Topic in Applied Sciences, Computation, Entropy, J. Imaging, Optics
Color Image Processing: Models and Methods (CIP: MM)
Topic Editors: Giuliana Ramella, Isabella Torcicollo. Deadline: 30 July 2025
Topic in Applied Sciences, Bioengineering, Diagnostics, J. Imaging, Signals
Signal Analysis and Biomedical Imaging for Precision Medicine
Topic Editors: Surbhi Bhatia Khan, Mo Saraee. Deadline: 31 August 2025
Special Issues
Special Issue in J. Imaging
Recent Trends in Computer Vision with Neural Networks
Guest Editor: Mario Molinara. Deadline: 30 January 2025
Special Issue in J. Imaging
Image Processing with Embedded Systems and FPGAs: AI Methods and IoT Applications
Guest Editors: Rui Policarpo Duarte, Paulo Flores. Deadline: 31 January 2025
Special Issue in J. Imaging
Imaging Technologies for Understanding Material Appearance
Guest Editors: Shoji Tominaga, Takahiko Horiuchi. Deadline: 31 January 2025
Special Issue in J. Imaging
Advances in Computational Imaging: Challenges and Future Directions
Guest Editors: Chenchu Xu, Rongjun Ge, Jinglin Zhang. Deadline: 31 January 2025