Search | arXiv e-print repository

Large-Language-Model Empowered Dose Volume Histogram Prediction for Intensity Modulated Radiotherapy

Authors: Zehao Dong, Yixin Chen, Hiram Gay, Yao Hao, Geoffrey D. Hugo, Pamela Samson, Tianyu Zhao

Abstract: Treatment planning is currently a patient specific, time-consuming, and resource demanding task in radiotherapy. Dose-volume histogram (DVH) prediction plays a critical role in automating this process. The geometric relationship between DVHs in radiotherapy plans and organs-at-risk (OAR) and planning target volume (PTV) has been well established. This study explores the potential of deep learning… ▽ More Treatment planning is currently a patient specific, time-consuming, and resource demanding task in radiotherapy. Dose-volume histogram (DVH) prediction plays a critical role in automating this process. The geometric relationship between DVHs in radiotherapy plans and organs-at-risk (OAR) and planning target volume (PTV) has been well established. This study explores the potential of deep learning models for predicting DVHs using images and subsequent human intervention facilitated by a large-language model (LLM) to enhance the planning quality. We propose a pipeline to convert unstructured images to a structured graph consisting of image-patch nodes and dose nodes. A novel Dose Graph Neural Network (DoseGNN) model is developed for predicting DVHs from the structured graph. The proposed DoseGNN is enhanced with the LLM to encode massive knowledge from prescriptions and interactive instructions from clinicians. In this study, we introduced an online human-AI collaboration (OHAC) system as a practical implementation of the concept proposed for the automation of intensity-modulated radiotherapy (IMRT) planning. In comparison to the widely-employed DL models used in radiotherapy, DoseGNN achieved mean square errors that were 80$\%$, 76$\%$ and 41.0$\%$ of those predicted by Swin U-Net Transformer, 3D U-Net CNN and vanilla MLP, respectively. Moreover, the LLM-empowered DoseGNN model facilitates seamless adjustment to treatment plans through interaction with clinicians using natural language. △ Less

Submitted 11 February, 2024; originally announced February 2024.

arXiv:2206.11501 [pdf, other]

A novel adversarial learning strategy for medical image classification

Authors: Zong Fan, Xiaohui Zhang, Jacob A. Gasienica, Jennifer Potts, Su Ruan, Wade Thorstad, Hiram Gay, Pengfei Song, Xiaowei Wang, Hua Li

Abstract: Deep learning (DL) techniques have been extensively utilized for medical image classification. Most DL-based classification networks are generally structured hierarchically and optimized through the minimization of a single loss function measured at the end of the networks. However, such a single loss design could potentially lead to optimization of one specific value of interest but fail to lever… ▽ More Deep learning (DL) techniques have been extensively utilized for medical image classification. Most DL-based classification networks are generally structured hierarchically and optimized through the minimization of a single loss function measured at the end of the networks. However, such a single loss design could potentially lead to optimization of one specific value of interest but fail to leverage informative features from intermediate layers that might benefit classification performance and reduce the risk of overfitting. Recently, auxiliary convolutional neural networks (AuxCNNs) have been employed on top of traditional classification networks to facilitate the training of intermediate layers to improve classification performance and robustness. In this study, we proposed an adversarial learning-based AuxCNN to support the training of deep neural networks for medical image classification. Two main innovations were adopted in our AuxCNN classification framework. First, the proposed AuxCNN architecture includes an image generator and an image discriminator for extracting more informative image features for medical image classification, motivated by the concept of generative adversarial network (GAN) and its impressive ability in approximating target data distribution. Second, a hybrid loss function is designed to guide the model training by incorporating different objectives of the classification network and AuxCNN to reduce overfitting. Comprehensive experimental studies demonstrated the superior classification performance of the proposed model. The effect of the network-related factors on classification performance was investigated. △ Less

Submitted 7 July, 2022; v1 submitted 23 June, 2022; originally announced June 2022.

arXiv:2107.04847 [pdf]

doi 10.1002/mp.15287

Weaving Attention U-net: A Novel Hybrid CNN and Attention-based Method for Organs-at-risk Segmentation in Head and Neck CT Images

Authors: Zhuangzhuang Zhang, Tianyu Zhao, Hiram Gay, Weixiong Zhang, Baozhou Sun

Abstract: In radiotherapy planning, manual contouring is labor-intensive and time-consuming. Accurate and robust automated segmentation models improve the efficiency and treatment outcome. We aim to develop a novel hybrid deep learning approach, combining convolutional neural networks (CNNs) and the self-attention mechanism, for rapid and accurate multi-organ segmentation on head and neck computed tomograph… ▽ More In radiotherapy planning, manual contouring is labor-intensive and time-consuming. Accurate and robust automated segmentation models improve the efficiency and treatment outcome. We aim to develop a novel hybrid deep learning approach, combining convolutional neural networks (CNNs) and the self-attention mechanism, for rapid and accurate multi-organ segmentation on head and neck computed tomography (CT) images. Head and neck CT images with manual contours of 115 patients were retrospectively collected and used. We set the training/validation/testing ratio to 81/9/25 and used the 10-fold cross-validation strategy to select the best model parameters. The proposed hybrid model segmented ten organs-at-risk (OARs) altogether for each case. The performance of the model was evaluated by three metrics, i.e., the Dice Similarity Coefficient (DSC), Hausdorff distance 95% (HD95), and mean surface distance (MSD). We also tested the performance of the model on the Head and Neck 2015 challenge dataset and compared it against several state-of-the-art automated segmentation algorithms. The proposed method generated contours that closely resemble the ground truth for ten OARs. Our results of the new Weaving Attention U-net demonstrate superior or similar performance on the segmentation of head and neck CT images. △ Less

Submitted 22 September, 2021; v1 submitted 10 July, 2021; originally announced July 2021.

Comments: 12 pages, 5 figures

arXiv:2009.09571 [pdf]

Semi-supervised Semantic Segmentation of Prostate and Organs-at-Risk on 3D Pelvic CT Images

Authors: Zhuangzhuang Zhang, Tianyu Zhao, Hiram Gay, Baozhou Sun, Weixiong Zhang

Abstract: Automated segmentation can assist radiotherapy treatment planning by saving manual contouring efforts and reducing intra-observer and inter-observer variations. The recent development of deep learning approaches has revoluted medical data processing, including semantic segmentation, by dramatically improving performance. However, training effective deep learning models usually require a large amou… ▽ More Automated segmentation can assist radiotherapy treatment planning by saving manual contouring efforts and reducing intra-observer and inter-observer variations. The recent development of deep learning approaches has revoluted medical data processing, including semantic segmentation, by dramatically improving performance. However, training effective deep learning models usually require a large amount of high-quality labeled data, which are often costly to collect. We developed a novel semi-supervised adversarial deep learning approach for 3D pelvic CT image semantic segmentation. Unlike supervised deep learning methods, the new approach can utilize both annotated and un-annotated data for training. It generates un-annotated synthetic data by a data augmentation scheme using generative adversarial networks (GANs). We applied the new approach to segmenting multiple organs in male pelvic CT images, where CT images without annotations and GAN-synthesized un-annotated images were used in semi-supervised learning. Experimental results, evaluated by three metrics (Dice similarity coefficient, average Hausdorff distance, and average surface Hausdorff distance), showed that the new method achieved either comparable performance with substantially fewer annotated images or better performance with the same amount of annotated data, outperforming the existing state-of-the-art methods. △ Less

Submitted 11 September, 2021; v1 submitted 20 September, 2020; originally announced September 2020.

Comments: 16 pages, 4 figures

arXiv:2008.04488 [pdf]

doi 10.1002/mp.14580

ARPM-net: A novel CNN-based adversarial method with Markov Random Field enhancement for prostate and organs at risk segmentation in pelvic CT images

Authors: Zhuangzhuang Zhang, Tianyu Zhao, Hiram Gay, Weixiong Zhang, Baozhou Sun

Abstract: Purpose: The research is to develop a novel CNN-based adversarial deep learning method to improve and expedite the multi-organ semantic segmentation of CT images, and to generate accurate contours on pelvic CT images. Methods: Planning CT and structure datasets for 120 patients with intact prostate cancer were retrospectively selected and divided for 10-fold cross-validation. The proposed adversar… ▽ More Purpose: The research is to develop a novel CNN-based adversarial deep learning method to improve and expedite the multi-organ semantic segmentation of CT images, and to generate accurate contours on pelvic CT images. Methods: Planning CT and structure datasets for 120 patients with intact prostate cancer were retrospectively selected and divided for 10-fold cross-validation. The proposed adversarial multi-residual multi-scale pooling Markov Random Field (MRF) enhanced network (ARPM-net) implements an adversarial training scheme. A segmentation network and a discriminator network were trained jointly, and only the segmentation network was used for prediction. The segmentation network integrates a newly designed MRF block into a variation of multi-residual U-net. The discriminator takes the product of the original CT and the prediction/ground-truth as input and classifies the input into fake/real. The segmentation network and discriminator network can be trained jointly as a whole, or the discriminator can be used for fine-tuning after the segmentation network is coarsely trained. Multi-scale pooling layers were introduced to preserve spatial resolution during pooling using less memory compared to atrous convolution layers. An adaptive loss function was proposed to enhance the training on small or low contrast organs. The accuracy of modeled contours was measured with the Dice similarity coefficient (DSC), Average Hausdorff Distance (AHD), Average Surface Hausdorff Distance (ASHD), and relative Volume Difference (VD) using clinical contours as references to the ground-truth. The proposed ARPM-net method was compared to several stateof-the-art deep learning methods. △ Less

Submitted 17 September, 2020; v1 submitted 10 August, 2020; originally announced August 2020.

Comments: 21 pages, 8 figures; accepted as a journal article at Medical Physics; abstract presented at AAPM 2020

MSC Class: 68T07(Primary); 68T45(Secondary)

Showing 1–5 of 5 results for author: Gay, H