Project1 Report
Contents
1 Introduction
  1.1 CT Denoising Problem
  1.2 Motivation and Project Goal
3 Results
  3.1 CycleGAN
    3.1.1 Loss Curve Analysis
    3.1.2 Qualitative Example
    3.1.3 Quantitative Evaluation
  3.2 Noise2Score
    3.2.1 Loss Curve Analysis
    3.2.2 Qualitative Example
    3.2.3 Quantitative Evaluation
4 Discussion
AI618 Project 1 Hoyeol Sohn (20244279)
1 Introduction
1.1 CT Denoising Problem
Computed Tomography (CT) imaging provides crucial cross-sectional views of the human body for
medical diagnosis. However, CT utilizes ionizing radiation, posing potential health risks. Reducing the
radiation dose is desirable but inherently increases image noise and reduces contrast. Low-Dose CT
(LDCT) denoising aims to computationally remove this noise, recovering an image quality comparable
to that of a full-dose CT (FDCT) scan.
1.2 Motivation and Project Goal
Since paired LDCT/FDCT scans are rarely available in practice, this project implements and compares two denoising methods that require no paired supervision:
1. CycleGAN: An unsupervised method using unpaired LDCT (Quarter-dose) and FDCT images [1].
2. Noise2Score: A self-supervised method using only FDCT images to learn to remove Poisson noise [2].
We implement these methods based on course skeletons [3] using the AAPM dataset [4] and evaluate
their performance.
• Adversarial Loss: Least-squares (LSGAN) adversarial objectives [6] for the mapping GX→Y and its discriminator DY (and symmetrically for GY→X and DX):
Ladv(GX→Y, DY) = Ey∼PY[(DY(y) − 1)²] + Ex∼PX[(DY(GX→Y(x)))²] (1)
Ladv(GX→Y) = Ex∼PX[(DY(GX→Y(x)) − 1)²] (2)
• Cycle Consistency Loss: Encourages GY→X(GX→Y(x)) ≈ x and GX→Y(GY→X(y)) ≈ y. Implemented using nn.L1Loss(). Weighted by λcycle = 10.
Lcyc(GX→Y, GY→X) = Ex∼PX[∥GY→X(GX→Y(x)) − x∥1] + Ey∼PY[∥GX→Y(GY→X(y)) − y∥1] (3)
• Identity Loss: Encourages GX→Y(y) ≈ y and GY→X(x) ≈ x. Implemented using nn.L1Loss(). Weighted by λiden = 5.
Lidt(GX→Y, GY→X) = Ey∼PY[∥GX→Y(y) − y∥1] + Ex∼PX[∥GY→X(x) − x∥1] (4)
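For concreteness, the cycle-consistency and identity terms above can be computed with nn.L1Loss() as in the following sketch; the generator names (G_QF, G_FQ) are illustrative stand-ins, not the project's actual networks.

```python
import torch
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_and_identity_losses(G_QF, G_FQ, x_Q, x_F, lam_cyc=10.0, lam_idt=5.0):
    """Weighted cycle-consistency (Eq. 3) and identity (Eq. 4) losses.

    G_QF maps quarter-dose -> full-dose; G_FQ maps full-dose -> quarter-dose.
    """
    # Cycle consistency: translating and translating back should recover the input.
    L_cyc = l1(G_FQ(G_QF(x_Q)), x_Q) + l1(G_QF(G_FQ(x_F)), x_F)
    # Identity: a generator fed an image already in its target domain
    # should behave approximately as the identity map.
    L_idt = l1(G_QF(x_F), x_F) + l1(G_FQ(x_Q), x_Q)
    return lam_cyc * L_cyc, lam_idt * L_idt
```

As a sanity check, passing nn.Identity() for both generators makes both losses exactly zero, since every reconstruction then equals its input.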
Training Loop: Implemented by filling the placeholders in the skeleton notebook. Key steps per iteration included:
1. Sampling a batch of unpaired quarter-dose (xQ) and full-dose (xF) images.
2. Calculating generator outputs (xFQ, xQF) and reconstructed outputs (xFQF, xQFQ).
3. Calculating the LSGAN adversarial losses for both generators.
4. Calculating the cycle-consistency and identity losses and combining all terms into the total generator loss LG.
5. Backpropagating the total generator loss LG and stepping the generator optimizer (G_optim).
6. Calculating discriminator losses LDF, LDQ using real images and fake images retrieved from a replay buffer (fake_A_buffer, fake_B_buffer), ensuring .detach() is called on fake images fed to the discriminators.
7. Backpropagating the discriminator losses (LDF and LDQ separately) and stepping the discriminator optimizer (D_optim).
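The per-iteration update order described above can be sketched as follows. This is a minimal illustration rather than the project's implementation: the tiny convolutional stand-in networks are hypothetical, and the replay buffer is omitted (fakes are simply detached).

```python
import torch
import torch.nn as nn

# Tiny stand-in networks for illustration; the project's actual generators
# are larger ResNet-based models (ngf = 32, 6 blocks).
def make_G():
    return nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(8, 1, 3, padding=1))

def make_D():
    return nn.Sequential(nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
                         nn.Conv2d(8, 1, 3, padding=1))

G_QF, G_FQ = make_G(), make_G()  # quarter->full and full->quarter generators
D_F, D_Q = make_D(), make_D()    # one discriminator per target domain
mse, l1 = nn.MSELoss(), nn.L1Loss()

G_optim = torch.optim.Adam(list(G_QF.parameters()) + list(G_FQ.parameters()),
                           lr=2e-4, betas=(0.5, 0.999))
D_optim = torch.optim.Adam(list(D_F.parameters()) + list(D_Q.parameters()),
                           lr=2e-4, betas=(0.5, 0.999))

def train_iteration(x_Q, x_F, lam_cyc=10.0, lam_idt=5.0):
    # Translations, cycle reconstructions, and the generator losses.
    x_QF, x_FQ = G_QF(x_Q), G_FQ(x_F)
    x_QFQ, x_FQF = G_FQ(x_QF), G_QF(x_FQ)
    pred_F, pred_Q = D_F(x_QF), D_Q(x_FQ)
    # LSGAN generator term: push fake predictions toward the "real" label 1.
    L_adv = mse(pred_F, torch.ones_like(pred_F)) + mse(pred_Q, torch.ones_like(pred_Q))
    L_cyc = l1(x_QFQ, x_Q) + l1(x_FQF, x_F)
    L_idt = l1(G_QF(x_F), x_F) + l1(G_FQ(x_Q), x_Q)
    L_G = L_adv + lam_cyc * L_cyc + lam_idt * L_idt

    # Backpropagate the total generator loss and step the generator optimizer.
    G_optim.zero_grad()
    L_G.backward()
    G_optim.step()

    # Discriminator losses on real vs. detached fake images; .detach() stops
    # gradients from flowing back into the generators.
    L_D = 0.0
    for D, real, fake in ((D_F, x_F, x_QF), (D_Q, x_Q, x_FQ)):
        p_real, p_fake = D(real), D(fake.detach())
        L_D = L_D + mse(p_real, torch.ones_like(p_real)) + mse(p_fake, torch.zeros_like(p_fake))
    D_optim.zero_grad()
    L_D.backward()
    D_optim.step()
    return L_G.item(), L_D.item()
```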
Optimizer/Scheduler: Adam optimizer was used (β1 = 0.5, β2 = 0.999, lr = 2 × 10−4 ) with a
WarmupLinearLR scheduler (1000 warmup steps, followed by linear decay).
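The warmup-then-linear-decay schedule can be reproduced with PyTorch's LambdaLR, as in this sketch; the total step count (20000) and the single dummy parameter are hypothetical values for illustration, since the report does not state them.

```python
import torch

def warmup_linear(step, warmup=1000, total=20000):
    # Linearly ramp the learning-rate factor from 0 to 1 over `warmup` steps,
    # then decay it linearly back to 0 at `total` steps.
    if step < warmup:
        return step / warmup
    return max(0.0, (total - step) / (total - warmup))

params = [torch.zeros(1, requires_grad=True)]
optim = torch.optim.Adam(params, lr=2e-4, betas=(0.5, 0.999))
sched = torch.optim.lr_scheduler.LambdaLR(optim, lr_lambda=warmup_linear)
```

Calling sched.step() after each optimizer step advances the schedule: the factor reaches 0.5 at step 500, peaks at 1.0 at step 1000, and decays to 0 at the final step.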
Hyperparameters: Batch size = 16, Epochs = 100, λcycle = 10, λiden = 5.
The network takes a noisy image y_noisy and outputs the score estimate ℓ̂′(y_noisy) ≈ RΘ(y_noisy).
• Training Loop: Implemented within the skeleton notebook.
• Hyperparameters: Batch size = 32, Epochs = 100, η = 0.01 (for Poisson inference).
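The AR-DAE training step used by Noise2Score [2] can be sketched as follows: the network RΘ is trained so that σa·RΘ(y + σa·u) ≈ −u for Gaussian perturbations u, which makes RΘ approximate the score of the noisy-image distribution. The network architecture and the σa value below are illustrative assumptions, not the project's settings.

```python
import torch
import torch.nn as nn

# Illustrative score network; the project's actual AR-DAE model differs.
R = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(8, 1, 3, padding=1))
optim = torch.optim.Adam(R.parameters(), lr=2e-4)

def ardae_step(y_noisy, sigma_a=0.1):
    """One AR-DAE iteration: minimize ||u + sigma_a * R(y + sigma_a*u)||^2,
    so that R learns the score of the noisy-image distribution."""
    u = torch.randn_like(y_noisy)
    loss = ((u + sigma_a * R(y_noisy + sigma_a * u)) ** 2).mean()
    optim.zero_grad()
    loss.backward()
    optim.step()
    return loss.item()
```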
2.3.3 Inference
The inference step involves applying Tweedie’s formula for Poisson noise (η = 0.01) using the trained
network:
# Inside inference loop:
# y = x_F_noisy (Poisson noise added with eta=0.01)
l_hat = ARDAE(y)  # Get score estimate
# Apply Tweedie's formula for Poisson:
x_F_recon = (y + eta / 2.0) * torch.exp(eta * l_hat)
The output x_F_recon is then post-processed (offset removed, scaled back to HU).
3 Results
3.1 CycleGAN
3.1.1 Loss Curve Analysis
Figure 1 shows the adversarial losses. The generator losses (G_adv,F, G_adv,Q) and discriminator losses (D_adv,F, D_adv,Q) exhibit some instability in the initial epochs (approx. 0–20), which is common in GAN training. Subsequently, they converge and stabilize around the theoretical equilibrium of 0.25 for LSGAN [6]. The losses remain stable with minimal oscillations in the later epochs, indicating successful convergence of the adversarial training process. The final average losses (D_adv ≈ 0.248, G_adv ≈ 0.255) are very close to the equilibrium.
Figure 2 shows the cycle-consistency and identity losses. Both types of losses decrease rapidly in the early epochs and continue to decrease steadily throughout training, converging to low values (G_cycle ≈ 0.007–0.011, G_iden ≈ 0.006–0.009). This demonstrates that the generators are effectively learning the structural mapping constraints imposed by these losses.
Figure 1: CycleGAN Adversarial Losses. Showing G_adv,F, G_adv,Q, D_adv,F, D_adv,Q vs. epochs.
7
AI618 Project 1 Hoyeol Sohn (20244279)
Figure 2: CycleGAN Cycle and Identity Losses. Showing G_cycle,F, G_cycle,Q, G_iden,F, G_iden,Q vs. epochs.
3.1.2 Qualitative Example
Figure 3: CycleGAN Qualitative Results Example. Left: Ground Truth Full-dose CT. Middle: Input Quarter-dose CT (PSNR 31.64 / SSIM 0.859). Right: CycleGAN Output (PSNR 35.06 / SSIM 0.908).
3.1.3 Quantitative Evaluation
While demonstrating clear improvement and stable training convergence, the CycleGAN implementation under these settings did not meet the stringent quantitative targets for full credit. The performance gap might be attributed to the limited network capacity (ngf = 32, 6 blocks) chosen for faster training or the inherent complexity of unpaired translation.
3.2 Noise2Score
3.2.1 Loss Curve Analysis
Figure 4 displays the Noise2Score training loss (L_AR-DAE). The loss shows a characteristic sharp decrease in the initial epochs, followed by stable convergence towards a low value (≈ 0.3). This behavior indicates that the AR-DAE network effectively learned to estimate the score function required by the training objective.
Figure 4: Noise2Score Training Loss (loss_score). Showing loss vs. epochs.
3.2.2 Qualitative Example
Figure 5: Noise2Score Qualitative Results Example. Left: Ground Truth (Clean Full-dose). Middle: Input (Full-dose + Poisson Noise, η = 0.01, PSNR 30.19 / SSIM 0.864). Right: Noise2Score Output (PSNR 33.14 / SSIM 0.918).
3.2.3 Quantitative Evaluation
Table 2: Noise2Score Quantitative Evaluation (Average over Test Set, Poisson Noise η = 0.01)
The Noise2Score implementation successfully met both quantitative criteria for full credit, demonstrating its effectiveness for self-supervised denoising when the noise model is known.
4 Discussion
This project implemented and evaluated CycleGAN and Noise2Score for CT denoising under unsupervised and self-supervised settings, respectively.
The CycleGAN implementation successfully learned the domain translation task, as evidenced by
the stable convergence of adversarial and cycle-consistency losses and clear visual noise reduction.
However, the final quantitative performance (2.85 dB PSNR gain, 0.927 SSIM) fell short of the high targets set by the evaluation guideline (3.5 dB, 0.95 SSIM). This suggests that while CycleGAN provides a
viable framework for unsupervised denoising, achieving top-tier quantitative results might require larger
network capacity, longer training times, or potentially more advanced cycle-consistency formulations for
this specific medical imaging task.
The Noise2Score implementation effectively addressed the self-supervised Poisson denoising task.
The AR-DAE network successfully learned the score function from clean data, and applying Tweedie’s
formula for Poisson noise yielded significant improvements, meeting the quantitative targets (3.06 dB
PSNR gain, 0.853 SSIM). This confirms the theoretical soundness and practical utility of the Noise2Score
framework for cases where the noise distribution is known. The results highlight the power of score-
based methods in self-supervised learning.
In conclusion, both methods offer valuable pathways for CT denoising without paired data. CycleGAN addresses the direct domain translation problem but may face challenges in reaching the highest
quantitative benchmarks without extensive tuning. Noise2Score excels at removing specific noise types
by learning from clean data but relies on knowing the noise characteristics for inference. Noise2Score
met the performance criteria for this project’s specific task.
References
[1] Jun-Yan Zhu et al. “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”. In: arXiv preprint arXiv:1703.10593 (2017). arXiv: 1703.10593 [cs.CV].
[2] Kwanyoung Kim and Jong Chul Ye. “Noise2Score: Tweedie’s Approach to Self-Supervised Image Denoising without Clean Images”. In: arXiv preprint arXiv:2106.07009 (2021). arXiv: 2106.07009 [eess.IV].
[3] AI618 Teaching Staff. [AI618]project1 evaluation guideline.pdf. Course Material. 2025.
[4] American Association of Physicists in Medicine. Low Dose CT Grand Challenge. https://www.aapm.org/GrandChallenge/LowDoseCT/. Accessed: May 20, 2025.
[5] Justin Johnson, Alexandre Alahi, and Li Fei-Fei. “Perceptual losses for real-time style transfer and
super-resolution”. In: European conference on computer vision. Springer. 2016, pp. 694–711.
[6] Xudong Mao et al. “Least squares generative adversarial networks”. In: Proceedings of the IEEE
international conference on computer vision. 2017, pp. 2794–2802.