Fig. 1. Overview of the proposed network TESLA. In the first stage, we progressively reconstruct LR Tar. In the second stage, we use the high-quality content information that the pre-trained ContentNet disentangles from HR Ref to enrich the fine structural detail of Soft SR Tar through patch-wise contrastive learning. HR Ref and ContentNet are used only during training; inference requires only PR, the encoders (\(E_c\) and \(E_s\)), and the decoder (\(G\)) in SE.
Through-plane super-resolution (SR) in brain magnetic resonance imaging (MRI) is important for clinical assessment. Most existing multi-contrast SR models focus on enhancing in-plane image resolution, relying on functions already integrated into MRI scanners. These methods usually leverage proprietary fusion techniques to integrate multi-contrast images, which diminishes interpretability. Furthermore, their need for reference images at test time limits their applicability in clinical settings. To address these challenges, we propose TESLA, a TEst-time reference-free through-plane Super-resoLution network using disentAngled representation learning in multi-contrast MRI. Our method is built on the premise that multi-contrast images consist of shared content (structure) features and independent style (contrast) features. Thus, after progressively reconstructing the target image in the first stage, we decompose it into shared and independent components in the second, structure enhancement stage. In this stage, we employ a pre-trained ContentNet to disentangle high-quality structural information from the reference image, so that the shared components of the target image learn directly from those of the reference image through patch-wise contrastive learning during training. Consequently, the proposed model improves clinical applicability while preserving interpretability. Extensive experimental results demonstrate that the proposed model performs favorably against other state-of-the-art multi-contrast SR models, especially in restoring fine structural details in the through-plane direction.
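The patch-wise contrastive step described above can be illustrated with a minimal sketch. It assumes a generic InfoNCE-style objective in which the content feature of target patch i is pulled toward the content feature of the reference patch at the same location and pushed away from reference patches at other locations. The function name `patch_nce_loss`, the temperature value, and the plain-list feature format are all hypothetical choices for illustration; the loss actually used in TESLA may differ.

```python
import math


def _normalize(v):
    """Scale a feature vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def patch_nce_loss(target_feats, ref_feats, temperature=0.07):
    """Generic InfoNCE loss over patches (illustrative, not the paper's exact loss).

    target_feats[i] and ref_feats[i] are content features of the patch at the
    same spatial location in the target and reference images. Patch i of the
    reference is the positive for target patch i; all other reference patches
    act as negatives. The temperature of 0.07 is an assumed default.
    """
    if not target_feats or len(target_feats) != len(ref_feats):
        raise ValueError("need the same non-zero number of target and reference patches")

    queries = [_normalize(v) for v in target_feats]
    keys = [_normalize(v) for v in ref_feats]

    total = 0.0
    for i, q in enumerate(queries):
        # Cosine similarity to every reference patch, scaled by the temperature.
        logits = [_dot(q, k) / temperature for k in keys]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching patch as the correct class.
        total += log_denom - logits[i]
    return total / len(queries)


if __name__ == "__main__":
    target = [[1.0, 0.0], [0.0, 1.0]]
    aligned_ref = [[1.0, 0.0], [0.0, 1.0]]
    swapped_ref = [[0.0, 1.0], [1.0, 0.0]]
    # The loss should be smaller when patches at the same location agree.
    print(patch_nce_loss(target, aligned_ref) < patch_nce_loss(target, swapped_ref))
```

In this sketch the reference features enter only through the loss, which is consistent with the statement that HR Ref and ContentNet are needed during training but not at inference.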
Fig. 2. Qualitative comparison of through-plane SR results on the IXI and HCP datasets.
Fig. 3. Qualitative comparison of pseudo-vessel reconstruction results on an in-house dataset when the scaling factor is ×4.
Fig. 4. Qualitative results of the ablation study on the loss combinations in ContentNet, which decomposes high-quality structural information from HR Ref, on the IXI dataset. First row: MUNIT (L1 + Perceptual + Adversarial) + LSGAN. Second row: MUNIT (L1 + Perceptual + Adversarial) + PatchGAN. Third row: MUNIT (L1 + SSIM + Adversarial) + PatchGAN. \(c^{i}_{HR T1} (i = 1, 2, 3)\) indicates randomly selected content information decomposed from HR T1 under each condition.
Fig. 5. Qualitative results of the ablation study analyzing the optimal HR Ref on the IXI dataset when the scaling factor is ×4. \(x^{enhan}_{SRT2}\) denotes the final output of the proposed network under each condition. \(c^{i}_{SRT2} (i = 1, 2, 3)\) indicates randomly selected content information decomposed from SR T2 under each condition. \(x_{ref}\) denotes HR Ref.
@inproceedings{choi2025tesla,
author = {Yoonseok Choi and Sunyoung Jung and Mohammed A. Al-masni and Ming-Hsuan Yang and Dong-Hyun Kim},
title = {TESLA: Test-time Reference-free Through-plane Super-resolution for Multi-contrast Brain MRI},
booktitle = {International Conference on Medical Image Computing and Computer-Assisted Intervention},
year = {2025},
}