The Sim-to-Real Gap in MRS Quantification: A Systematic Deep Learning Validation for GABA

Zien Ma, S. M. Shermer, Oktay Karakuş, Frank C. Langbein. The Sim-to-Real Gap in MRS Quantification: A Systematic Deep Learning Validation for GABA. Preprint, February 2026. arXiv:2602.20289

Systematic Bayesian model selection of deep learning models on realistic simulations
Phantom ground-truth validation across solution and gel series at 3 T
Non-augmented models show a sim-to-real gap and are comparable to LCModel
Linewidth-augmented DL outperforms LCModel for GABA and Glu on phantoms; gap remains

Magnetic resonance spectroscopy (MRS) is used to quantify metabolites in vivo and estimate biomarkers for conditions ranging from neurological disorders to cancers. Quantifying low-concentration metabolites such as GABA (γ-aminobutyric acid) is challenging due to low signal-to-noise ratio (SNR) and spectral overlap. We investigate and validate deep learning for quantifying complex, low-SNR, overlapping signals from MEGA-PRESS spectra. We devise a convolutional neural network (CNN) and a Y-shaped autoencoder (YAE), and select the best models via Bayesian optimisation on 10,000 simulated spectra from slice-profile-aware MEGA-PRESS simulations. The selected models are trained on 100,000 simulated spectra. We validate their performance on 144 spectra from 112 experimental phantoms containing five metabolites of interest (GABA, Glu, Gln, NAA, Cr) with known ground-truth concentrations across solution and gel series acquired at 3 T under varied bandwidths and implementations. The models are further assessed against the widely used LCModel quantification tool. On simulations, both models achieve near-perfect agreement (small MAEs; regression slopes ≈ 1.00, R² ≈ 1.00). On experimental phantom data, errors increase substantially for the non-augmented models, but modelling variable linewidths in the training data significantly reduces this gap. The best augmented deep learning models achieve a mean MAE for GABA over all phantom spectra of 0.151 (YAE) and 0.160 (FCNN) in max-normalised relative concentrations, outperforming the conventional baseline LCModel (0.220). A sim-to-real gap remains, but physics-informed data augmentation substantially reduces it. Phantom ground truth is needed to judge whether a method will perform reliably on real data.
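
To illustrate what modelling variable linewidths in simulated training data can look like, the following is a minimal Python sketch and not the paper's implementation. It assumes Lorentzian line shapes, for which multiplying the time-domain signal by exp(-π·Δν·t) increases the full width at half maximum of every peak by Δν Hz. The function names, the 0 to 5 Hz sampling range, the uniform distribution, and the dwell time are all hypothetical choices made for illustration; in practice this would typically operate on arrays rather than Python lists.

```python
import cmath
import math
import random


def broaden_fid(fid, dwell_time, extra_linewidth_hz):
    """Add Lorentzian broadening (extra FWHM in Hz) to a time-domain signal.

    Sample n is multiplied by exp(-pi * extra_linewidth_hz * n * dwell_time),
    which widens every Lorentzian peak in the spectrum by extra_linewidth_hz.
    """
    return [
        s * math.exp(-math.pi * extra_linewidth_hz * n * dwell_time)
        for n, s in enumerate(fid)
    ]


def augment_linewidth(fid, dwell_time, rng, max_extra_hz=5.0):
    """Draw a random extra linewidth (hypothetical uniform range) and apply it."""
    extra = rng.uniform(0.0, max_extra_hz)
    return broaden_fid(fid, dwell_time, extra), extra


def lorentzian_fid(freq_hz, linewidth_hz, n_points, dwell_time):
    """Synthetic single-peak signal with the given frequency and FWHM."""
    return [
        cmath.exp((2j * math.pi * freq_hz - math.pi * linewidth_hz) * n * dwell_time)
        for n in range(n_points)
    ]


# Usage: a 2 Hz wide peak broadened by 3 Hz should match a 5 Hz wide peak.
dwell = 5e-4  # seconds per sample (assumed), i.e. 2 kHz spectral bandwidth
narrow = lorentzian_fid(50.0, 2.0, 2048, dwell)
broadened = broaden_fid(narrow, dwell, 3.0)
reference = lorentzian_fid(50.0, 5.0, 2048, dwell)
max_diff = max(abs(a - b) for a, b in zip(broadened, reference))

# Randomised variant, as one might apply per training example.
rng = random.Random(0)
augmented, extra_hz = augment_linewidth(narrow, dwell, rng)
```

Applying the broadening in the time domain keeps the operation a simple element-wise multiplication and leaves peak positions and areas unchanged, so the ground-truth concentrations attached to a simulated spectrum remain valid after augmentation.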

Cite this page as 'Frank C Langbein, "The Sim-to-Real Gap in MRS Quantification: A Systematic Deep Learning Validation for GABA," Ex Tenebris Scientia, 23rd February 2026, https://langbein.org/mrsnetae/ [accessed 1st April 2026]'.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Frank C Langbein
Ex Tenebris Scientia
