Oral Abstract Session
SCMR 22nd Annual Scientific Sessions
Fully automated segmentation of cardiac anatomy enables the automated extraction of cardiac biomarkers such as ejection fraction, wall motion and thickness, and myocardial mass from cine MR imaging. Deep learning has recently been shown to offer compelling solutions to this task. Yet such progress requires a significant amount of labeled data to capture imaging and anatomical variability. We present a semi-supervised method that performs on par with fully supervised ones using as little as 6% of the annotated data.
To exploit large numbers of unlabeled images, our method maps an input MR image into segmentation masks (one mask per anatomical label, e.g. myocardium, left ventricle) and also maps the masks back to the image using a reconstruction metric. Our model (Fig. 1) decomposes an input into masks and a vector that contains the remaining information needed for reconstruction. A reconstructor combines the masks and the vector to recover the input image. We also employ discriminator losses (a non-linear form of estimating spatial statistics) to (a) improve reconstruction performance and (b) take advantage of a collection of segmentation masks (not related to the input data) to obtain a data-driven shape prior.
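The decompose-then-reconstruct idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the model of Fig. 1: the thresholding `toy_decompose`, the two-value `toy_reconstruct`, the mean absolute error used as the reconstruction metric, the Dice loss used as the supervised term, and the weights `w_rec` and `w_sup` are all hypothetical, and the discriminator losses are omitted. The point it shows is only that an unlabeled image still contributes a training signal through reconstruction, while a labeled image adds a segmentation term.

```python
from typing import List, Optional, Tuple

Image = List[float]        # flattened pixel intensities, assumed to lie in [0, 1]
Masks = List[List[float]]  # one flattened mask per anatomical label
Code = List[float]         # vector holding what the masks alone cannot encode


def mean_abs_error(a: List[float], b: List[float]) -> float:
    """Reconstruction metric (an assumed choice): mean absolute difference."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def dice_loss(pred: List[float], target: List[float], eps: float = 1e-6) -> float:
    """Supervised term (an assumed choice): 1 - Dice overlap of two masks."""
    overlap = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * overlap + eps) / (total + eps)


def toy_decompose(image: Image) -> Tuple[Masks, Code]:
    """Hypothetical stand-in for the learned decomposer: threshold at 0.5."""
    mask = [1.0 if px > 0.5 else 0.0 for px in image]
    inside = [px for px, m in zip(image, mask) if m == 1.0]
    outside = [px for px, m in zip(image, mask) if m == 0.0]
    mean = lambda values: sum(values) / len(values) if values else 0.0
    # The code vector stores the intensity information the binary mask lacks.
    return [mask], [mean(inside), mean(outside)]


def toy_reconstruct(masks: Masks, code: Code) -> Image:
    """Hypothetical stand-in for the reconstructor: paint mean intensities."""
    return [code[0] * m + code[1] * (1.0 - m) for m in masks[0]]


def training_loss(image: Image, true_masks: Optional[Masks] = None,
                  w_rec: float = 1.0, w_sup: float = 1.0) -> float:
    """Reconstruction loss for every image, plus Dice loss when labels exist."""
    masks, code = toy_decompose(image)
    loss = w_rec * mean_abs_error(toy_reconstruct(masks, code), image)
    if true_masks is not None:  # image-mask pair: add the supervised term
        seg = sum(dice_loss(p, t) for p, t in zip(masks, true_masks)) / len(masks)
        loss += w_sup * seg
    return loss


image = [0.9, 0.8, 0.1, 0.2]
unlabeled_loss = training_loss(image)                        # reconstruction only
labeled_loss = training_loss(image, [[1.0, 0.0, 0.0, 0.0]])  # adds the Dice term
```

In this toy setting the labeled call returns a larger loss than the unlabeled one because the thresholded mask disagrees with the supplied annotation; in a trained model both terms would be minimized jointly over the network parameters.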
We train and evaluate using data from the ACDC challenge (see  for parameters) and a set of cine MR images of 28 healthy volunteers collected with a 3T Siemens scanner, each volunteer having a volume of 30 frames (spatial resolution: 1.406 mm²; slice thickness: 10 mm). We compare several approaches: a fully supervised U-Net; a U-Net with a discriminator (the classic approach to semi-supervised learning); and the proposed semi-supervised method. In all experiments we use 1200 unlabeled MR images, 1200 available masks (for the discriminator), and a variable number of image-mask pairs. For simplicity of presentation we focus on the myocardium. As Figures 2 and 3 show, both qualitatively and quantitatively, the U-Net did not obtain satisfactory results when it lacked adequate supervision. Adding a discriminator improves performance, but results remain unreliable at low percentages of supervision. In contrast, our approach gives consistent results even when trained with few labeled examples.
As deep learning-based image analysis translates to the clinic, the need to obtain strong performance without significant expert-provided supervision will increase, particularly because models must adapt to population differences. Here we show that satisfactory results can be obtained with minimal supervision on a common and popular cardiac analysis task. Yet we expect the largest impact to be in tasks where supervision is difficult to obtain, such as atrial segmentation.