A 3D fully convolutional neural network and a random walker to segment the esophagus in CT

Fechter, Tobias and Adebahr, Sonja and Baltas, Dimos and Ayed, Ismail Ben and Desrosiers, Christian and Dolz, Jose

arXiv 2017

Abstract: Purpose: Precise delineation of organs at risk is a crucial task in radiotherapy treatment planning when the aim is to deliver a high dose to the tumour while sparing healthy tissue. In recent years, algorithms have shown high performance and the potential to automate this task for many organs. However, for some organs precise delineation remains challenging, even for human experts. One of them is the esophagus, which has a variable shape and poor contrast to neighboring tissue. To tackle these issues, we propose a random walk approach driven by a 3D fully convolutional neural network (CNN) to automatically segment the esophagus on CT images. Methods: First, a soft probability map is generated by the CNN. Then an active contour model (ACM) is fitted to the CNN soft probability map to obtain a first estimate of the esophagus location. The outputs of the CNN and ACM are then used, together with a probability model based on CT Hounsfield unit (HU) values, to drive the random walker. Training and evaluation were done on two different datasets, with a total of 50 CTs with clinically used, peer-reviewed esophagus contours. Results were assessed in terms of spatial overlap and shape similarity. Results: The esophagus contours generated by the proposed algorithm showed a mean Dice coefficient of 0.76 ± 0.11, an average symmetric square distance of 1.36 ± 0.90 mm and an average Hausdorff distance of 11.68 ± 6.80 compared to the reference contours. These figures indicate very good agreement with the reference contours and an increase in accuracy compared to other methods. Conclusion: We show that by employing a CNN, accurate estimates of the esophagus location can be obtained and then refined by a post-processing random walk step that takes pixel intensities and neighborhood relationships into account.
One of the main advantages compared to previous methods is that our network performs convolutions in 3D, fully exploiting the 3D spatial context and performing an efficient and precise volume-wise prediction. The whole segmentation process is fully automatic and yields esophagus delineations in very good agreement with the gold standard used, showing that it can compete with previously published methods. The results demonstrate the feasibility of using a CNN to drive a random walker for esophagus segmentation.
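
The abstract reports spatial overlap as a Dice coefficient and shape agreement as a Hausdorff distance. As an illustration only, the following minimal NumPy sketch shows how these two measures are commonly computed for a pair of binary 3D masks. It is not the authors' evaluation code: the function names, the `spacing` parameter, and the toy masks are assumptions made for this example, and the Hausdorff distance here is taken over all foreground voxels rather than over extracted surfaces, which is a simplification.

```python
import numpy as np


def dice_coefficient(pred, ref):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    total = pred.sum() + ref.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return 2.0 * np.logical_and(pred, ref).sum() / total


def hausdorff_distance(pred, ref, spacing=(1.0, 1.0, 1.0)):
    """Symmetric Hausdorff distance between the foreground voxels of two masks.

    `spacing` is the voxel size along each array axis (hypothetical parameter),
    so the result is in the same physical unit as the spacing.
    """
    scale = np.asarray(spacing, dtype=float)
    p = np.argwhere(np.asarray(pred, dtype=bool)) * scale
    r = np.argwhere(np.asarray(ref, dtype=bool)) * scale
    if len(p) == 0 or len(r) == 0:
        raise ValueError("both masks must contain foreground voxels")
    # Full pairwise distance matrix: simple, but memory-heavy for large masks.
    d = np.linalg.norm(p[:, None, :] - r[None, :, :], axis=-1)
    # Largest nearest-neighbour distance, taken in both directions.
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))


# Toy example: the reference extends one voxel further along the last axis.
pred_mask = np.zeros((4, 4, 4), dtype=bool)
pred_mask[1:3, 1:3, 1:3] = True
ref_mask = np.zeros((4, 4, 4), dtype=bool)
ref_mask[1:3, 1:3, 1:4] = True

print("Dice:", dice_coefficient(pred_mask, ref_mask))
print("Hausdorff:", hausdorff_distance(pred_mask, ref_mask))
```

For clinical CT volumes, the voxel spacing would be read from the image header, and a surface-based or KD-tree-based implementation would scale better than the dense distance matrix used in this sketch.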