Segmentation of medical images and time series using fully convolutional neural networks

Publication: Book/anthology/thesis/report › PhD thesis › Research

Diagnostic tasks in healthcare often involve segmenting regions of interest in images and time series, such as outlining organs in medical scans or scoring physiological events in electroencephalography (EEG) recordings. Medical professionals perform most of these complex and time-consuming tasks manually, leading to potential errors and limiting diagnostic efficiency. With the increasing global diagnostic burden on healthcare systems, there is a growing need for (semi-) automatic computer systems to alleviate repetitive manual tasks. Furthermore, these systems can make expert knowledge available for people with limited access to well-trained medical doctors.
The primary aim of this thesis was to develop clinically robust automatic segmentation systems for medical images and time series based on recent advances in machine learning. The thesis comprises two parts.
The first part focused on developing a machine learning model for general medical 3D image segmentation, applicable across scanning modalities and tasks. We introduced the Multi-Planar U-Net, a fully convolutional neural network based on the U-Net architecture, which uses a data-augmentation scheme that resamples randomly rotated 2D input images from 3D training data. This process enforces rotational equivariance and enables segmenting new scans from multiple orientations for ensemble-like predictions. The Multi-Planar U-Net proved applicable to varied tasks in magnetic resonance (MR) and computed tomography (CT) images without manual hyperparameter adjustments and was competitive in multiple segmentation challenges, including the 2018 Medical Segmentation Decathlon and the 2020 International Workshop on Osteoarthritis Imaging Knee MRI Segmentation Challenge.
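The core of the multi-planar augmentation scheme is sampling 2D image grids along randomly oriented planes through a 3D volume. The sketch below illustrates that idea only; it is not code from the thesis, the function name `sample_rotated_slice` is invented for this example, and it uses nearest-neighbour lookup where a real pipeline would interpolate intensities.

```python
import numpy as np

def sample_rotated_slice(volume, rng, slice_shape=(64, 64)):
    # Draw a random orthonormal basis; its first two columns span a
    # randomly oriented sampling plane through the volume centre.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    u, v = q[:, 0], q[:, 1]
    centre = (np.asarray(volume.shape) - 1) / 2.0
    ii, jj = np.meshgrid(
        np.arange(slice_shape[0]) - (slice_shape[0] - 1) / 2.0,
        np.arange(slice_shape[1]) - (slice_shape[1] - 1) / 2.0,
        indexing="ij",
    )
    # 3D voxel coordinate of every pixel in the rotated 2D grid.
    coords = (centre[:, None, None]
              + u[:, None, None] * ii
              + v[:, None, None] * jj)
    # Nearest-neighbour lookup for brevity; interpolation would be
    # used in practice.
    idx = np.clip(np.rint(coords).astype(int), 0,
                  np.asarray(volume.shape)[:, None, None] - 1)
    return volume[idx[0], idx[1], idx[2]]

rng = np.random.default_rng(0)
volume = rng.normal(size=(32, 32, 32))
slice_2d = sample_rotated_slice(volume, rng)
print(slice_2d.shape)  # (64, 64)
```

At prediction time, the same machinery supports the ensemble-like behaviour described above: the trained 2D network can be applied along several fixed orientations and the per-voxel predictions averaged.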
The second part of this thesis addressed the problem of automatic sleep staging of polysomnographic data, which involves segmenting physiological signals from sleeping individuals into distinct sleep stages. We developed a U-Net-based model for medical time series data called U-Time that leverages the similarities between sleep staging and image segmentation. This model outperformed alternative models typically used for automatic sleep staging and was transferable across clinical cohorts without hyperparameter re-tuning.
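The analogy between sleep staging and image segmentation rests on producing dense, per-sample class scores over the signal and then aggregating them into one stage per fixed-length scoring window. The toy function below (invented for this example; `epoch_scores` is not an identifier from the thesis) illustrates that aggregation step, assuming dense logits are already available.

```python
import numpy as np

def epoch_scores(sample_logits, samples_per_epoch):
    # sample_logits: (n_classes, n_samples) dense class scores,
    # one score vector per signal sample.
    n_classes, n_samples = sample_logits.shape
    n_epochs = n_samples // samples_per_epoch
    trimmed = sample_logits[:, : n_epochs * samples_per_epoch]
    # Mean-pool the dense scores within each epoch, then pick the
    # highest-scoring class to get one stage per epoch.
    pooled = trimmed.reshape(n_classes, n_epochs,
                             samples_per_epoch).mean(axis=-1)
    return pooled.argmax(axis=0)

# Toy input: 3 stages, 2 "epochs" of 4 samples each.
logits = np.zeros((3, 8))
logits[0, :4] = 1.0  # first epoch favours stage 0
logits[2, 4:] = 1.0  # second epoch favours stage 2
print(epoch_scores(logits, 4))  # [0 2]
```

Shrinking `samples_per_epoch` yields stage labels at higher temporal resolution from the same dense output, which is the mechanism behind the higher-than-usual scoring frequencies discussed below.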
This ability inspired the development of a successor model, U-Sleep, designed for robust sleep staging on diverse polysomnography data. Trained on over 15,000 participants from 12 clinical studies, U-Sleep demonstrated expert-level accuracy and adaptability to different EEG input derivations, patient demographics, and recording equipment. It also remained accurate for patients with severe brain disorders, such as stroke and Parkinson’s disease, even though no such patients were present in the training data. Furthermore, we explored U-Sleep’s ability to score sleep stages at higher-than-usual frequencies, which facilitated the separation of patients with sleep disorders or acute stroke from control groups, indicating potential for biomarker development. Sleep metrics derived from U-Sleep’s high-frequency sleep scores were more consistent than those from low-frequency and human expert scores, suggesting improved diagnostic accuracy. Consequently, U-Sleep may be a candidate for clinical sleep staging and a potential research tool for high-frequency sleep patterns. The model is available for research at https://sleep.ai.ku.dk/, where it has scored over 45,000 sleep studies.
In summary, this thesis presented clinically robust and accurate machine learning models for segmenting medical image volumes and time series. Key practices for developing such models were identified. First, we reconfirmed that fully convolutional, feed-forward-only neural networks such as the U-Net are broadly applicable, as they performed well across diverse tasks in medical images and time series. Second, we found it beneficial to design data-augmentation techniques that induce model invariance or equivariance to input transformations relevant to clinical robustness; even if such augmentations make the target function more complex, they pay off as long as they also significantly expand the set of effective training examples. Finally, we found that clinical robustness can be achieved by training machine learning models on large and highly variable datasets pooled from multiple sources, even when those datasets differ in recording hardware, patient population, or preprocessing pipeline.
Original language: English
Publisher: Department of Computer Science, Faculty of Science, University of Copenhagen
Number of pages: 316
Status: Published - 2023

ID: 380301873