Dr. Ece Özkan Elsen

Established Researcher

E-Mail
ece.oezkanelsen@inf.ethz.ch
Address
Department of Computer Science
CAB G 33.2
Universitätstr. 6
CH – 8092 Zurich, Switzerland

I obtained my PhD degree in December 2018 from ETH Zurich. During my Master's degree, I focused on image processing and machine learning methods using probabilistic models for efficient semantic segmentation and object detection. My doctoral research centered on developing methods to characterize the biomechanical properties of soft tissues using ultrasound.

From 2021 to 2022, I worked as a Postdoctoral Researcher in the Medical Data Science group at ETH and concurrently led the Network of Women in Computer Science (CSNOW). In 2023, I was a Postdoctoral Fellow at the Department of Brain and Cognitive Sciences at MIT, working on human-inspired machine learning techniques for medical image analysis.

Returning to ETH in May 2024, I resumed my role within the Medical Data Science group as an Established Researcher.

My research interests lie in enhancing the generalization, explainability, and fairness of machine learning models to address medical challenges and interpret medical data.

You can find a video portrait of me here.
Abstract

Pulmonary hypertension (PH) in newborns and infants is a complex condition associated with several pulmonary, cardiac, and systemic diseases contributing to morbidity and mortality. Thus, accurate and early detection of PH and the classification of its severity are crucial for appropriate and successful management. Using echocardiography, the primary diagnostic tool in pediatrics, human assessment is both time-consuming and expertise-demanding, raising the need for an automated approach. Little effort has been directed towards automatic assessment of PH using echocardiography, and the few proposed methods only focus on binary PH classification on the adult population. In this work, we present an explainable multi-view video-based deep learning approach to predict and classify the severity of PH for a cohort of 270 newborns using echocardiograms. We use spatio-temporal convolutional architectures for the prediction of PH from each view, and aggregate the predictions of the different views using majority voting. Our results show a mean F1-score of 0.84 for severity prediction and 0.92 for binary detection using 10-fold cross-validation and 0.63 for severity prediction and 0.78 for binary detection on the held-out test set. We complement our predictions with saliency maps and show that the learned model focuses on clinically relevant cardiac structures, motivating its usage in clinical practice. To the best of our knowledge, this is the first work for an automated assessment of PH in newborns using echocardiograms.
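As an illustration of the view-aggregation step mentioned in the abstract, the following minimal Python sketch shows majority voting over per-view predictions. The function name `majority_vote`, the example labels, and the tie-breaking rule are assumptions made for this example; they are not taken from the publication.

```python
from collections import Counter


def majority_vote(view_predictions):
    """Return the most frequent label among the per-view predictions.

    Ties are broken in favour of the label that appears first in the
    input list. This tie-breaking rule is an assumption of the sketch.
    """
    if not view_predictions:
        raise ValueError("need at least one view prediction")
    counts = Counter(view_predictions)
    top = max(counts.values())
    # Walk the input in order so that ties resolve deterministically.
    for label in view_predictions:
        if counts[label] == top:
            return label


# Hypothetical severity labels predicted independently from five views.
per_view = ["mild", "none", "mild", "severe", "mild"]
print(majority_vote(per_view))
```

Voting on hard labels keeps the per-view models independent of each other, so a view that is missing for a patient can simply be left out of the list.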

Authors

Hanna Ragnarsdottir*, Ece Özkan Elsen*, Holger Michel*, Kieran Chin-Cheong, Laura Manduchi, Sven Wellmann, Julia E. Vogt
* denotes shared first authorship, denotes shared last authorship

Submitted

International Journal of Computer Vision

Date

06.02.2024

Link, DOI

Abstract

Appendicitis is among the most frequent reasons for pediatric abdominal surgeries. Previous decision support systems for appendicitis have focused on clinical, laboratory, scoring, and computed tomography data and have ignored abdominal ultrasound, despite its noninvasive nature and widespread availability. In this work, we present interpretable machine learning models for predicting the diagnosis, management and severity of suspected appendicitis using ultrasound images. Our approach utilizes concept bottleneck models (CBM) that facilitate interpretation and interaction with high-level concepts understandable to clinicians. Furthermore, we extend CBMs to prediction problems with multiple views and incomplete concept sets. Our models were trained on a dataset comprising 579 pediatric patients with 1709 ultrasound images accompanied by clinical and laboratory data. Results show that our proposed method enables clinicians to utilize a human-understandable and intervenable predictive model without compromising performance or requiring time-consuming image annotation when deployed. For predicting the diagnosis, the extended multiview CBM attained an AUROC of 0.80 and an AUPR of 0.92, performing comparably to similar black-box neural networks trained and tested on the same dataset.

Authors

Ricards Marcinkevics*, Patricia Reis Wolfertstetter*, Ugne Klimiene*, Kieran Chin-Cheong, Alyssia Paschke, Julia Zerres, Markus Denzinger, David Niederberger, Sven Wellmann, Ece Özkan Elsen, Christian Knorr, Julia E. Vogt
* denotes shared first authorship, denotes shared last authorship

Submitted

Medical Image Analysis

Date

01.01.2024

Link, DOI, Code

Abstract

Early detection of cardiac dysfunction through routine screening is vital for diagnosing cardiovascular diseases. An important metric of cardiac function is the left ventricular ejection fraction (EF), where lower EF is associated with cardiomyopathy. Echocardiography is a popular diagnostic tool in cardiology, with ultrasound being a low-cost, real-time, and non-ionizing technology. However, human assessment of echocardiograms for calculating EF is time-consuming and expertise-demanding, raising the need for an automated approach. In this work, we propose using the M(otion)-mode of echocardiograms for estimating the EF and classifying cardiomyopathy. We generate multiple artificial M-mode images from a single echocardiogram and combine them using off-the-shelf model architectures. Additionally, we extend contrastive learning (CL) to cardiac imaging to learn meaningful representations by exploiting structures in unlabeled data, allowing the model to achieve high accuracy even with limited annotations. Our experiments show that the supervised setting converges with only ten modes and is comparable to the baseline method while bypassing its cumbersome training process and being computationally much more efficient. Furthermore, CL using M-mode images is helpful for limited data scenarios, such as having labels for only 200 patients, which is common in medical applications.

Authors

Ece Özkan Elsen*, Thomas M. Sutter*, Yurong Hu, Sebastian Balzer, Julia E. Vogt
* denotes shared first authorship

Submitted

GCPR 2023

Date

01.09.2023

Link, Code

Abstract

Appendicitis is among the most frequent reasons for pediatric abdominal surgeries. With recent advances in machine learning, data-driven decision support could help clinicians diagnose and manage patients while reducing the number of non-critical surgeries. However, previous decision support systems for appendicitis have focused on clinical, laboratory, scoring, and computed tomography data and have ignored the use of abdominal ultrasound, despite its noninvasive nature and widespread availability. In this work, we present interpretable machine learning models for predicting the diagnosis, management and severity of suspected appendicitis using ultrasound images. To this end, our approach utilizes concept bottleneck models (CBM) that facilitate interpretation and interaction with high-level concepts that are understandable to clinicians. Furthermore, we extend CBMs to prediction problems with multiple views and incomplete concept sets. Our models were trained on a dataset comprising 579 pediatric patients with 1709 ultrasound images accompanied by clinical and laboratory data. Results show that our proposed method enables clinicians to utilize a human-understandable and intervenable predictive model without compromising performance or requiring time-consuming image annotation when deployed.

Authors

Ricards Marcinkevics*, Patricia Reis Wolfertstetter*, Ugne Klimiene*, Kieran Chin-Cheong, Alyssia Paschke, Julia Zerres, Markus Denzinger, David Niederberger, Sven Wellmann, Ece Özkan Elsen, Christian Knorr, Julia E. Vogt
* denotes shared first authorship, denotes shared last authorship

Submitted

Workshop on Machine Learning for Multimodal Healthcare Data, Co-located with ICML 2023

Date

29.07.2023

Abstract

Machine learning (ML) is a discipline emerging from computer science with close ties to statistics and applied mathematics. Its fundamental goal is the design of computer programs, or algorithms, that learn to perform a certain task in an automated manner. Without explicit rules or knowledge, ML algorithms observe and, possibly, interact with the surrounding world by the use of available data. Typically, as a result of learning, algorithms distil observations of complex phenomena into a general model which summarises the patterns, or regularities, discovered from the data. Modern ML algorithms regularly break records, achieving impressive performance at a wide range of tasks, e.g. game playing, protein structure prediction, searching for particles in high-energy physics, and forecasting precipitation. The utility of machine learning methods for healthcare is apparent: it is often argued that given vast amounts of heterogeneous data, our understanding of diseases, patient management and outcomes can be enriched with the insights from machine learning. In this chapter, we will provide a nontechnical introduction to the ML discipline aimed at a general audience with an affinity for biomedical applications. We will familiarise the reader with the common types of algorithms and typical tasks these algorithms can solve and illustrate these basic concepts by concrete examples of current machine learning applications in healthcare. We will conclude with a discussion of the open challenges, limitations, and potential impact of machine-learning-powered medicine.

Authors

Julia E. Vogt, Ece Özkan Elsen, Ricards Marcinkevics

Submitted

Chapter in Digital Medicine: Bringing Digital Solutions to Medical Practice

Date

31.03.2023

Link, DOI

Abstract

Many modern research fields increasingly rely on collecting and analysing massive, often unstructured, and unwieldy datasets. Consequently, there is growing interest in machine learning and artificial intelligence applications that can harness this 'data deluge'. This broad nontechnical overview provides a gentle introduction to machine learning with a specific focus on medical and biological applications. We explain the common types of machine learning algorithms and typical tasks that can be solved, illustrating the basics with concrete examples from healthcare. Lastly, we provide an outlook on open challenges, limitations, and potential impacts of machine-learning-powered medicine.

Authors

Ricards Marcinkevics, Ece Özkan Elsen, Julia E. Vogt

Submitted

arXiv

Date

23.12.2022

Link, DOI

Abstract

Early detection of cardiac dysfunction through routine screening is vital for diagnosing cardiovascular diseases. An important metric of cardiac function is the left ventricular ejection fraction (EF), which is used to diagnose cardiomyopathy. Echocardiography is a popular diagnostic tool in cardiology, with ultrasound being a low-cost, real-time, and non-ionizing technology. However, human assessment of echocardiograms for calculating EF is both time-consuming and expertise-demanding, raising the need for an automated approach. Earlier automated works have been limited to still images or use echocardiogram videos with spatio-temporal convolutions in a complex pipeline. In this work, we propose to generate images from readily available echocardiogram videos, each image mimicking an M(otion)-mode image from a different scan line through time. We then combine different M-mode images using off-the-shelf model architectures to estimate the EF and, thus, diagnose cardiomyopathy. Our experiments show that our proposed method converges with only ten modes and is comparable to the baseline method while bypassing its cumbersome training process.
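To make the idea of an artificial M-mode image concrete, the sketch below builds one image per scan line by reading the same line of pixels from every frame and stacking the results along the time axis. It is a simplified illustration only: it assumes the video is a nested Python list and uses fixed vertical pixel columns as scan lines, whereas the abstract does not specify how the scan lines are chosen. The names `artificial_m_mode` and `m_mode_stack` are hypothetical.

```python
def artificial_m_mode(video, column):
    """Build an M-mode-like image from one fixed vertical scan line.

    video  : list of frames; each frame is a list of rows of pixel values
    column : index of the pixel column used as the scan line (assumption:
             a vertical line, chosen here only for simplicity)
    Returns an image with one row per depth position and one column per frame,
    so the horizontal axis is time.
    """
    height = len(video[0])
    return [[frame[row][column] for frame in video] for row in range(height)]


def m_mode_stack(video, columns):
    """Generate one artificial M-mode image for each requested scan line."""
    return [artificial_m_mode(video, c) for c in columns]


# Toy video with 3 frames of 2 x 2 pixels.
video = [
    [[0, 1], [2, 3]],
    [[4, 5], [6, 7]],
    [[8, 9], [10, 11]],
]
image = artificial_m_mode(video, column=1)
print(image)  # [[1, 5, 9], [3, 7, 11]]
```

Each resulting image is two-dimensional, so a stack of them can be passed to standard image models instead of a spatio-temporal video network.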

Authors

Thomas Sutter, Sebastian Balzer, Ece Özkan Elsen, Julia E. Vogt

Submitted

Medical Imaging Meets NeurIPS Workshop 2022

Date

02.12.2022

Link

Abstract

Pulmonary hypertension (PH) in newborns and infants is a complex condition associated with several pulmonary, cardiac, and systemic diseases contributing to morbidity and mortality. Therefore, accurate and early detection of PH is crucial for successful management. Using echocardiography, the primary diagnostic tool in pediatrics, human assessment is both time-consuming and expertise-demanding, raising the need for an automated approach. In this work, we present an interpretable multi-view video-based deep learning approach to predict PH for a cohort of 194 newborns using echocardiograms. We use spatio-temporal convolutional architectures for the prediction of PH from each view, and aggregate the predictions of the different views using majority voting. To the best of our knowledge, this is the first work for an automated assessment of PH in newborns using echocardiograms. Our results show a mean F1-score of 0.84 for severity prediction and 0.92 for binary detection using 10-fold cross-validation. We complement our predictions with saliency maps and show that the learned model focuses on clinically relevant cardiac structures, motivating its usage in clinical practice.

Authors

Hanna Ragnarsdottir, Laura Manduchi, Holger Michel, Fabian Laumer, Sven Wellmann, Ece Özkan Elsen, Julia E. Vogt

Submitted

DAGM German Conference on Pattern Recognition

Date

20.09.2022

DOI

Abstract

Deep neural networks for image-based screening and computer-aided diagnosis have achieved expert-level performance on various medical imaging modalities, including chest radiographs. Recently, several works have indicated that these state-of-the-art classifiers can be biased with respect to sensitive patient attributes, such as race or gender, leading to growing concerns about demographic disparities and discrimination resulting from algorithmic and model-based decision-making in healthcare. Fair machine learning has focused on mitigating such biases against disadvantaged or marginalised groups, mainly concentrating on tabular data or natural images. This work presents two novel intra-processing techniques based on fine-tuning and pruning an already-trained neural network. These methods are simple yet effective and can be readily applied post hoc in a setting where the protected attribute is unknown during the model development and test time. In addition, we compare several intra- and post-processing approaches applied to debiasing deep chest X-ray classifiers. To the best of our knowledge, this is one of the first efforts studying debiasing methods on chest radiographs. Our results suggest that the considered approaches successfully mitigate biases in fully connected and convolutional neural networks offering stable performance under various settings. The discussed methods can help achieve group fairness of deep medical image classifiers when deploying them in domains with different fairness considerations and constraints.

Authors

Ricards Marcinkevics, Ece Özkan Elsen, Julia E. Vogt

Submitted

The Seventh Machine Learning for Healthcare Conference, MLHC 2022

Date

05.08.2022

Link, Code

Abstract

Arguably, interpretability is one of the guiding principles behind the development of machine-learning-based healthcare decision support tools and computer-aided diagnosis systems. There has been a renewed interest in interpretable classification based on high-level concepts, including, among other model classes, the re-exploration of concept bottleneck models. By their nature, medical diagnosis, patient management, and monitoring require the assessment of multiple views and modalities to form a holistic representation of the patient's state. For instance, in ultrasound imaging, a region of interest might be registered from multiple views that are informative about different sets of clinically relevant features. Motivated by this, we extend the classical concept bottleneck model to the multiview classification setting by representation fusion across the views. We apply our multiview concept bottleneck model to the dataset of ultrasound images acquired from a cohort of pediatric patients with suspected appendicitis to predict the disease. The results suggest that auxiliary supervision from the concepts and aggregation across multiple views help develop more accurate and interpretable classifiers.

Authors

Ugne Klimiene*, Ricards Marcinkevics*, Patricia Reis Wolfertstetter, Ece Özkan Elsen, Alyssia Paschke, David Niederberger, Sven Wellmann, Christian Knorr, Julia E. Vogt
* denotes shared first authorship

Submitted

Oral spotlight at the 2nd Workshop on Interpretable Machine Learning in Healthcare (IMLH), ICML 2022

Date

23.07.2022

Link, Code

Abstract

Due to growing concerns about demographic disparities and discrimination resulting from algorithmic and model-based decision-making, recent research has focused on mitigating biases against already disadvantaged or marginalised groups in classification models. From the perspective of classification parity, the two commonest metrics for assessing fairness are statistical parity and equality of opportunity. Current approaches to debiasing in classification either require the knowledge of the protected attribute before or during training or are entirely agnostic to the model class and parameters. This work considers differentiable proxy functions for statistical parity and equality of opportunity and introduces two novel debiasing techniques for neural network classifiers based on fine-tuning and pruning an already-trained network. As opposed to the prior work leveraging adversarial training, the proposed methods are simple yet effective and can be readily applied post hoc. Our experimental results encouragingly suggest that these approaches successfully debias fully connected neural networks trained on tabular data and often outperform model-agnostic post-processing methods.
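For readers unfamiliar with the two fairness metrics named in the abstract, the following sketch computes them from binary predictions. It uses the standard, non-differentiable definitions rather than the differentiable proxy functions the abstract refers to. The function names, the 0/1 encoding of the protected attribute, and the sign convention (group 0 minus group 1) are assumptions of this example.

```python
def positive_rate(predictions):
    """Fraction of predictions equal to 1."""
    if not predictions:
        raise ValueError("group is empty")
    return sum(predictions) / len(predictions)


def statistical_parity_difference(y_pred, group):
    """P(y_hat = 1 | A = 0) - P(y_hat = 1 | A = 1)."""
    rate0 = positive_rate([p for p, g in zip(y_pred, group) if g == 0])
    rate1 = positive_rate([p for p, g in zip(y_pred, group) if g == 1])
    return rate0 - rate1


def equal_opportunity_difference(y_true, y_pred, group):
    """Difference in true positive rates between the two groups."""
    tpr0 = positive_rate(
        [p for t, p, g in zip(y_true, y_pred, group) if g == 0 and t == 1]
    )
    tpr1 = positive_rate(
        [p for t, p, g in zip(y_true, y_pred, group) if g == 1 and t == 1]
    )
    return tpr0 - tpr1


# Hypothetical labels, predictions, and protected attribute for 8 samples.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

print(statistical_parity_difference(y_pred, group))         # 0.5
print(equal_opportunity_difference(y_true, y_pred, group))  # 0.5
```

A value of zero for either metric means the classifier treats the two groups equally under that criterion; debiasing methods aim to push these differences towards zero while preserving accuracy.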

Authors

Ricards Marcinkevics, Ece Özkan Elsen, Julia E. Vogt

Submitted

Contributed talk at ICLR 2022 Workshop on Socially Responsible Machine Learning

Date

29.04.2022

Link, Code

Authors

Richard Rau, Ece Özkan Elsen, Batu M. Ozturkler, Leila Gastli, Orcun Goksel

Submitted

IEEE International Ultrasonics Symposium (IUS)

Date

11.08.2020

DOI

Authors

Lisa Ruby, Sergio J. Sanabria, Katharina Martini, Konstantin J. Dedes, Denise Vorburger, Ece Özkan Elsen, Thomas Frauenfelder, Orcun Goksel, Marga B. Rominger

Submitted

Investigative Radiology

Date

30.06.2019

DOI

Authors

Alvaro Gomariz, Weiye Li, Ece Özkan Elsen, Christine Tanner, Orcun Goksel

Submitted

International Symposium on Biomedical Imaging (ISBI)

Date

06.02.2019

DOI

Authors

Stefanie Ehrbar, Alexander Jöhl, Michael Kühni, Mirko Meboldt, Ece Özkan Elsen, Christine Tanner, Orcun Goksel, Stephan Klöck, Jan Unkelbach, Matthias Guckenberger, Stephanie Tanadini-Lang

Submitted

Medical Physics

Date

03.01.2019

DOI

Authors

Ece Özkan Elsen, Valery Vishnevsky, Orcun Goksel

Submitted

IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control

Date

03.03.2018

DOI

Authors

Ece Özkan Elsen, Christine Tanner, Matej Kastelic, Oliver Mattausch, Maxim Makhinya, Orcun Goksel

Submitted

International Journal of Computer Assisted Radiology and Surgery

Date

22.03.2017

DOI

Authors

Ece Özkan Elsen, Gemma Roig, Orcun Goksel, Xavier Boix

Submitted

arXiv

Date

27.05.2016

Authors

Firat Ozdemir, Ece Özkan Elsen, Orcun Goksel

Submitted

International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI)

Date

27.05.2016

DOI

Authors

Ece Özkan Elsen, Orcun Goksel

Submitted

International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)

Date

27.05.2015

DOI