The Medical Data Science (MDS) group carries out research at the intersection of machine learning and medicine, with the ultimate goal of improving diagnosis and treatment outcomes to the benefit of the care and wellbeing of patients. As medical and health data are heterogeneous and multimodal, our research advances machine learning models and methodologies to address the specific challenges of the medical domain. Specifically, we work in the areas of multimodal data integration, structure detection, and trustworthy (or transparent) models. The challenge lies not only in developing fast, robust, and reliable systems, but also in making these systems easy to interpret and usable in clinical practice.


MDS at ICLR 2024

Several members of the MDS group attended ICLR 2024. Congratulations to everyone who presented work at the main conference and workshops!

Artificial intelligence detects heart defects in newborns

Our recent paper "The Deep Learning Based Prediction of Pulmonary Hypertension in Newborns Using Echocardiograms", published together with KUNO Klinik…

Thomas and Imant defend PhD theses in 2023

Congratulations to Thomas Sutter and Imant Daunhawer, who both successfully defended their PhD theses in 2023.

Thomas' thesis is titled "Imposing and…


Abstract

Despite significant progress, the evaluation of explainable artificial intelligence remains elusive and challenging. In this paper, we propose a fine-grained validation framework that is not overly reliant on any one facet of these sociotechnical systems, and that recognises their inherent modular structure: technical building blocks, user-facing explanatory artefacts and social communication protocols. While we concur that user studies are invaluable in assessing the quality and effectiveness of explanation presentation and delivery strategies from the explainees' perspective in a particular deployment context, the underlying explanation generation mechanisms require a separate, predominantly algorithmic validation strategy that accounts for the technical and human-centred desiderata of their (numerical) outputs. Such a comprehensive sociotechnical utility-based evaluation framework could allow one to systematically reason about the properties and downstream influence of different building blocks from which explainable artificial intelligence systems are composed – accounting for a diverse range of their engineering and social aspects – in view of the anticipated use case.

Authors

Kacper Sokol, Julia E. Vogt

Submitted

Extended Abstracts of the 2024 ACM Conference on Human Factors in Computing Systems (CHI)

Date

02.05.2024

Link

DOI

Abstract

High-density multielectrode catheters are becoming increasingly popular in cardiac electrophysiology for advanced characterisation of the cardiac tissue, due to their potential to identify impaired sites. These sites are often characterised by abnormal electrical conduction, which may cause locally disorganised propagation wavefronts. To quantify this, a novel heterogeneity parameter based on vector field analysis is proposed, utilising finite differences to measure direction changes between adjacent cliques. The proposed Vector Field Heterogeneity metric has been evaluated on a set of simulations with controlled levels of organisation in vector maps and a variety of grid sizes. Furthermore, it has been tested on animal experimental models of isolated Langendorff-perfused rabbit hearts. The proposed parameter exhibited a superior ability to capture heterogeneous propagation wavefronts compared with the classical Spatial Inhomogeneity Index, and the simulations showed that the metric effectively captures gradual increments in the disorganisation of propagation patterns. Notably, it yielded robust and consistent outcomes for 4 × 4 grid sizes, underscoring its suitability for the latest generation of orientation-independent cardiac catheters.

Index Terms: Animal experimental models, cardiac signal processing, electrophysiology, high-density electrode catheters, vector field heterogeneity.

Impact Statement: The authors introduce the Vector Field Heterogeneity (VFH) metric, which provides a precise evaluation of disorganisation in electrical propagation maps within cardiac tissue, potentially improving the diagnosis and characterisation of electrophysiological conditions.
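The abstract describes the metric only at a high level. As a rough illustration of the underlying idea – finite differences of local propagation directions between adjacent cliques – the Python sketch below computes the mean angular change across a grid of conduction-direction vectors. The function name and the aggregation by a simple mean are illustrative assumptions, not the authors' exact VFH definition.

```python
import numpy as np

def vector_field_heterogeneity(vx, vy):
    """Illustrative heterogeneity score for a 2-D field of propagation
    direction vectors (one vector per clique of electrodes).

    vx, vy : 2-D arrays with the x- and y-components of each local
    conduction-direction vector. Returns the mean angular change (in
    radians, within [0, pi]) between horizontally and vertically adjacent
    vectors, computed with first-order finite differences.
    """
    angles = np.arctan2(vy, vx)                       # local propagation angle per clique

    def wrapped_diff(a, b):
        # smallest absolute angular difference, wrapped to [0, pi]
        d = np.abs(a - b) % (2 * np.pi)
        return np.minimum(d, 2 * np.pi - d)

    dx = wrapped_diff(angles[:, 1:], angles[:, :-1])  # horizontal neighbours
    dy = wrapped_diff(angles[1:, :], angles[:-1, :])  # vertical neighbours
    return np.concatenate([dx.ravel(), dy.ravel()]).mean()

# Example on a 4 x 4 grid (the catheter size highlighted in the paper):
rng = np.random.default_rng(0)
organised = np.zeros((4, 4)), np.ones((4, 4))          # all vectors point the same way
disorganised = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
print(vector_field_heterogeneity(*organised))          # ~0: homogeneous propagation
print(vector_field_heterogeneity(*disorganised))       # larger: disorganised wavefront
```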

Authors

L Pancorbo*, S Ruipérez-Campillo*, A Tormos, A Guill, R Cervigón, A Alberola, FJ Chorro, J Millet, F Castells
* denotes shared first authorship

Submitted

IEEE Open Journal of Engineering in Medicine and Biology

Date

23.02.2024

Link

DOI

Abstract

Pulmonary hypertension (PH) in newborns and infants is a complex condition associated with several pulmonary, cardiac, and systemic diseases contributing to morbidity and mortality. Accurate and early detection of PH and classification of its severity are therefore crucial for appropriate and successful management. Echocardiography is the primary diagnostic tool in pediatrics, but human assessment is both time-consuming and expertise-demanding, raising the need for an automated approach. Little effort has been directed towards automatic assessment of PH using echocardiography, and the few proposed methods focus only on binary PH classification in the adult population. In this work, we present an explainable multi-view video-based deep learning approach to predict and classify the severity of PH for a cohort of 270 newborns using echocardiograms. We use spatio-temporal convolutional architectures for the prediction of PH from each view, and aggregate the predictions of the different views using majority voting. Our results show a mean F1-score of 0.84 for severity prediction and 0.92 for binary detection using 10-fold cross-validation, and 0.63 for severity prediction and 0.78 for binary detection on the held-out test set. We complement our predictions with saliency maps and show that the learned model focuses on clinically relevant cardiac structures, motivating its usage in clinical practice. To the best of our knowledge, this is the first work on automated assessment of PH in newborns using echocardiograms.
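For readers interested in the view-aggregation step, the snippet below sketches the majority voting over per-view predictions mentioned in the abstract. The per-view models in the paper are spatio-temporal convolutional networks; here their outputs are replaced by placeholder class scores, and the class labels and tie-breaking rule are assumptions made purely for illustration.

```python
import numpy as np

def aggregate_views(view_scores):
    """Majority vote over per-view severity predictions.

    view_scores : array of shape (n_views, n_classes) with one row of class
    scores per echocardiographic view (placeholders standing in for the
    outputs of the per-view spatio-temporal networks). Returns the class
    index chosen by most views; ties resolve to the lowest class index via
    np.bincount / argmax.
    """
    view_predictions = np.argmax(view_scores, axis=1)          # hard decision per view
    votes = np.bincount(view_predictions, minlength=view_scores.shape[1])
    return int(np.argmax(votes))

# Toy example with 5 views and 3 illustrative severity classes:
scores = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.6, 0.3],
                   [0.2, 0.5, 0.3],
                   [0.3, 0.4, 0.3],
                   [0.8, 0.1, 0.1]])
print(aggregate_views(scores))   # -> 1: three of the five views vote for class 1
```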

Authors

Hanna Ragnarsdottir*, Ece Özkan Elsen*, Holger Michel*, Kieran Chin-Cheong, Laura Manduchi, Sven Wellmann, Julia E. Vogt
* denotes shared first authorship, denotes shared last authorship

Submitted

International Journal of Computer Vision

Date

06.02.2024

Link

DOI

Abstract

Recently, interpretable machine learning has re-explored concept bottleneck models (CBM), comprising the step-by-step prediction of high-level concepts from the raw features and of the target variable from the predicted concepts. A compelling advantage of this model class is the user's ability to intervene on the predicted concept values, thereby affecting the model's downstream output. In this work, we introduce a method to perform such concept-based interventions on already-trained neural networks, which are not interpretable by design, given an annotated validation set. Furthermore, we formalise the model's intervenability as a measure of the effectiveness of concept-based interventions and leverage this definition to fine-tune black-box models. Empirically, we explore the intervenability of black-box classifiers on synthetic tabular and natural image benchmarks. We demonstrate that fine-tuning improves intervention effectiveness and often yields better-calibrated predictions. To showcase the practical utility of the proposed techniques, we apply them to deep chest X-ray classifiers and show that fine-tuned black boxes can be as intervenable as, and more performant than, CBMs.
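As a rough illustration of the ideas above, the Python sketch below shows a generic concept-based intervention (replacing predicted concept values with ground-truth annotations before the downstream head) and a toy intervenability-style score, defined here as the resulting drop in downstream loss. The paper's formal definition of intervenability, and its procedure for intervening on and fine-tuning black-box models, differ in detail; all function and variable names here are illustrative assumptions.

```python
import numpy as np

def intervene(predicted_concepts, true_concepts, idx):
    """Replace the predicted values of the concepts in `idx` with their
    ground-truth annotations, leaving the remaining concepts untouched."""
    corrected = predicted_concepts.copy()
    corrected[:, idx] = true_concepts[:, idx]
    return corrected

def intervenability(head, predicted_concepts, true_concepts, y, idx, loss):
    """Toy intervenability-style score: how much the downstream loss drops
    when the concepts in `idx` are set to their true values."""
    before = loss(head(predicted_concepts), y)
    after = loss(head(intervene(predicted_concepts, true_concepts, idx)), y)
    return before - after

# Minimal example with a linear downstream head and squared-error loss.
rng = np.random.default_rng(0)
w = rng.normal(size=3)
head = lambda c: c @ w
sq_loss = lambda pred, y: np.mean((pred - y) ** 2)

true_c = rng.normal(size=(100, 3))
noisy_c = true_c + rng.normal(scale=0.5, size=(100, 3))   # imperfect concept predictor
y = head(true_c)                                          # labels generated from true concepts
print(intervenability(head, noisy_c, true_c, y, idx=[0, 1], loss=sq_loss))  # > 0
```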

Authors

Sonia Laguna*, Ricards Marcinkevics*, Moritz Vandenhirtz, Julia E. Vogt
* denotes shared first authorship

Submitted

arXiv

Date

24.01.2024

Link

Abstract

Background and Objectives: The extensive collection of electrocardiogram (ECG) recordings stored in paper format has provided opportunities for numerous digitization studies. However, the traditional 10 s 12-lead ECG printout typically splits the ECG signals into four asynchronous sections of 3 leads and 2.5 s each. Since each lead corresponds to different time instants, developing a synchronization method becomes necessary for applications such as vectorcardiogram (VCG) reconstruction. Methods: A beat-level synchronization method has been developed and validated using a dataset of 21,674 signals. This method effectively addresses synchronization distortions caused by RR interval variations and preserves the time lags between R peaks across different leads for each beat. Results: The results demonstrate that the proposed method successfully synchronizes the ECG, allowing a VCG reconstruction with an average Pearson Correlation Coefficient of 0.9815±0.0426. The Normalized Root Mean Squared Error (NRMSE) and Mean Absolute Error (MAE) values for the reconstructed VCG are 0.0248±0.0214 mV and 0.0133±0.0123 mV, respectively. These metrics indicate the reliability of the VCG reconstruction achieved by means of the proposed synchronization method. Conclusions: The synchronization method has demonstrated its robustness and high performance compared to existing techniques in the field. Its effectiveness has been observed across a wide variety of signals, showcasing its applicability in real clinical environments. Moreover, its ability to handle a large number of signals makes it suitable for various applications, including retrospective studies and the development of machine learning methods.
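The abstract reports agreement between the reconstructed and reference VCG in terms of Pearson correlation, NRMSE and MAE. The sketch below shows one way such per-lead metrics might be computed; normalising the RMSE by the peak-to-peak amplitude is an assumption, since the paper's exact normalisation is not stated here.

```python
import numpy as np

def vcg_agreement(reconstructed, reference):
    """Per-lead agreement metrics between a reconstructed and a reference
    VCG, both arrays of shape (n_samples, 3) for the X, Y, Z leads (in mV).
    Returns a list of (Pearson r, NRMSE, MAE) tuples, one per lead."""
    metrics = []
    for lead in range(reference.shape[1]):
        rec, ref = reconstructed[:, lead], reference[:, lead]
        pearson = np.corrcoef(rec, ref)[0, 1]
        rmse = np.sqrt(np.mean((rec - ref) ** 2))
        nrmse = rmse / (ref.max() - ref.min())   # normalised by peak-to-peak range (assumption)
        mae = np.mean(np.abs(rec - ref))
        metrics.append((pearson, nrmse, mae))
    return metrics

# Toy check on a synthetic three-lead signal with small reconstruction error:
t = np.linspace(0, 1, 1000)
reference = np.stack([np.sin(2 * np.pi * f * t) for f in (1, 2, 3)], axis=1)
reconstructed = reference + np.random.default_rng(0).normal(scale=0.01, size=reference.shape)
for pearson, nrmse, mae in vcg_agreement(reconstructed, reference):
    print(f"r={pearson:.4f}  NRMSE={nrmse:.4f}  MAE={mae:.4f}")
```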

Authors

E Ramírez, S Ruipérez-Campillo, F Castells, R Casado-Arroyo, J Millet

Submitted

Biomedical Signal Processing and Control

Date

05.01.2024

Link

DOI