The medical data science group carries out research at the intersection of machine learning and medicine with the ultimate goal of improving diagnosis and treatment outcomes for the benefit of patient care and wellbeing. As medical and health data is heterogeneous and multimodal, our research advances machine learning models and methodologies to address the specific challenges of the medical domain. Specifically, we work in the areas of multimodal data integration, structure detection, and trustworthy (or transparent) models. The challenge lies not only in developing fast, robust and reliable systems but also in making them easy to interpret and usable in clinical practice.
Ricards Marcinkevics receives ABB Research Prize
Congratulations to Ricards Marcinkevics on receiving the 2025 ABB Research Prize, which was presented at the 2025 ETH Day, for his doctoral thesis "Ex…
New Timeline Documents 30+ Years of Promoting Women in Computer Science at D-INFK
The Department of Computer Science (D-INFK) at ETH Zurich has published a new historical timeline documenting the development of its women’s promotion…
Dr Ece Özkan Elsen appointed as BRCCH Professor of Paediatric Digital Health Data Analysis
We are excited to announce that Dr. Ece Özkan Elsen, currently an Established Researcher in our group, will be transitioning to her new role as…
The HD-Grid catheter is widely used in clinical practice for intracavitary electrical mapping. However, its interelectrode spacing does not always ensure compliance with the assumptions of waveform homogeneity, amplitude consistency, and planar wavefront propagation required by the traveling wave theory underlying omnipolar reconstruction. In this study we aim to quantify the extent to which these assumptions hold by introducing the Amplitude Variability, Morphology, and Non-Planarity parameters. Additionally, we propose a solution to this limitation through unipolar signal interpolation to increase virtual spatial resolution and ensure accurate omnipolar signal reconstruction and biomarker extraction. Our method employs a linear combination of unipolar HD-Grid signals, weighted using spline interpolation the inverse of the distance. Results indicate that compliance with the theoretical assumptions is influenced by interelectrode distance, with optimal adherence achieved at electrode spacings of 0.5 mm, which keeps all parameters within a 5% tolerance. The proposed method improves adherence to the theoretical assumptions, enabling more reliable omnipolar signal reconstruction, thereby enhancing the characterization of intracardiac propagation patterns towards a more accurate localization of ablation targets. Clinical Relevance: Non-adherence to the theoretical assumptions in omnipolar technology can lead to inaccurate characterization of cardiac propagation patterns. The proposed method enhances compliance with these assumptions, yielding a more accurate representation of the omnipolar signal.
Authors: Elisa Ramirez, Johanna Tonko, Raul Alos, Samuel Ruipérez-Campillo, Piere Lambiase, Jose Millet, Francisco Castells
Submitted: IEEE Engineering in Medicine & Biology Society (47th EMBC, 2025)
Date: 03.12.2025
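The inverse-distance weighting mentioned in the HD-Grid abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 2x2 electrode layout, the 4 mm spacing, and the sample values are all hypothetical, and the spline-based variant is not covered.

```python
import math

def inverse_distance_weights(electrodes, target, eps=1e-9):
    """Return normalized weights proportional to 1/distance (hypothetical helper)."""
    dists = [math.dist(e, target) for e in electrodes]
    # If the target coincides with an electrode, use that electrode's signal alone.
    for i, d in enumerate(dists):
        if d < eps:
            return [1.0 if j == i else 0.0 for j in range(len(dists))]
    inv = [1.0 / d for d in dists]
    total = sum(inv)
    return [w / total for w in inv]

def interpolate_unipolar(signals, weights):
    """Linear combination of equally long unipolar signals, sample by sample."""
    n_samples = len(signals[0])
    return [sum(w * s[t] for w, s in zip(weights, signals)) for t in range(n_samples)]

# Assumed 2x2 clique of electrodes (coordinates in mm) with made-up samples.
electrodes = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
signals = [
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 1.0],
    [0.0, 0.5, 1.0],
    [0.0, 0.0, 2.0],
]

# Virtual electrode at the centre of the clique: all four distances are equal.
center = (2.0, 2.0)
weights = inverse_distance_weights(electrodes, center)
virtual = interpolate_unipolar(signals, weights)
```

At the centre of the clique the four weights are equal, so the virtual signal is the plain average of the four unipolar signals. Moving the target closer to one electrode shifts weight towards that electrode's signal.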
Intracardiac electrophysiological (EP) signals are frequently contaminated by diverse noise sources, posing a major obstacle to accurate arrhythmia diagnosis. We hypothesized that a physics-inspired conditional denoising diffusion probabilistic model (cDDPM) could outperform both classical filters and variational autoencoders by preserving subtle morphological features. Using 5706 monophasic action potentials from 42 patients, we introduced a range of simulated and real EP noise, then trained the cDDPM in an iterative process analogous to Brownian motion. The proposed model achieved superior performance across RMSE, PCC, and PSNR metrics, confirming its robustness against complex noise while maintaining essential signal fidelity. These findings suggest that diffusion-based methods can significantly enhance the clinical utility of EP signals for arrhythmia management and intervention. Clinical Relevance: We propose a denoising diffusion probabilistic model to reconstruct intracardiac signals in the presence of complex noise, which holds the potential to enhance diagnostic accuracy in EP procedures and inform more targeted treatment strategies.
Authors: Samuel Ruipérez-Campillo, Moritz Rau, Prasanth Ganesan, Kelly A Brennan, Ruibin Feng, Sabyasachi Bandyopadhyay, Albert J Rogers, Sanjiv M Narayan, Julia E Vogt
Submitted: IEEE Engineering in Medicine & Biology Society (47th EMBC, 2025)
Date: 03.12.2025
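For readers unfamiliar with the metrics named in the cDDPM abstract above, the sketch below computes RMSE, PCC (Pearson correlation coefficient), and PSNR for a reference signal and a denoised estimate. It is an illustration only: the function names and sample values are made up, and using the peak-to-peak range of the reference as the PSNR peak is an assumption, since the abstract does not state which convention was used.

```python
import math

def rmse(ref, est):
    """Root-mean-square error between a reference and an estimate."""
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref))

def pcc(ref, est):
    """Pearson correlation coefficient between two equally long signals."""
    n = len(ref)
    mean_r, mean_e = sum(ref) / n, sum(est) / n
    cov = sum((r - mean_r) * (e - mean_e) for r, e in zip(ref, est))
    var_r = sum((r - mean_r) ** 2 for r in ref)
    var_e = sum((e - mean_e) ** 2 for e in est)
    return cov / math.sqrt(var_r * var_e)

def psnr(ref, est):
    """PSNR in dB; the peak is taken as the reference's peak-to-peak range (assumption)."""
    peak = max(ref) - min(ref)
    mse = rmse(ref, est) ** 2
    return float("inf") if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Made-up reference signal and a denoised estimate that is off by 0.1 per sample.
clean = [0.0, 1.0, 2.0, 1.0, 0.0]
denoised = [0.1, 0.9, 2.1, 0.9, 0.1]

error = rmse(clean, denoised)
correlation = pcc(clean, denoised)
quality_db = psnr(clean, denoised)
```

Lower RMSE and higher PCC and PSNR indicate a reconstruction closer to the reference.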
Reducing electrophysiological (EP) signal noise is essential for diagnosis, mapping, and ablation procedures in patients with arrhythmias or conditions such as cardiomyopathies. However, traditional approaches have been suboptimal due to the varied sources of noise. We hypothesized that variational autoencoders (VAEs) can learn key components of ’clean’ electrophysiological signals by creating robust internal representations, thereby enabling automatic denoising of diverse noise in clinical recordings. We applied a β-VAE model to a dataset of 5706 intra-ventricular monophasic action potential (MAP) signals, selected because their morphology is verifiable and measurable against a reference, from 42 patients with ischemic cardiomyopathy at risk for sudden death. We designed a noise library and implemented baselines based on state-of-the-art clinical filtering techniques. The proposed β-VAE model was assessed for various noise types, including challenging non-stationary real EP noise. Comprehensive evaluation using general metrics and clinical action potential duration labels by domain experts revealed that our β-VAE outperformed current state-of-the-art filters in denoising efficacy, with key physiological information encoded in the reconstruction. We performed a sensitivity analysis that confirmed the robustness of the β-VAE model to increasing noise levels. These results demonstrate the ability of our model to remove noise from various sources, including those of time-varying nature. The application to well-studied MAPs verifies that clinically meaningful features were reconstructed in the EP context. This work enhances traditional signal processing approaches to ensure ’clean’ electrical signals, and may have promising applications for diagnosis, tracking therapy and prognostication in patients with EP disorders in real-world clinical environments.
Authors: Samuel Ruipérez-Campillo, Alain Ryser, Thomas M Sutter, Brototo Deb, Ruibin Feng, Prasanth Ganesan, Kelly A Brennan, Albert J Rogers, Maarten ZH Kolk, Fleur VY Tjong, Sanjiv M Narayan†, Julia E Vogt† († denotes shared last authorship)
Submitted: Expert Systems with Applications
Date: 05.11.2025
Background: Timely and accurate detection of arrhythmias from electrocardiograms (ECGs) is crucial for improving patient outcomes. While artificial intelligence (AI)-based ECG classification has shown promising results, limited transparency and interpretability often impede clinical adoption. Methods: We present ECG-XPLAIM, a novel deep learning model dedicated to ECG classification that employs a one-dimensional inception-style convolutional architecture to capture local waveform features (e.g., waves and intervals) and global rhythm patterns. To enhance interpretability, we integrate Grad-CAM visualization, highlighting key waveform segments that drive the model's predictions. ECG-XPLAIM was trained on the MIMIC-IV dataset and externally validated on PTB-XL for multiple arrhythmias, including atrial fibrillation (AFib), sinus tachycardia (STach), conduction disturbances (RBBB, LBBB, LAFB), long QT (LQT), Wolff-Parkinson-White (WPW) pattern, and paced rhythm detection. We evaluated performance using sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC), and benchmarked against a simplified convolutional neural network, a two-layer gated recurrent unit (GRU), and an external, pre-trained, ResNet-based model. Results: Internally (MIMIC-IV), ECG-XPLAIM achieved high diagnostic performance (sensitivity, specificity, AUROC > 0.9) across most tasks. External evaluation (PTB-XL) confirmed generalizability, with metric values exceeding 0.95 for AFib and STach. For conduction disturbances, macro-averaged sensitivity reached 0.90, specificity 0.95, and AUROC 0.98. Performance for LQT, WPW, and pacing rhythm detection was 0.691/0.864/0.878, 0.773/0.973/0.895, and 0.96/0.988/0.993 (sensitivity/specificity/AUROC), respectively. Compared to baseline models, ECG-XPLAIM offered superior performance across most tests, and improved sensitivity over the external ResNet-based model, albeit at the cost of specificity. 
Grad-CAM revealed physiologically relevant ECG segments influencing predictions and highlighted patterns of potential misclassification. Conclusion: ECG-XPLAIM combines high diagnostic performance with interpretability, addressing a key limitation in AI-driven ECG analysis. The open-source release of ECG-XPLAIM's architecture and pre-trained weights encourages broader adoption, external validation, and further refinement for diverse clinical applications.
Authors: Panteleimon Pantelidis*, Samuel Ruipérez-Campillo*, Julia E Vogt, Alexios Antonopoulos, Ioannis Gialamas, George E Zakynthinos, Michael Spartalis, Polychronis Dilaveris, Jose Millet, Panagiotis Papapetrou, Theodore G Papaioannou, Evangelos Oikonomou, Gerasimos Siasos (* denotes shared first authorship)
Submitted: Frontiers in Cardiovascular Medicine
Date: 16.10.2025
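The ECG-XPLAIM abstract above reports results as sensitivity/specificity/AUROC triplets. As a reminder of what these numbers mean, here is a minimal sketch on made-up labels and scores. The function names, the data, and the 0.5 decision threshold are hypothetical and unrelated to the ECG-XPLAIM code or results.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up ground truth (1 = arrhythmia present) and model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

# Hypothetical decision threshold of 0.5 turns scores into binary predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auroc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds. This is why a model can gain sensitivity at the cost of specificity while its AUROC stays similar.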
We present RadVLM, a compact (7B) multitask conversational foundation model designed for chest X-ray (CXR) interpretation. Its development relies on the curation of a large-scale instruction dataset comprising over 1 million image-instruction pairs containing both single-turn tasks - such as report generation, abnormality classification, and visual grounding - and multi-turn, multi-task conversational interactions. Our experiments show that RadVLM, fine-tuned on this instruction dataset, achieves state-of-the-art performance in conversational capabilities and visual grounding while remaining competitive in other radiology tasks (report generation, classification). Ablation studies further highlight the benefit of joint training across multiple tasks, particularly for scenarios with limited annotated data. Together, these findings highlight the potential of the RadVLM model as a clinically relevant AI assistant, providing structured CXR interpretation and conversational capabilities to support more effective and accessible diagnostic workflows.
Authors: Nicolas Deperrois, Hidetoshi Matsuo, Samuel Ruipérez-Campillo, Moritz Vandenhirtz, Sonia Laguna, Alain Ryser, Koji Fujimoto, Mizuho Nishio, Thomas Sutter, Julia Vogt, Jonas Kluckert, Thomas Frauenfelder, Christian Bluethgen, Farhad Nooralahzadeh, Michael Krauthammer
Submitted: Physionet
Date: 08.10.2025


