Dr.

Laura Manduchi

Alumni

E-Mail: laura.manduchi@inf.ethz.ch

I am interested in representation learning, probabilistic modelling, clustering, and deep learning. In particular, I am keen on applying machine learning methods to tackle medical problems and discover new relationships in medical data.

I did my undergraduate studies in Information Engineering at the University of Padua, Italy, where I worked with Prof. Dr. Fabio Vandin on the optimization of the Fast Westfall-Young algorithm for mining significant patterns. I then obtained an M.Sc. in Data Science at ETH Zürich, where I acquired a strong background in Machine Learning. My Master’s thesis project, under the supervision of Prof. Dr. Gunnar Rätsch, focused on the intersection between clustering and representation learning. After that, I did a research internship at the European Space Agency, where I had the opportunity to apply state-of-the-art Machine Learning methods in astrophysics. In February 2020 I joined the Medical Data Science lab led by Prof. Dr. Julia Vogt at ETH as a PhD student.

Further information about my research activities can be found here.

Jorge Silva Gonçalves, Laura Manduchi, Moritz Vandenhirtz, Julia E Vogt
TreeDiffusion: Hierarchical Generative Clustering for Conditional Diffusion
Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Generative modeling and clustering are conventionally distinct tasks in machine learning. Variational Autoencoders (VAEs) have been widely explored for their ability to integrate both, providing a framework for generative clustering. However, while VAEs can learn meaningful cluster representations in latent space, they often struggle to generate high-quality samples. This paper addresses this problem by introducing TreeDiffusion, a deep generative model that conditions diffusion models on learned latent hierarchical cluster representations from a VAE to obtain high-quality, cluster-specific generations. Our approach consists of two steps: first, a VAE-based clustering model learns a hierarchical latent representation of the data. Second, a cluster-aware diffusion model generates realistic images conditioned on the learned hierarchical structure. We systematically compare the generative capabilities of our approach with those of alternative conditioning strategies. Empirically, we demonstrate that conditioning diffusion models on hierarchical cluster representations improves the generative performance on real-world datasets compared to other approaches. Moreover, a key strength of our method lies in its ability to generate images that are both representative and specific to each cluster, enabling more detailed visualization of the learned latent structure. Our approach addresses the generative limitations of VAE-based clustering approaches by leveraging their learned structure, thereby advancing the field of generative clustering.

Authors

Jorge Silva Gonçalves, Laura Manduchi, Moritz Vandenhirtz, Julia E Vogt

Submitted

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Date

04.10.2025


Jorge da Silva Goncalves, Laura Manduchi, Moritz Vandenhirtz, Julia E. Vogt
Hierarchical Clustering for Conditional Diffusion in Image Generation
arXiv

Abstract

Finding clusters of data points with similar characteristics and generating new cluster-specific samples can significantly enhance our understanding of complex data distributions. While clustering has been widely explored using Variational Autoencoders, these models often lack generation quality in real-world datasets. This paper addresses this gap by introducing TreeDiffusion, a deep generative model that conditions Diffusion Models on hierarchical clusters to obtain high-quality, cluster-specific generations. The proposed pipeline consists of two steps: a VAE-based clustering model that learns the hierarchical structure of the data, and a conditional diffusion model that generates realistic images for each cluster. We propose this two-stage process to ensure that the generated samples remain representative of their respective clusters and enhance image fidelity to the level of diffusion models. A key strength of our method is its ability to create images for each cluster, providing better visualization of the learned representations by the clustering model, as demonstrated through qualitative results. This method effectively addresses the generative limitations of VAE-based approaches while preserving their clustering performance. Empirically, we demonstrate that conditioning diffusion models on hierarchical clusters significantly enhances generative performance, thereby advancing the state of generative clustering models.

Authors

Jorge da Silva Goncalves, Laura Manduchi, Moritz Vandenhirtz, Julia E. Vogt

Submitted

arXiv

Date

22.10.2024


Jorge da Silva Gonçalves, Laura Manduchi, Moritz Vandenhirtz, Julia E. Vogt
Structured Generations: Using Hierarchical Clusters to guide Diffusion Models
ICML 2024 Workshop on Structured Probabilistic Inference & Generative Modeling

Abstract

This paper introduces Diffuse-TreeVAE, a deep generative model that integrates hierarchical clustering into the framework of Denoising Diffusion Probabilistic Models (DDPMs). The proposed approach generates new images by sampling from a root embedding of a learned latent tree VAE-based structure, propagating through hierarchical paths, and using a second-stage DDPM to refine and generate distinct, high-quality images for each data cluster. The result is a model that not only improves image clarity but also ensures that the generated samples are representative of their respective clusters, addressing the limitations of previous VAE-based methods and advancing the state of clustering-based generative modeling.

Authors

Jorge da Silva Gonçalves, Laura Manduchi, Moritz Vandenhirtz, Julia E. Vogt

Submitted

ICML 2024 Workshop on Structured Probabilistic Inference & Generative Modeling

Date

27.07.2024


Emanuele Palumbo, Laura Manduchi, Sonia Laguna, Daphne Chopard, Julia E Vogt
Deep generative clustering with multimodal diffusion variational autoencoders
ICLR: The Twelfth International Conference on Learning Representations

Abstract

Multimodal VAEs have recently gained significant attention as generative models for weakly-supervised learning with multiple heterogeneous modalities. In parallel, VAE-based methods have been explored as probabilistic approaches for clustering tasks. At the intersection of these two research directions, we propose a novel multimodal VAE model in which the latent space is extended to learn data clusters, leveraging shared information across modalities. Our experiments show that our proposed model improves generative performance over existing multimodal VAEs, particularly for unconditional generation. Furthermore, we propose a post-hoc procedure to automatically select the number of true clusters thus mitigating critical limitations of previous clustering frameworks. Notably, our method favorably compares to alternative clustering approaches, in weakly-supervised settings. Finally, we integrate recent advancements in diffusion models into the proposed method to improve generative quality for real-world images.

Authors

Emanuele Palumbo, Laura Manduchi, Sonia Laguna, Daphne Chopard, Julia E Vogt

Submitted

ICLR: The Twelfth International Conference on Learning Representations

Date

17.05.2024


Hanna Ragnarsdottir^*, Ece Özkan Elsen^*, Holger Michel^*, Kieran Chin-Cheong, Laura Manduchi, Sven Wellmann^†, Julia E. Vogt^†
^* denotes shared first authorship, ^† denotes shared last authorship
Deep Learning Based Prediction of Pulmonary Hypertension in Newborns Using Echocardiograms
International Journal of Computer Vision

Abstract

Pulmonary hypertension (PH) in newborns and infants is a complex condition associated with several pulmonary, cardiac, and systemic diseases contributing to morbidity and mortality. Thus, accurate and early detection of PH and the classification of its severity is crucial for appropriate and successful management. Using echocardiography, the primary diagnostic tool in pediatrics, human assessment is both time-consuming and expertise-demanding, raising the need for an automated approach. Little effort has been directed towards automatic assessment of PH using echocardiography, and the few proposed methods only focus on binary PH classification on the adult population. In this work, we present an explainable multi-view video-based deep learning approach to predict and classify the severity of PH for a cohort of 270 newborns using echocardiograms. We use spatio-temporal convolutional architectures for the prediction of PH from each view, and aggregate the predictions of the different views using majority voting. Our results show a mean F1-score of 0.84 for severity prediction and 0.92 for binary detection using 10-fold cross-validation and 0.63 for severity prediction and 0.78 for binary detection on the held-out test set. We complement our predictions with saliency maps and show that the learned model focuses on clinically relevant cardiac structures, motivating its usage in clinical practice. To the best of our knowledge, this is the first work for an automated assessment of PH in newborns using echocardiograms.
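The aggregation step mentioned in the abstract, majority voting over per-view predictions, can be illustrated with a minimal sketch. The label names, the number of views, and the tie-breaking rule (first label seen wins) are assumptions made for illustration; the abstract does not specify them, and this is not the authors' code.

```python
from collections import Counter


def majority_vote(view_predictions):
    """Return the label predicted by the largest number of views.

    Ties are broken in favour of the tied label that appears first in the
    input list (an assumption; the abstract does not state a tie rule).
    """
    if not view_predictions:
        raise ValueError("need at least one view prediction")
    counts = Counter(view_predictions)
    best = max(counts.values())
    for label in view_predictions:
        if counts[label] == best:
            return label


# Hypothetical severity labels predicted from five echocardiogram views.
print(majority_vote(["mild", "none", "mild", "severe", "mild"]))
```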

Authors

Hanna Ragnarsdottir^*, Ece Özkan Elsen^*, Holger Michel^*, Kieran Chin-Cheong, Laura Manduchi, Sven Wellmann^†, Julia E. Vogt^†
^* denotes shared first authorship, ^† denotes shared last authorship

Submitted

International Journal of Computer Vision

Date

06.02.2024


Laura Manduchi^*, Moritz Vandenhirtz^*, Alain Ryser, Julia E. Vogt
^* denotes shared first authorship
Tree Variational Autoencoders
Spotlight at Neural Information Processing Systems, NeurIPS 2023

Abstract

We propose Tree Variational Autoencoder (TreeVAE), a new generative hierarchical clustering model that learns a flexible tree-based posterior distribution over latent variables. TreeVAE hierarchically divides samples according to their intrinsic characteristics, shedding light on hidden structures in the data. It adapts its architecture to discover the optimal tree for encoding dependencies between latent variables. The proposed tree-based generative architecture enables lightweight conditional inference and improves generative performance by utilizing specialized leaf decoders. We show that TreeVAE uncovers underlying clusters in the data and finds meaningful hierarchical relations between the different groups on a variety of datasets, including real-world imaging data. We present empirically that TreeVAE provides a more competitive log-likelihood lower bound than the sequential counterparts. Finally, due to its generative nature, TreeVAE is able to generate new samples from the discovered clusters via conditional sampling.

Authors

Laura Manduchi^*, Moritz Vandenhirtz^*, Alain Ryser, Julia E. Vogt
^* denotes shared first authorship

Submitted

Spotlight at Neural Information Processing Systems, NeurIPS 2023

Date

20.12.2023


Laura Manduchi^*, Moritz Vandenhirtz^*, Alain Ryser, Julia E. Vogt
^* denotes shared first authorship
Tree Variational Autoencoders
ICML 2023 Workshop on Structured Probabilistic Inference & Generative Modeling

Abstract

We propose a new generative hierarchical clustering model that learns a flexible tree-based posterior distribution over latent variables. The proposed Tree Variational Autoencoder (TreeVAE) hierarchically divides samples according to their intrinsic characteristics, shedding light on hidden structures in the data. It adapts its architecture to discover the optimal tree for encoding dependencies between latent variables, improving generative performance. We show that TreeVAE uncovers underlying clusters in the data and finds meaningful hierarchical relations between the different groups on several datasets. Due to its generative nature, TreeVAE can generate new samples from the discovered clusters via conditional sampling.

Authors

Laura Manduchi^*, Moritz Vandenhirtz^*, Alain Ryser, Julia E. Vogt
^* denotes shared first authorship

Submitted

ICML 2023 Workshop on Structured Probabilistic Inference & Generative Modeling

Date

30.06.2023


Abstract

Constrained clustering has gained significant attention in the field of machine learning as it can leverage prior information on a growing amount of only partially labeled data. Following recent advances in deep generative models, we propose a novel framework for constrained clustering that is intuitive, interpretable, and can be trained efficiently in the framework of stochastic gradient variational inference. By explicitly integrating domain knowledge in the form of probabilistic relations, our proposed model (DC-GMM) uncovers the underlying distribution of data conditioned on prior clustering preferences, expressed as pairwise constraints. These constraints guide the clustering process towards a desirable partition of the data by indicating which samples should or should not belong to the same cluster. We provide extensive experiments to demonstrate that DC-GMM shows superior clustering performances and robustness compared to state-of-the-art deep constrained clustering methods on a wide range of data sets. We further demonstrate the usefulness of our approach on two challenging real-world applications.
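To make the notion of pairwise constraints concrete, the sketch below checks a hard cluster assignment against must-link and cannot-link pairs. It only illustrates what such constraints express; it is not the DC-GMM model, which the abstract describes as integrating them probabilistically. The function name and the example data are hypothetical.

```python
def count_violations(assignments, must_link, cannot_link):
    """Count violated pairwise constraints for a hard clustering.

    assignments: list where assignments[i] is the cluster of sample i.
    must_link:   pairs (i, j) that should share a cluster.
    cannot_link: pairs (i, j) that should be in different clusters.
    Returns (violated_must_links, violated_cannot_links).
    """
    ml = sum(1 for i, j in must_link if assignments[i] != assignments[j])
    cl = sum(1 for i, j in cannot_link if assignments[i] == assignments[j])
    return ml, cl


# Hypothetical example: four samples in two clusters.
assignments = [0, 0, 1, 1]
must_link = [(0, 1), (1, 2)]    # (1, 2) is violated: clusters 0 and 1
cannot_link = [(0, 3), (2, 3)]  # (2, 3) is violated: both in cluster 1
print(count_violations(assignments, must_link, cannot_link))
```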

Authors

Laura Manduchi, Kieran Chin-Cheong, Holger Michel, Sven Wellmann, Julia E. Vogt

Submitted

Accepted at NeurIPS 2021

Date

14.12.2021

Laura Manduchi^*, Ricards Marcinkevics^*, Julia E. Vogt
^* denotes shared first authorship
A Deep Variational Approach to Clustering Survival Data
Contributed talk at AI for Public Health Workshop at ICLR 2021

Abstract

Survival analysis has gained significant attention in the medical domain with many far-reaching applications. Although a variety of machine learning methods have been introduced for tackling time-to-event prediction in unstructured data with complex dependencies, clustering of survival data remains an under-explored problem. The latter is particularly helpful in discovering patient subpopulations whose survival is regulated by different generative mechanisms, a critical problem in precision medicine. To this end, we introduce a novel probabilistic approach to cluster survival data in a variational deep clustering setting. Our proposed method employs a deep generative model to uncover the underlying distribution of both the explanatory variables and the potentially censored survival times. We compare our model to the related work on survival clustering in comprehensive experiments on a range of synthetic, semi-synthetic, and real-world datasets. Our proposed method performs better at identifying clusters and is competitive at predicting survival times in terms of the concordance index and relative absolute error.
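For readers unfamiliar with the concordance index used as an evaluation metric above, here is a minimal sketch of Harrell's C-index for right-censored data. It assumes that a higher risk score should correspond to a shorter survival time and that ties in risk score count as half concordant; the exact variant used in the paper is not stated in the abstract, and the example data are made up.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when subject i has the shorter time and
    its event was observed (events[i] is truthy). The pair is concordant
    when subject i also has the higher risk score; ties in risk count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: the third subject is censored (event = 0).
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.4, 0.1]  # higher risk for shorter survival
print(concordance_index(times, events, risk))
```

A value of 1.0 means every comparable pair is ranked correctly, 0.5 corresponds to random ranking, and 0.0 to a fully reversed ranking.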

Authors

Laura Manduchi^*, Ricards Marcinkevics^*, Julia E. Vogt
^* denotes shared first authorship

Submitted

Contributed talk at AI for Public Health Workshop at ICLR 2021

Date

09.04.2021


Laura Manduchi, Matthias Hüser, Martin Faltys, Julia Vogt, Gunnar Rätsch, Vincent Fortuin
T-DPSOM - An Interpretable Clustering Method for Unsupervised Learning of Patient Health States
ACM CHIL 2021

Abstract

Generating interpretable visualizations of multivariate time series in the intensive care unit is of great practical importance. Clinicians seek to condense complex clinical observations into intuitively understandable critical illness patterns, like failures of different organ systems. They would greatly benefit from a low-dimensional representation in which the trajectories of the patients’ pathology become apparent and relevant health features are highlighted. To this end, we propose to use the latent topological structure of Self-Organizing Maps (SOMs) to achieve an interpretable latent representation of ICU time series and combine it with recent advances in deep clustering. Specifically, we (a) present a novel way to fit SOMs with probabilistic cluster assignments (PSOM), (b) propose a new deep architecture for probabilistic clustering (DPSOM) using a VAE, and (c) extend our architecture to cluster and forecast clinical states in time series (T-DPSOM). We show that our model achieves superior clustering performance compared to state-of-the-art SOM-based clustering methods while maintaining the favorable visualization properties of SOMs. On the eICU data-set, we demonstrate that T-DPSOM provides interpretable visualizations of patient state trajectories and uncertainty estimation. We show that our method rediscovers well-known clinical patient characteristics, such as a dynamic variant of the Acute Physiology And Chronic Health Evaluation (APACHE) score. Moreover, we illustrate how it can disentangle individual organ dysfunctions on disjoint regions of the two-dimensional SOM map.

Authors

Laura Manduchi, Matthias Hüser, Martin Faltys, Julia Vogt, Gunnar Rätsch, Vincent Fortuin

Submitted

ACM CHIL 2021

Date

04.03.2021


Fabian Laumer, Gabriel Fringeli, Alina Dubatovka, Laura Manduchi, Joachim M. Buhmann
DeepHeartBeat: Latent trajectory learning of cardiac cycles using cardiac ultrasounds
best newcomer award + spotlight talk at Machine Learning for Health Workshop, NeurIPS 2020

Abstract

Echocardiography monitors the heart movement for noninvasive diagnosis of heart diseases. It proves to be of profound practical importance as it combines low-cost portable instrumentation and rapid image acquisition without the risks of ionizing radiation. However, echocardiograms produce high-dimensional, noisy data which frequently proved difficult to interpret. As a solution, we propose a novel autoencoder-based framework, DeepHeartBeat, to learn human interpretable representations of cardiac cycles from cardiac ultrasound data. Our model encodes high dimensional observations by a cyclic trajectory in a lower dimensional space. We show that the learned parameters describing the latent trajectory are well interpretable and we demonstrate the versatility of our model by successfully applying it to various cardiologically relevant tasks, such as ejection fraction prediction and arrhythmia detection. As a result, DeepHeartBeat promises to serve as a valuable assistant tool for automating therapy decisions and guiding clinical care.

Authors

Fabian Laumer, Gabriel Fringeli, Alina Dubatovka, Laura Manduchi, Joachim M. Buhmann

Submitted

best newcomer award + spotlight talk at Machine Learning for Health Workshop, NeurIPS 2020

Date

01.12.2020


Laura Manduchi, Matthias Hueser, Gunnar Raetsch, Vincent Fortuin
Temporal PSOM - An Interpretable Clustering Method for Tracking Health States in the ICU
ML4H Workshop, NeurIPS 2019

Abstract

Self-organizing maps (SOMs) have been widely used as a means to visualize latent structure in large amounts of heterogeneous data, in particular as a clustering method. Relatively little work, however, has focused on combining SOMs with deep generative networks for modeling health states, which arise for example in the intensive care unit (ICU). We present Temporal PSOM, a novel neural network architecture that jointly trains a Variational Autoencoder for feature extraction and a probabilistic version of SOM to achieve an interpretable discrete representation of patient health states in the ICU. Experiments on the publicly available eICU data set show significant improvements over state-of-the-art methods in terms of cluster enrichment for current APACHE physiology scores as well as prediction of future physiology states.

Authors

Laura Manduchi, Matthias Hueser, Gunnar Raetsch, Vincent Fortuin

Submitted

ML4H Workshop, NeurIPS 2019

Date

15.12.2019
