Dr. Kacper Sokol

Postdoc

E-Mail
kacper.sokol@inf.ethz.ch
Address
Department of Computer Science
CAB G 33.3
Universitätstr. 6
CH-8092 Zurich, Switzerland
Room
CAB G 33.3

I joined the Medical Data Science group as a Research Fellow in May 2023. My main research focus is the transparency (interpretability and explainability) of data-driven predictive systems based on artificial intelligence and machine learning algorithms.

In the past I worked on enhancing the transparency of predictive models with feasible and actionable counterfactual explanations and robust modular surrogate explainers. I also introduced Explainability Fact Sheets, a comprehensive taxonomy of AI and ML explainers, and prototyped dialogue-driven interactive explainability systems. Additionally, I designed and led the development of FAT Forensics, an open-source fairness, accountability and transparency Python toolkit.

I hold a Master's degree in Mathematics and Computer Science and a doctorate in Computer Science from the University of Bristol, United Kingdom. Before joining the Medical Data Science group at ETH I was a Research Fellow at the ARC Centre of Excellence for Automated Decision-Making and Society, affiliated with RMIT University in Melbourne, Australia. Prior to that I held numerous research positions at the University of Bristol, working on projects such as REFrAMe, SPHERE and the European Union's AI Research Excellence Centre TAILOR.

Authors

Ričards Marcinkevičs*, Kacper Sokol*, Akhil Paulraj, Melinda A. Hilbert, Vivien Rimili, Sven Wellmann, Christian Knorr, Bertram Reingruber, Julia E. Vogt, Patricia Reis Wolfertstetter
* denotes shared first authorship, † denotes shared last authorship

Submitted

medRxiv

Date

29.10.2024


Abstract

Despite significant progress, evaluation of explainable artificial intelligence remains elusive and challenging. In this paper we propose a fine-grained validation framework that is not overly reliant on any one facet of these sociotechnical systems, and that recognises their inherent modular structure: technical building blocks, user-facing explanatory artefacts and social communication protocols. While we concur that user studies are invaluable in assessing the quality and effectiveness of explanation presentation and delivery strategies from the explainees' perspective in a particular deployment context, the underlying explanation generation mechanisms require a separate, predominantly algorithmic validation strategy that accounts for the technical and human-centred desiderata of their (numerical) outputs. Such a comprehensive sociotechnical utility-based evaluation framework could allow us to systematically reason about the properties and downstream influence of different building blocks from which explainable artificial intelligence systems are composed – accounting for a diverse range of their engineering and social aspects – in view of the anticipated use case.

Authors

Kacper Sokol, Julia E. Vogt

Submitted

Extended Abstracts of the 2024 ACM Conference on Human Factors in Computing Systems (CHI)

Date

02.05.2024


Abstract

Ante-hoc interpretability has become the holy grail of explainable artificial intelligence for high-stakes domains such as healthcare; however, this notion is elusive, lacks a widely-accepted definition and depends on the operational context. It can refer to predictive models whose structure adheres to domain-specific constraints, or ones that are inherently transparent. The latter conceptualisation assumes observers who judge this quality, whereas the former presupposes them to have technical and domain expertise (thus alienating other groups of explainees). Additionally, the distinction between ante-hoc interpretability and the less desirable post-hoc explainability, which refers to methods that construct a separate explanatory model, is vague given that transparent predictive models may still require (post-)processing to yield suitable explanatory insights. Ante-hoc interpretability is thus an overloaded concept that comprises a range of implicit properties, which we unpack in this paper to better understand what is needed for its safe deployment across high-stakes domains. To this end, we outline modelling and explaining desiderata that allow us to navigate its distinct realisations in view of the envisaged application and audience.

Authors

Kacper Sokol, Julia E. Vogt

Submitted

Workshop on Interpretable ML in Healthcare at 2023 International Conference on Machine Learning (ICML)

Date

28.07.2023


Abstract

Counterfactual explanations are the de facto standard when tasked with interpreting decisions of (opaque) predictive models. Their generation is often subject to algorithmic and domain-specific constraints – such as density-based feasibility for the former and attribute (im)mutability or directionality of change for the latter – that aim to maximise their real-life utility. In addition to desiderata with respect to the counterfactual instance itself, the existence of a viable path connecting it with the factual data point, known as algorithmic recourse, has become an important technical consideration. While both of these requirements ensure that the steps of the journey as well as its destination are admissible, current literature neglects the multiplicity of such counterfactual paths. To address this shortcoming we introduce the novel concept of explanatory multiverse, which encompasses all the possible counterfactual journeys, and show how to navigate, reason about and compare the geometry of these paths – their affinity, branching, divergence and possible future convergence – with two methods: vector spaces and graphs. Implementing this (interactive) explanatory process grants explainees more agency by allowing them to select counterfactuals based on the properties of the journey leading to them in addition to their absolute differences.
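
The graph-based representation mentioned in this abstract can be pictured with a minimal sketch, assuming a handful of toy feature-space states and the networkx library; it illustrates the general idea of comparing counterfactual journeys rather than the method described in the paper.

import networkx as nx

# Toy feature-space states (assumed for illustration): "f" is the factual
# instance; "cf1" and "cf2" are two admissible counterfactuals reached via
# partly overlapping journeys.
G = nx.DiGraph()
G.add_edges_from([
    ("f", "a"),    # first step, shared by both journeys
    ("a", "b"),    # the journeys branch at "a"
    ("a", "c"),
    ("b", "cf1"),
    ("c", "d"),
    ("d", "cf2"),
])

# Branching: how many onward steps each state offers.
branching = {node: G.out_degree(node) for node in G.nodes}
print("branching:", branching)

# Journey length as a simple proxy for the effort of reaching each counterfactual.
p1 = nx.shortest_path(G, source="f", target="cf1")
p2 = nx.shortest_path(G, source="f", target="cf2")
print("journey to cf1:", p1, "steps:", len(p1) - 1)
print("journey to cf2:", p2, "steps:", len(p2) - 1)

# Affinity of the two journeys: the states they share before diverging.
shared_prefix = []
for u, v in zip(p1, p2):
    if u != v:
        break
    shared_prefix.append(u)
print("shared prefix:", shared_prefix)

In a realistic setting the nodes would correspond to (intermediate) data points produced by a counterfactual generator and the edges would carry feasibility or cost weights, so that journeys could be compared on more than their length.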

Authors

Kacper Sokol, Edward Small, Yueqing Xuan

Submitted

Workshop on Counterfactuals in Minds and Machines at 2023 International Conference on Machine Learning (ICML)

Date

28.07.2023
