Causal Modeling and Machine Learning


The Causal Data Science SIG will hold its next event on Thursday, May 16, from 15.00 to 17.00. The topic will be Causal Modeling and Machine Learning. In this session we will have two guest speakers, Thijs van Ommen (UU) and Wouter van Amsterdam (UMCU), who will discuss discovering causal graphs from data and using prediction models for decision making, respectively. There will be plenty of time for questions and discussion and, as usual, a borrel afterwards.

1. Causality and prediction: developing and validating models for decision making (Wouter van Amsterdam)

Causal inference and prediction are often thought of as separate tasks, where causal inference is taken to mean estimating a parameter of a distribution and prediction means estimating a conditional expectation. However, in many settings, such as health care and advertising, prediction models are increasingly used for decision making.
In the first part of this talk, I show how naively using prediction models for decision making can lead to harmful self-fulfilling prophecies: models that cause harm when used for decision making but, paradoxically, are still found to predict accurately in validation studies. In the second part, I describe how we might avoid these situations and instead develop and validate models that are useful for decision support. Models for decision support should be evaluated by treating the introduction of the model as an intervention that changes the treatment policy. How, then, can we develop models that lead to good treatment policies? Prediction-under-intervention models bring causal inference and prediction together by viewing the prediction estimands as conditional expectations with a do-expression. These models provide a rational basis for decision making. Finally, I describe several challenges in developing and validating prediction-under-intervention models, along with some mitigation strategies.

2. Causal discovery in the presence of unobserved confounding (Thijs van Ommen)

A causal graph is a useful visual aid that displays the causal relations between a given set of random variables by showing an arrow whenever one variable is a direct cause of another. Causal discovery is the problem of determining this graph structure from observational data. It may be surprising that this is possible at all. It is possible because of a key insight: if one variable is not a direct cause of another in a hypothesized causal graph, this implies certain (conditional) independences between those variables. By testing these independences in the data, we may be able to reject many hypothesized causal graphs.
 
In many practical settings, we need to acknowledge the possibility of unmeasured confounding between certain variables. To reflect this, our list of candidate hypotheses should include causal graphs that contain unobserved confounding variables. Conditional independence no longer suffices to tell such graphs apart, and more general types of statistical implications become relevant. I will give a brief overview of two generalizations of conditional independence from current research, namely nested Markov constraints and algebraic constraints, and of the settings in which they are known to capture all statistical implications relevant for causal discovery.

Start date and time: Thursday, May 16, 15.00
End date and time: Thursday, May 16, 17.00
Location: Marinus Ruppert Building A
Entrance fee: Free
Registration: Not necessary.