ACRE: Abstract Causal REasoning Beyond Covariation

Chi Zhang¹, Baoxiong Jia¹, Mark Edmonds¹, Song-Chun Zhu¹, Yixin Zhu¹
¹ Center for Vision, Cognition, Learning, and Autonomy (VCLA), UCLA


Causal induction, i.e., identifying unobservable mechanisms that lead to the observable relations among variables, has played a pivotal role in modern scientific discovery, especially in scenarios with only sparse and limited data. Humans, even young toddlers, can induce causal relationships surprisingly well in various settings despite the notorious difficulty of the task. However, in contrast to this commonplace trait of human cognition, there is no diagnostic benchmark to measure causal induction in modern Artificial Intelligence (AI) systems. Therefore, in this work, we introduce the Abstract Causal REasoning (ACRE) dataset for systematic evaluation of current vision systems in causal induction. Motivated by the stream of research on causal discovery in Blicket experiments, we query a visual reasoning system with the following four types of questions in either an independent scenario or an interventional scenario: direct, indirect, screening-off, and backward-blocking, intentionally going beyond the simple strategy of inducing causal relationships by covariation. By analyzing visual reasoning architectures on this testbed, we notice that pure neural models perform at chance level and tend towards an associative strategy, whereas neuro-symbolic combinations struggle in backward-blocking reasoning. These deficiencies call for future research on models with a more comprehensive capability for causal induction.

Selected Figures

Fig. 1. A sample problem in ACRE. Of the 6 context trials, we devote the first set of 3 panels to an introduction to the Blicket machinery and allow more complex configurations in the second set of panels. Queries are either on independent objects or on interventional combinations for an existing trial. In this example, the first query tests causal reasoning from direct evidence, as the gray cube is independently tested and always associated with an activated machine. The second query requires comparing the fourth and fifth trials to realize that the Blicket machine is activated by the cube, not the cylinder, based on indirect evidence. As such, we infer that the red and green cylinders in the sixth trial may not activate the machine because the purple cube can already do so; although they are associated only with an activated machine, their Blicketness is backward-blocked in the interventional trial. The cyan cube is screened-off by the gray cube’s Blicketness from probabilistically activating the machine. Of note, the screening-off and the backward-blocking cases cannot be solved by covariation.
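The gap between covariation and the reasoning described in Fig. 1 can be illustrated with a minimal Python sketch. It is not the evaluation code or any model from the paper: the trials below are hypothetical, only loosely modeled on the caption, and the labeling rule assumes a deterministic machine that activates whenever at least one Blicket is present.

```python
from typing import Dict, List, Set, Tuple

Trial = Tuple[Set[str], bool]  # (objects on the machine, machine activated?)

# Hypothetical context trials (an assumption for illustration, not ACRE data).
TRIALS: List[Trial] = [
    ({"gray_cube"}, True),                                    # direct evidence
    ({"gray_cube", "cyan_cube"}, True),                       # cyan is screened off
    ({"yellow_cylinder"}, False),
    ({"purple_cube", "yellow_cylinder"}, True),               # indirect evidence
    ({"purple_cube", "red_cylinder", "green_cylinder"}, True),
]


def covariation_scores(trials: List[Trial]) -> Dict[str, float]:
    """Fraction of trials containing each object in which the machine is on."""
    objects = sorted(set().union(*(objs for objs, _ in trials)))
    scores = {}
    for obj in objects:
        outcomes = [on for objs, on in trials if obj in objs]
        scores[obj] = sum(outcomes) / len(outcomes)
    return scores


def label_objects(trials: List[Trial]) -> Dict[str, str]:
    """Label objects under the deterministic, disjunctive assumption."""
    all_objects = set().union(*(objs for objs, _ in trials))
    # Any object seen on an inactive machine cannot be a Blicket.
    non_blickets = set().union(*(objs for objs, on in trials if not on))
    blickets: Set[str] = set()
    for objs, on in trials:
        candidates = objs - non_blickets
        # An activated machine with a single remaining candidate pins it down.
        if on and len(candidates) == 1:
            blickets |= candidates
    labels = {}
    for obj in all_objects:
        if obj in blickets:
            labels[obj] = "blicket"
        elif obj in non_blickets:
            labels[obj] = "non-blicket"
        else:
            labels[obj] = "undetermined"
    return labels


scores = covariation_scores(TRIALS)
labels = label_objects(TRIALS)
for name in sorted(labels):
    print(f"{name:16s} covariation={scores[name]:.2f} label={labels[name]}")
```

In this toy setup the cyan cube and the red and green cylinders all receive a perfect covariation score, yet every activation they appear in is already explained by a known Blicket, so the rule-based labeler leaves them undetermined.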
Fig. 2. An illustration of the proposed neuro-symbolic combination (NS-Opt) for ACRE. The neural frontend is responsible for scene parsing. In particular, we use a Mask R-CNN to detect objects and classify their attributes as well as the Blicket machine’s state. The parsed results are arranged into data matrices and sent to the causal reasoning backend for optimization. A generalized structural equation model (SEM) is learned from the context trials during reasoning and is then used to infer the state of the Blicket machine for each query.
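As a rough illustration of the hand-off between frontend and backend, the sketch below arranges parsed trials into a binary presence matrix and an activation vector. The parser output format, the function names, and the noisy-OR scoring function are all assumptions made for this example; the noisy-OR is a simple stand-in and is not the generalized SEM or the optimization used in NS-Opt.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical parser output: one dict per context trial (not the authors' format).
ParsedTrial = Dict[str, object]


def build_data_matrices(
    trials: Sequence[ParsedTrial],
) -> Tuple[List[str], List[List[int]], List[int]]:
    """Return (object vocabulary, presence matrix X, activation vector y)."""
    vocab = sorted({obj for t in trials for obj in t["objects"]})
    index = {name: j for j, name in enumerate(vocab)}
    X = [[0] * len(vocab) for _ in trials]
    y = []
    for i, t in enumerate(trials):
        for obj in t["objects"]:
            X[i][index[obj]] = 1  # object obj is present in trial i
        y.append(1 if t["machine_on"] else 0)
    return vocab, X, y


def noisy_or(row: Sequence[int], strengths: Sequence[float]) -> float:
    """Stand-in causal model: P(machine on) given per-object causal strengths."""
    p_off = 1.0
    for present, w in zip(row, strengths):
        if present:
            p_off *= 1.0 - w
    return 1.0 - p_off


parsed = [
    {"objects": ["gray_cube"], "machine_on": True},
    {"objects": ["yellow_cylinder"], "machine_on": False},
    {"objects": ["gray_cube", "yellow_cylinder"], "machine_on": True},
]
vocab, X, y = build_data_matrices(parsed)
print(vocab, X, y)
# With hand-set strengths (gray_cube = 1.0, yellow_cylinder = 0.0):
print([noisy_or(row, [1.0, 0.0]) for row in X])
```

Keeping the matrices free of visual attributes reflects the split described in the caption: perception produces symbols, and the reasoning step operates only on which objects were present and whether the machine activated.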


 title={ACRE: Abstract Causal REasoning Beyond Covariation},
 author={Zhang, Chi and Jia, Baoxiong and Edmonds, Mark and Zhu, Song-Chun and Zhu, Yixin},
 booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},