Incidentally, a similar idea is discussed in a post by Jaime Sevilla on the limits of causal discovery: https://towardsdatascience.com/the-limits-of-graphical-causal-discovery-92d92aed54d6
Related to your causality comment above, two days ago I submitted a research proposal on Causal Representation Learning for AI Safety. You may want to see it here: https://www.lesswrong.com/posts/5BkEoJFEqQEWy9GcL/an-open-philanthropy-grant-proposal-causal-representation