More evidence X-risk amplifies action against current AI harms

Link post

The linked article is my research project for the AI Safety Fundamentals course. While I was waiting for the moderators to approve the preprint, Erich Grunewald beat me to writing a post making the same point. Luckily, our arguments largely complement each other, which is nicely symbolic of the whole debate.

In summary:

  • AI ethics is more focused on existing problems, AI safety on those arising in the near future. Since these communities build on different intellectual traditions, they view AI risks through different aesthetics.

    • E.g. AI safety speaks in terms of utility functions, agents and incentives, whereas AI ethics speaks in terms of fairness or accountability.

  • In terms of policy recommendations, these differences don’t seem to matter. Unfortunately, the combination of political attention, tough competition within academia and Big Tech involvement heightens suspicion and incentivizes the creation of intellectual coalitions, which often form around aesthetics as the simplest common denominator.

    • This may lead to the feeling that people focused on different problems are diverting attention from what’s really important.

  • However, this micropolitics masks the fact that, in practice, AI ethics and AI safety are highly complementary and both benefit from the shared spotlight of attention.

  • This was well demonstrated by the EU AI Act, as:

    1. The act demonstrated there is a consensus on the meta-principles that should guide AI policy.

      1. Particularly the principle that AI developers need to provide sufficient evidence that they are taking reasonable measures to ensure that their technology is beneficial.

      2. Therefore, there doesn’t need to be an agreement regarding p(doom). If it’s obvious that an advanced AI is extremely unlikely to cause a catastrophe, it should be easy to demonstrate. In such a case, the policy rightfully wouldn’t slow AI development down. However, if an AI has destructive capabilities and there aren’t arguments or experiments that can demonstrate safety, it’s good if regulation poses an obstacle to its development.

    2. The act demonstrated the two perspectives offer different reasons for a plethora of important policies.

      1. Both safety and fairness require corporate governance that ensures transparent, controllable algorithms and practical accountability of digital firms.

      2. Both perspectives highlight the harms caused by the social power of social media, addictions, manipulation, surveillance, social credit systems or autonomous weapons.

    3. The act demonstrated that the increased attention around AI risks gives weight to the voices from both sides.

      1. Studies suggest that, in science communication, mentioning scientific uncertainty (stemming from known gaps in knowledge) and technical uncertainty (inherent to statistical models) has neutral or positive effects on trust in the source. However, consensus uncertainty, which stems from differences in opinion, has clearly negative effects on trust in both sides of a debate, particularly if the debate is heated. Importantly, whether uncertainty is seen as stemming from a gap in knowledge or from disagreement among scientists is a question of framing. Therefore, friendlier relationships alone seem beneficial for promoting policies in the intersection, as not knowing whom to trust fosters inaction.

      2. In practice, we might have witnessed this when a group of MEPs wrote their own version of the FLI open letter, which possibly sped up the EU AI Act, even though the group’s members expressed skepticism towards the FLI letter’s x-risk concerns. This may have been made possible by the FLI letter’s openness towards both AI ethics and AI safety-based concerns.

  • I ran a tiny survey (n=82) to explore how attitudes towards AI safety and AI ethics interact. I found that concern about AI safety and concern about AI bias correlated positively (r = .28). More importantly, when respondents were first asked to think about AI safety, their concern regarding AI bias was not lower on any of the measured dimensions (perceived significance, support for policy, support for research funding), and vice versa. The interaction effect was significant in only one direction: people who were first asked about AI safety reported higher support for policy targeting AI bias (β = .23).

  • More evidence, based on web interest and the themes of other policy & funding initiatives, can be found in Erich’s post.
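
As a rough sanity check on the survey correlation reported above (r = .28), the sketch below computes the corresponding t statistic and an approximate 95% confidence interval using the Fisher z-transformation. It assumes the correlation was computed on all 82 respondents and that the usual assumptions behind a Pearson correlation hold; neither is stated in this summary, and the function and variable names are purely illustrative.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def fisher_confidence_interval(r, n, z_crit=1.96):
    """Approximate confidence interval for r via the Fisher z-transformation.

    z_crit=1.96 corresponds to a two-sided 95% interval.
    """
    z = math.atanh(r)              # transform r to the z scale
    se = 1 / math.sqrt(n - 3)      # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


R_REPORTED = 0.28  # correlation reported in the summary
N_ASSUMED = 82     # assumption: all respondents entered the correlation

t_stat = correlation_t_statistic(R_REPORTED, N_ASSUMED)
ci_low, ci_high = fisher_confidence_interval(R_REPORTED, N_ASSUMED)

print(f"t({N_ASSUMED - 2}) = {t_stat:.2f}")
print(f"approx. 95% CI for r: [{ci_low:.2f}, {ci_high:.2f}]")
```

Under these assumptions the interval excludes zero but is wide, which fits the description of the survey as tiny: the data point to a positive association without pinning down its size.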