MichaelDickens answers Community Polls on Alignment Controversies

MichaelDickens 16 Jun 2026 22:06 UTC
5 points
1 ∶ 0

Robust alignment requires alignment-relevant intervention during pretraining

I’d say this is the wrong question. Like, I do not expect that any current alignment approach is going to work. If we do ever figure out what works, it will not look like “pretraining” or “post-training”; it will be something completely different.

Although I guess you could call that “pretraining”?
- Jasmine Brazilek 16 Jun 2026 22:48 UTC
  1 point
  0 ∶ 0
  Thanks, Michael. We avoided mentioning post-training so that “new paradigm needed” would also count on the “disagree” side of the spectrum. In other words, “disagree” on this question would mean either “post-training is sufficient” or “new paradigms are needed/sufficient”.