I’m totally on board with the constraint that the future be good, that it be broadly appealing rather than just good-according-to-our-esoteric-morality.
What I’m worried about is that this contest will end up being a tool for self-deception (like Czynski said) because the goodness of the future correlates with important variables we can’t really influence, like takeoff speeds and difficulty of the alignment problem and probability of warning shots and many other things. So in order to describe a good future, people will fiddle with the knobs of those important variables so that they are on their conducive-to-good settings rather than their most probable settings.
Thus the distribution of stories we get at the end will be unrealistic, not just in that it’ll be more optimistic/good than is likely (that part is fine, that’s by design) but in a variety of other ways as well, ways that we can’t change. So then insofar as we use these scenarios as targets to aim towards, we will fail because of the underlying unchangeable variables that have been set to their optimistic settings in these scenarios.
Analogy: Let’s say it’s January 2020 and we are trying to prepare the world for COVID. We ask people to write a bunch of optimistic stories in which very few people die and very little economic disruption happens. What’ll people write? Stories in which COVID isn’t that infectious, in which it isn’t that deadly, in which masks work better than they do in reality, in which vaccines work better, in which compliance with lockdowns is higher… In a thousand little ways these stories will be unrealistically optimistic, and in hundreds of those ways, they’ll be unrealistic in ways we can’t change. So the policies we’d make on the basis of these stories would be bad policies. (For example, we might institute lockdowns that cause lots of economic and psychological damage without saving many lives at all, because the lockdowns were harsh enough to be disruptive but not harsh enough to get the virus under control; we didn’t think they needed to be any harsher because of the aforementioned optimistic settings of the variables.)
My overall advice would be: Explain this problem to the contestants. Make it clear that their goal is to depict a realistic future, and then depict a series of actions that steer that realistic future into a good, broadly appealing state. It’s cheating if you get to the good state by wishful thinking, i.e. by setting a bunch of variables like takeoff speeds etc. to “easy mode.” It’s not cheating if the actions you propose are very difficult to pull off, because the point of the project is to give us something to aim for and we can still productively aim for it even if it’s difficult.
Returning to this thread to note that I eventually did enter the contest, and was selected as a finalist! I tried to describe a world where improved governance / decisionmaking technology puts humanity in a much better position to wisely and capably manage the safe development of aligned AI.
https://worldbuild.ai/W-0000000088/
The biggest sense in which I’m “playing on easy mode” is that my story makes it sound like the adoption of prediction markets and other new institutions was effortless and inevitable, whereas in the real world I think improved governance is achievable but a bit of a longshot; if it does happen, it will be because a lot of people worked really hard on it. But that effort and drive is the very thing I’m hoping to help inspire/motivate with my story, which I feel somewhat mitigates the sin of unrealism.
Overall, I am actually surprised at how dystopian and pessimistic many of the stories are. (Unfortunately they are mostly not pessimistic about alignment; rather there are just a lot of doomer vibes about megacorps and climate crisis.) So I don’t think people went overboard in the direction of telling unrealistic tales about longshot utopias, except to the extent that many contestants don’t even realize that alignment is a scary and difficult challenge, so the stories are in that sense overly optimistic by default.
Totally agree here that what’s interesting is the ways in which things turn out well due to agency rather than luck. Of course if things turn out well, it’s likely to be in part due to luck — but as you say that’s less useful to focus on. We’ll think about whether it’s worth tweaking the rules a bit to emphasize this.
Thanks! I think explaining the problem to the contestants might go a long way. You could also just announce that realism (about unchangeable background variables, not about actions taken) is an important part of the judging criteria, and that submissions will be graded harshly if they seem to be “playing on easy mode.” EDIT: Much more important than informing the contestants though is informing the people who are trying to learn from this experiment. If you are (for example) going to be inspired by some of these visions and work to achieve them in the real world… you’d better make sure the vision wasn’t playing on easy mode!
I think, though, that the purpose of this exercise is generally understood to be more about characterizing a utopia than about trying to explain how to solve alignment in a world where a singularity is in the cards.