Interesting. That’s a risk when pushing for greater coordination (as you said). If you keep the ability to coordinate the same and build better tools for collective decision-making, would that backfire in such a way?
I imagine collaborative tools would have to make values legible to some extent if they are to be used to analyze anything not-value-neutral. That may push toward legible values, so more like utilitarianism and less like virtue ethics or the mixed bag of moral intuitions that we usually have? But that’s perhaps a separate effect.
But I’m also very interested in improving coordination, so this risk is good to bear in mind.
I think that when you say “better tools for collective decision-making”, I’m thinking of capabilities (predictive accuracy, coordination ability, precommitment mechanisms), whereas you seem to be thinking of tools that generate progress toward better values. I’d be interested in seeing some examples of the latter which are not also examples of the former.
No, I think that unfortunately, the tools I envision are pretty value neutral. I’m thinking of Squiggle, of Ozzie’s ideas for improving prediction markets, and of such things as using better metrics – e.g., SWB instead of QALYs, or expected value of the future instead of probability of extinction.
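To make that concrete, here is roughly the kind of back-of-the-envelope estimate such tools are meant to support – a minimal sketch in plain Python rather than Squiggle itself, with every quantity and number invented for illustration:

```python
# Illustrative only: a toy Monte-Carlo version of the kind of estimate Squiggle
# is built for, written in plain Python. All quantities and numbers are invented.
import math
import random

def lognormal_from_90ci(low, high, n=100_000):
    """Draw samples whose 5th/95th percentiles roughly match (low, high), both > 0."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * 1.645)
    return [random.lognormvariate(mu, sigma) for _ in range(n)]

people_reached = lognormal_from_90ci(1_000, 50_000)     # hypothetical
swb_gain_per_person = lognormal_from_90ci(0.01, 0.5)    # hypothetical, in SWB points

total_gain = sorted(p * g for p, g in zip(people_reached, swb_gain_per_person))
n = len(total_gain)
print("mean:", sum(total_gain) / n)
print("90% interval:", total_gain[int(0.05 * n)], "to", total_gain[int(0.95 * n)])
```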
Hmm, in my case: yes, noish, no. I think I’m really only thinking of making the decisions better, so more predictive accuracy, better Brier scores, or something like that.
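For reference, the Brier score is just the mean squared error between forecast probabilities and what actually happened – a minimal sketch with made-up numbers:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.
    Lower is better; always forecasting 50% scores 0.25."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Made-up example: three forecasts against what actually happened.
print(brier_score([0.9, 0.2, 0.6], [1, 0, 0]))  # ≈ 0.137
```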
In the end I’m of course highly agnostic about how this will be achieved. So this only reflects how I envision this project might turn out to be characterized. Ozzie wants to work that out in more detail, so I’ll leave that to him. :-)
Coordination ability, especially, may turn out to be affected. More de facto than de jure, I imagine: when people wanted to collaborate on open-source software, their goal was (presumably) to create better software faster, not to improve humanity’s coordination ability. But to do that, they developed version control systems and bug-tracking systems, so in the end they did improve coordination ability. So improving coordination ability is a likely externality of this sort of project, you could say.
For precommitment mechanisms, I can’t think of a way this might be affected either on purpose or accidentally.
Maybe it’ll be helpful to collect a lot of attributes like these and discuss whether we think we’ll need to directly, intentionally affect them, or whether we think we might accidentally affect them, or whether we don’t think they’ll be affected at all. I could easily be overlooking many ways in which they are interconnected.
Interesting; this is potentially a problem.
Indeed. :-/ Or would you disagree with my impression that, for example, Squiggle or work on prediction markets is value neutral?
Mmmh, I sort of want to answer that, say, FTX isn’t value neutral given that the founders are EAs and presumably want to donate their funds to effective causes? Or, like, OpenAI clearly isn’t value neutral given that they’re vetting which applications can use GPT-3. And it might be difficult to come up with an “OpenAI, but evil/value-neutral” organization.
Whereas, say, the CIA’s simple sabotage manual or Machiavelli’s The Prince clearly are value neutral.
The key difference seems to be that in the one case the tools are embedded in an altruistic context, and in the other they aren’t. So, for example, collective decision-making tools could perhaps be created in such a way that they start out embedded in the EA community and remain embedded there (though that reduces the scale).
That’s pretty abstract, so some concrete examples might be:
Using lots of jargon
Using lots of references to previous EA materials
Establishing a lineage, so that powerful tools are only transmitted from master to student
Developing tools that require lots of implicit know-how rather than explicit steps that can be written down
Maintaining strong norms of privacy
Okay, I think I understand what you mean. What I meant by “X is value neutral” is something like “The platform FTX is value neutral even if the company FTX is not.” That’s probably not 100% true, but it’s a pretty good example, especially since I’m quite enamoured of FTX at the moment. OpenAI is all murky and fuzzy and opaque to me, so I don’t know what to think about that.
I think your suggestions go in similar directions as some of mine in various answers, e.g., marketing the product mostly to altruistic actors.
Intentional use of jargon is also something I’ve considered, but it comes at heavy costs, so it’s not my first choice.
References to previous EA materials can work, but I find it hard to think of ways to apply that to Squiggle. But certainly some demo models can be EA-related to make it differentially easier and more exciting for EA-like people to learn how to use it.
Lineage, implicit knowledge, and privacy: high costs again. Making a collaborative system secret would cause it to miss out on many of the benefits. And enforced openness may also help guard against bad uses. But the lineage one is a fun idea I hadn’t thought of! :-D
My conclusion mostly hinges on whether runaway growth is unlikely or extremely unlikely. I’m assuming that it is extremely unlikely, so that we’ll always have time to react when things happen that we don’t want.
So the first thing I’m thinking about now is how to notice when things happen that we don’t want – say, through monitoring the referrers of website views, Google alerts, bounties, or somehow creating value in the form of a community so that everyone who uses the software has a strong incentive to engage with that community.
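As a rough sketch of the referrer-monitoring part – assuming an ordinary combined-format web server access log; the log path and the allowlist below are placeholders:

```python
# Rough sketch of the referrer-monitoring idea. Assumes a combined-format
# access log at LOG_PATH; the path and the allowlist are placeholders.
import re
from collections import Counter
from urllib.parse import urlparse

LOG_PATH = "access.log"  # hypothetical path
EXPECTED = {"forum.effectivealtruism.org", "lesswrong.com", "github.com"}  # placeholder allowlist

referrer_counts = Counter()
with open(LOG_PATH) as f:
    for line in f:
        # In the combined log format the referrer is the second-to-last quoted field.
        quoted = re.findall(r'"([^"]*)"', line)
        if len(quoted) >= 2 and quoted[-2] not in ("-", ""):
            referrer_counts[urlparse(quoted[-2]).netloc] += 1

for domain, count in referrer_counts.most_common():
    flag = "" if domain in EXPECTED else "  <-- unexpected, worth a look"
    print(f"{count:6d}  {domain}{flag}")
```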
All in all, the measures I can think of are weak, but if the threat is also fairly unlikely, maybe those weak measures are proportional.
Quickly chiming in:
I’d agree that this work is relatively value neutral, except for two main points:
1) It seems like those with good values are often more inclined to use better tools, and we could push things more into the hands of good actors than bad ones. Effective Altruists have been quick to adopt many of the best practices (Bayesian reasoning, Superforecasting, probabilistic estimation; see the toy example after this message), but most other groups haven’t.
2) A lot of “values” seem instrumental to me. I think this kind of work could help change the instrumental values of many actors, if it were influential. My current impression is that there would be some level of value convergence that would come with intelligence, though it’s not clear how much of this would happen.
That said, it’s of course possible that better decision-making could be used for bad purposes. Hopefully our improved decision-making abilities, as we go along this trajectory, can help inform us as to how best to proceed :)
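To make the “Bayesian reasoning” point from 1) concrete, here is a toy update with invented numbers:

```python
# Toy Bayesian update, just to make "Bayesian reasoning" concrete.
# Numbers are invented: prior 10% that a project succeeds, and a pilot result
# that is 3x as likely under success as under failure.
prior = 0.10
likelihood_ratio = 3.0  # P(evidence | success) / P(evidence | failure)

prior_odds = prior / (1 - prior)
posterior_odds = prior_odds * likelihood_ratio
posterior = posterior_odds / (1 + posterior_odds)
print(f"posterior P(success) ≈ {posterior:.2f}")  # ≈ 0.25
```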
Huh, yeah. I wonder whether this isn’t more of an “inadequate equilibria” type of thing where we use all the right tools that our goals incentivize us to use – and so do all the other groups, except their incentives are weird and different. Then there could easily be groups with uncooperative values but incentives that lead them to use the same tools.
A counterargument could be that a lot of these tools require some expertise, and people who have that expertise are probably not usually desperate enough to have to take some evil job, so most of these people will choose a good/neutral job over an evil job even if the salary is a bit lower.
But I suppose some socially skilled narcissist can just exploit any random modern surrogate religion to recruit good people for something evil by appealing to their morality in twisted ways. So I think it’s a pretty neat mechanism but also one that fails frequently.
Yeah, one of many, many benefits! :-) I don’t think the effect is going to be huge (so that we could rely on it) or tiny. But I’m also hoping that someone will use my system to help me clarify my values. ^^
Deferring to future versions of us: Yep!