I’m replying quickly to this as my questions closely align with the above to save the authors two responses; but admittedly I haven’t read this in full yet.
Next, we conducted research and developed 3-5-page profiles on 41 institutions. Each profile covered the institution’s organizational structure, expected or hypothetical impact on people’s lives in both typical and extreme scenarios, future trajectory, and capacity for change.
Can you explain more about ‘capacity for change’ and what exactly that entailed in the write-ups? I ask because looking at the final top institutions and reading their descriptions, it feels like the main leverage is driven by ‘expected or hypothetical impact on people’s lives in both typical and extreme scenarios’, and less by ‘capacity for change’.
It seems to be taken as a given that EAs working in one of these institutions (e.g. DeepMind), or concrete channels of influence (e.g. think tanks advising the CCP Politburo), constitute ‘capacity for change’ within the organisation. But I would argue that capacity for change is in fact driven by a plethora of factors internal and external to the organisation: external forces might include market dynamics driving an organisation’s dominance or threatening its decline (e.g. Amazon), and internal forces include culture and systems (e.g. Facebook / Meta’s resistance to employee action). In fact, the latter example really challenges why that organisation would be among the top institutions if ‘capacity for change’ had been well accounted for.
For such a powerful institution, the Executive Office of the President is capable of shifting both its structure and priorities with unusual ease. Every US President comes into the office with wide discretion over how to set their agenda and select their closest advisors. Since these appointments are typically network-driven, positioning oneself to be either selected as or asked to recommend a senior advisor in the administration can be a very high-impact career track.
Equally, when it comes to capacity for change this is both a point in favour and against, as such structure and priorities are by definition not robust / easily changed by the next administration.
Basically, it’s really hard to get a sense from the write-up above of whether the analysis captured these broader concerns. If it didn’t, I would hope this would be a next step in the analysis, as it would be hugely useful and would also add a great deal more novel insight, both from a research perspective and in terms of taking action.
Also curious about how heavily this is weighted towards AI institutions—and I work in the field of AI governance so I’m not a sceptic. Does this potentially tell us anything about the methodology chosen, or the experts enlisted?
EDIT: additional point around Executive Office of the President of US
Yep, I agree, this is complex stuff. The specifics on what interventions might be most promising to support were largely out of scope for this preliminary analysis—this is all part of a larger arc of work and those elements will come later this year (see diagram below, with apologies for the janky graphics).
I would offer a few points that I think are worth keeping in mind when considering questions like these:
Institutional improvement opportunities are highly contingent. As you pointed out, lots of internal and external factors drive an institution’s capacity for change. Often it’s easiest to drive change when lots of other changes need to happen anyway; I see Jan Kulveit’s sequence on practical longtermism during the COVID pandemic as an illustration of this. One implication is that, absent a very high degree of clarity about what’s going on with an organization (see point #2 below), it probably makes more sense for people interested in this space to prioritize developing general capacities that can be useful in a number of different situations rather than jump to bets on very specific pathways to impact, as the latter can easily be upset by a changing landscape.
Judging institutional tractability usually requires insider knowledge: One of the general capacities I think is particularly valuable in these situations is developing a detailed understanding of what’s happening inside an institution. To take the example of the employee activism at Meta, you’re right that it’s had little visible impact so far, but it’s hard to know from the outside if that means that path is just hopeless and we should try something else, or if alternatively it’s changed the circumstances surrounding the organization quite a bit and greatly increased the success probability of future interventions. I think the only way you would be able to judge this with precision is to get to know an organization like that really, really well, and that takes a lot more time on a per-organization basis than we’ve invested to date. One of the main purposes of this article was to help us decide which organizations are worth that investment. For this same reason, to echo Nathan’s point below, the estimates of tractability and neglectedness for most institutions on our list are pretty fuzzy because we don’t have that level of inside knowledge for most of them. But the nature of estimating things quantitatively is that you do the best you can with the information you do have and communicate your remaining uncertainties honestly, and that’s what we tried to do here.
Institutions are multifaceted: Most of the top institutions on our list have many different avenues for impacting people’s lives. The US presidency will indeed be among the most important players in the AI governance conversation, but it’s among the central actors on a host of other policy issues and cause areas as well. So I wouldn’t over-update on the intersection between AI heavyweights and this list; with the exception of OpenAI and DeepMind, my personal takeaway is more in the direction of “institutions that are powerful anyway will also be really important to AI governance” more than “the only institutional improvement conversations worth having are about AI governance.”
I will be completely honest and share that I downvoted this response as I personally felt it was more defensive than engaging with the critiques, and didn’t engage with specific points that were asked—for example, capacity for change. That said, I recognise I’m potentially coming late to the party in sharing my critiques of the approach / method, and in that sense I feel bad about sharing them now. But usually authors are ultimately open to this input, and I suspect this group is no different :)
A few further points:
I understand the premise of “our unit of analysis was the institutions themselves, so we could focus in on the most likely to be ‘high leverage’ to then gain the contextual understanding required to make a difference”. I would not be surprised if the next step proves less fruitful than expected for a number of reasons, such as:
difficult to gain access to the ‘inner rings’ to ascertain this knowledge on how to make impact
the ‘capacity for change’ / ‘neglectedness, tractability’ turns out to be a significantly lower leverage point within those institutions, which potentially reinforces the point we might have made a reasonable guess at: that impact / scale can be inversely correlated with flexibility / capacity for change / tractability / etc
I get a sense from having had a brief look at the methodology that insider knowledge of making change in these organisations could have been woven in earlier; either by talking to EAs / EA-aligned types working within government or big tech companies, or whatever else. This would have been useful for deciding what the unit of analysis should be, or just sense-checking ‘will what we produce be useful?’
If this was part of the methodology, my apologies: it’s on me for skim-reading.
I’m a bit concerned by choosing to build a model for this, given as you say this work is highly contextual and we don’t have most of this context. My main concerns are something like:
quant models are useful where there are known and quantifiable distinguishers between different entities, and where you have good reason to think you can:
weight the importance of those distinguishers accordingly
change the weights of those distinguishers as new information comes in
but as Ian says, ‘capacity for change’ is highly contextual, and a critical factor in deciding which organisations should be prioritised
however, the piece above reads like ‘capacity for change’ was factored into the model. If so, how? And why now, when there’s so little information on it?
just from a time and resource perspective, models cost a lot, and are sometimes significantly less efficient than a qualitative estimate, especially where things are highly contextual; so I’m keen to learn more about what drove this
This is all intended to be constructive even if challenging. I work in these kinds of contexts, so this work going well is meaningful to me, and I want to see the results as close to ground truth and actionable as possible. Admittedly, I don’t consider the list of top institutions necessarily actionable as things stand or that they provide particularly new information, so I think the next step could add a lot of value.
Constructive critique is always welcome! I’m sorry the previous response didn’t sufficiently engage with your points. I guess the main thing I didn’t address directly was your question of “were concerns like these taken into account,” and the answer is “basically yes,” although not to the level of detail or precision that would be ideal going forward. Some of the prompts we asked our volunteer researchers to consider included:
How are decisions made in this institution? Who has ultimate authority? Who has practical authority?
How and under what circumstances does this institution make changes to a) its overall priorities and b) the way that it operates? Is it possible to influence relevant subdivisions of this institution even if its overall leadership or culture is resistant to change?
Is this institution likely to become more or less important on the world stage in the next 10 years? What about the next 100? Please note any relevant contingencies, e.g. potential mergers or splits.
FYI, the full model is now posted in my response above to MathiasKB; it sounds like it might be helpful for you to take a look if you have further questions.
Continuing on:
the ‘capacity for change’ / ‘neglectedness, tractability’ turns out to be a significantly lower leverage point within those institutions, which potentially reinforces the point we might have made a reasonable guess at: that impact / scale can be inversely correlated with flexibility / capacity for change / tractability / etc
As I mentioned in my response to weeatquince, this inverse relationship is already baked into the analysis in the sense that absent institution-specific evidence to the contrary, we assumed that a larger and more bureaucratic or socially-distant-from-EA organization would be harder to influence. I really want to emphasize that the list in the article is not just a list of the most important institutions, it is the list we came up with after we took considerations about tractability into account. Now, it is entirely possible that we underrated those concerns overall. Still, I suspect you may be overrating them—for example, just checking my LinkedIn I find that I have seven connections in common with the current Chief of Staff to the President of the United States...and not because I have been consciously cultivating those connections, but simply because our social and professional circles are not that distant from each other. And I’m just one person: when you combine all of the networks of everyone involved in EA and everyone connected to EIP and the improving institutional decision-making community more broadly, that’s a lot of potential network reach.
I get a sense from having had a brief look at the methodology that insider knowledge from making change in these organisations could have been woven in earlier; either by talking to EAs / EA aligned types working within government or big tech companies or whatever else. This would have been useful for deciding what unit of analysis should be, or just sense-checking ‘will what we produce be useful?’
I’m not sure what you mean by “unit of analysis” in this context, could you give an example? In an ideal world, I think you’re right that the project would have benefited from more engagement with the types of folks you’re talking about. However, members of the project team did include a person working at one of the big tech companies on our list, another person working at a top consulting firm, another person who is at the World Bank, a couple of people who work for the UK government, etc. And one of our advisors chairs an external working group at the WHO. So we did have some of the kinds of access you’re talking about, which I think is a decent start given that we were putting all of this together essentially on a $25k grant.
I’m a bit concerned by choosing to build a model for this, given as you say this work is highly contextual and we don’t have most of this context.
Yeah, this is a totally fair observation, and I’ll confess that at one point I considered ditching the model entirely and just publishing the survey results. In the end, however, I think it proved really useful to us. I’m a big believer in the principle that high uncertainty need not preclude a quantitative approach (Doug Hubbard’s book How to Measure Anything is a really useful resource on this topic—see Luke Muehlhauser’s summary/review here). I personally got a lot out of fiddling with the numbers and learning how robust they were to different assumptions, and that helped give me the confidence to include it in our analysis. That’s not to say that I don’t expect changes to the topline takeaways once we get better information—I do expect them—but I’d be moderately surprised if they were so drastic that they make this earlier analysis look completely silly in retrospect.
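To make concrete what that kind of “fiddling with the numbers” can look like, here is a minimal sketch of a weighted-criteria ranking with a crude robustness check. Everything here—the institutions, criteria, scores, and weights—is made up purely for illustration and is not the actual EIP model:

```python
import random

# Hypothetical criteria weights and scores -- illustrative only,
# not the real model's numbers.
weights = {"impact": 0.5, "trajectory": 0.2, "tractability": 0.3}
scores = {
    "Institution A": {"impact": 9, "trajectory": 7, "tractability": 3},
    "Institution B": {"impact": 6, "trajectory": 6, "tractability": 7},
    "Institution C": {"impact": 4, "trajectory": 8, "tractability": 8},
}

def rank(w):
    """Rank institutions by weighted sum of criterion scores."""
    total = lambda s: sum(w[c] * s[c] for c in w)
    return sorted(scores, key=lambda name: total(scores[name]), reverse=True)

baseline = rank(weights)

# Perturb each weight by up to +/-30% and count how often the
# top-ranked institution changes -- one simple robustness check.
random.seed(0)
trials = 1000
flips = 0
for _ in range(trials):
    jittered = {c: w * random.uniform(0.7, 1.3) for c, w in weights.items()}
    if rank(jittered)[0] != baseline[0]:
        flips += 1

print("Baseline ranking:", baseline)
print(f"Top pick changed in {flips}/{trials} perturbed runs")
```

If the top of the list survives most perturbations, that’s some evidence the ranking isn’t an artifact of one particular choice of weights; if it flips constantly, the qualitative caveats matter a lot more.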
@IanDavidMoss, thanks for the reply. I would love it if you could go a little deeper into what an institution is to you. How do you characterize it, and why is this nomenclature important? I just would like to go back to my apples-to-apples comparison question. My first instinct is that comparing Meta to BlackRock to the Bill and Melinda Gates Foundation to the Office of the President of the USA to the CCP Central Committee is going to create some false parallels and misunderstandings of degree of importance or possibility for change (I will just call this ‘power’).
I would suspect that the power of the President of the USA is orders of magnitude greater than that of the Bill and Melinda Gates Foundation. So while they might be on a long list together, they are a bit like comparing our moon and the Sun. So we would have a magnitude issue.
In addition, we would have a capabilities issue. The Office of the President is much more powerful than Mark Zuckerberg, I would argue, but Meta can also do things that the President could only dream of. Facebook has been an incredible tool for spreading information, both for good and nefarious purposes. The US government could only wish for that ability to reach people’s brains.
These thoughts lead me to imagine what your final recommendations will look like, and I am not sure. I suspect you will discover that you end up making very specific suggestions for different institutions. Other than a standard 80k “be flexible and build up your career capital” suggestion, I think it might be difficult to give thematic recommendations that are equally useful in all of the types of organizations you tackle here.
Hi Charlie, if you haven’t already read the post we wrote last year introducing the prioritization framework used in this article, I recommend you do so as it goes over many of these theoretical questions in depth. In that post we offered the following characterization of an institution:
The definition of an institution is not completely standardized across disciplines or academic fields, but in this context we mean a formally organized group of people, usually delineated by some sort of legal entity (or an interconnected set of such entities) in the jurisdiction(s) in which that group operates. In tying our definition to explicit organizations, we are not including more nebulous concepts like “the media” or “the public,” even as we recognize how broader societal contexts (such as behavior norms, history, etc.) help define and constrain the choices available to institutions and the people working within them. By “key” or “powerful” institutions, we are referring in particular to institutions whose actions can significantly influence the circumstances, attitudes, beliefs, and/or behaviors of a large number of morally relevant beings.
As you can see, under this definition there is nothing particularly weird about grouping government and non-government organizations together; they are both formally organized groups of people delineated by legal entities. And even if you were to limit your analysis to, say, tech companies, you would still face the same issue of vastly divergent magnitudes and capabilities within that set of organizations and have to figure out a way to derive meaning from that. Basically, I don’t disagree at all with the observations you’re making, but my takeaway is “yes, and this is all the more reason why a holistic and cross-sector analysis is relevant and valuable,” not “well, I guess we shouldn’t bother because this is all too hard.”
I suspect you will discover that you end up making very specific suggestions for different institutions.
This has always been the plan. I’ve believed and argued for a long time that while institutions have some common features and problems, identifying the most actionable and promising levers for change requires a highly tailored approach. And this type of work seems to me more neglected within EA than more general, intervention-level analysis (e.g., here and here). So I think we are on the same page.