The short version of the argument is that excessive praise for ‘direct work’ has caused a lot of people who fail to secure direct work to feel un-valued and bounce off EA.
Interesting! Is there any data that supports this?
Re 2: I agree that this is a lot of work, but it's little compared to how much money goes into grants. Some of the predictions are also quite straightforward to resolve.
Well, glad to hear that they are using it.
I believe that an alternative could be funding a general direction, e.g., funding everything in AIS, but I don’t think that these approaches are exclusive.
Meta: I’m requesting feedback and gauging interest. I’m not a grantmaker.
You can use prediction markets to improve grantmaking. The assumption is that having accurate predictions about project outcomes benefits the grantmaking process.
Here’s how I imagine the protocol could work:
Someone proposes an idea for a project.
They apply for a grant and make specific, measurable predictions about the outcomes they aim to achieve.
Examples of grant proposals and predictions (taken from here):
Project: Funding a well-executed podcast featuring innovative thinking from a range of cause areas in effective altruism.
Prediction: The podcast will reach 10,000 unique listeners in its first 12 months and score an average rating of 4.5/5 across major platforms.
Project: Funding a very promising biology PhD student to attend a one-month program run by a prestigious US think tank.
Prediction: The student will publish two policy-relevant research briefs within 12 months of attending the program.
Project: A 12-month stipend and budget for an EA to develop programs increasing the positive impact of biomedical engineers and scientists.
Prediction: Three biomedical researchers involved in the program will identify or implement career changes aimed at improving global health outcomes.
Project: Stipends for 4 full-time-equivalent (FTE) employees and operational expenses for an independent research organization conducting EA cause prioritization research.
Prediction: Two new donors with a combined giving potential of $5M+ will use this organization’s recommendations to allocate funds.
A prediction market is created based on these proposed outcomes, conditional on the project receiving funding. Some of the potential grant money is staked to make people trade.
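The protocol above could be sketched roughly as follows. This is a toy model under simplifying assumptions (pro-rata parimutuel payouts, no fees, no market maker); real platforms implement far richer mechanics, and the class and method names here are hypothetical.

```python
from collections import defaultdict

class ConditionalMarket:
    """Toy market on a grant prediction, conditional on the grant being funded."""

    def __init__(self, question):
        self.question = question
        self.stakes = defaultdict(float)  # (trader, side) -> total amount staked

    def bet(self, trader, side, amount):
        assert side in ("YES", "NO") and amount > 0
        self.stakes[(trader, side)] += amount

    def resolve(self, funded, outcome=None):
        """Return {trader: payout}. If the grant was not funded, refund all stakes."""
        payouts = defaultdict(float)
        if not funded:  # condition failed: the market is void, everyone is refunded
            for (trader, _), amount in self.stakes.items():
                payouts[trader] += amount
            return dict(payouts)
        pool = sum(self.stakes.values())  # winners split the whole pool pro-rata
        winning = "YES" if outcome else "NO"
        won = sum(a for (_, s), a in self.stakes.items() if s == winning)
        for (trader, side), amount in self.stakes.items():
            if side == winning:
                payouts[trader] += amount / won * pool
        return dict(payouts)

m = ConditionalMarket("Podcast reaches 10,000 unique listeners in 12 months?")
m.bet("alice", "YES", 60)
m.bet("bob", "NO", 40)
print(m.resolve(funded=True, outcome=True))  # {'alice': 100.0}
```

The key design point is the conditionality: if the grant is rejected, the market never resolves and stakes are returned, so prices reflect beliefs about outcomes *given funding*.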
Some obvious criticisms:
Markets can be gamed, so the potential grantee shouldn’t be allowed to bet.
Exploratory projects and research can’t make predictions like this.
A lot of people need to participate in the market.
To learn more and apply, visit our website.
This link is broken
Thanks! I saw that post. It’s an excellent approach. I’m planning to do something similar, but less time-consuming and more limited in scope. The range of theories of change pursued in AIS is limited and can be broken down into:
Evals
Field-building
Governance
Research
Evals can be measured by the number and quality of evals and their relevance to existential risks. It seems pretty straightforward to differentiate a bad eval org from a good one: engaging with major labs, producing many evals, and having a clear connection to existential risk.
Field-building: having a lot of participants who go on to do awesome things after the project.
Research: I argue that citation count is a good proxy for a paper's impact. It's easy to measure and tracks how much engagement a paper received; absent deliberate work to bring the paper to the attention of key decision-makers, engagement is mostly what citations capture.
I’m not sure how to think about governance.
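As a toy illustration of the citation proxy for research: raw citation counts favor older papers, so one common adjustment is citations per year since publication. The numbers below are purely illustrative.

```python
def citations_per_year(citations, pub_year, current_year=2024):
    """Age-adjusted citation count: raw citations divided by paper age in years."""
    years = max(current_year - pub_year, 1)  # avoid dividing by zero for new papers
    return citations / years

# Illustrative comparison: the older paper has more raw citations,
# but the newer paper has a higher citation rate.
papers = [
    ("Paper A", 300, 2017),
    ("Paper B", 120, 2022),
]
for title, cites, year in papers:
    print(title, round(citations_per_year(cites, year), 1))
```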
Take this with a grain of salt.
EDIT: I also think that engaging the broader ML community with AI safety is extremely valuable, and citations tell us whether an organization is good at that. Another thing worth reviewing is organizational transparency: how organizations estimate their own impact, and so on. This space is really unexplored, which seems crazy to me. The amount of money that goes into AI safety is gigantic, and it would be worth exploring what happens with it.
I’m working on a project to estimate the cost-effectiveness of AIS orgs, something like Animal Charity Evaluators does. This involves gathering data on metrics such as:
People impacted (e.g., scholars trained).
Research output (papers, citations).
Funding received and allocated.
Some organizations (e.g., MATS, AISC) share impact analyses, but there’s no broad comparison. AI safety orgs operate on diverse theories of change, making standardized evaluation tricky, but I think rough estimates could help with prioritization.
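The metrics listed above could be collected into a simple record per organization. This is a hypothetical schema with made-up numbers, just to show the shape of the comparison; the field names are my own, not anything the orgs publish.

```python
from dataclasses import dataclass

@dataclass
class OrgRecord:
    """Hypothetical per-org record for rough cost-effectiveness comparison."""
    name: str
    scholars_trained: int   # people impacted, e.g. program graduates
    papers: int             # research output
    citations: int
    funding_usd: float      # funding received over the same period

    def cost_per_scholar(self):
        if self.scholars_trained == 0:
            return float("inf")  # org with no trainees: not comparable on this metric
        return self.funding_usd / self.scholars_trained

# Illustrative, fabricated example org:
example = OrgRecord("ExampleOrg", scholars_trained=80, papers=25,
                    citations=400, funding_usd=2_000_000)
print(example.cost_per_scholar())  # 25000.0
```

A ratio like cost-per-scholar only makes sense within one theory of change (field-building here); research orgs would need a different denominator, which is exactly the standardization difficulty noted above.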
I’m looking for:
Previous work
Collaborators
Feedback on the idea
If you have ideas for useful metrics or feedback on the approach, let me know!
I’ve always been impressed with Rethink Priorities’ work, but this post is underwhelming.
As I understand it, the post argues that we can’t treat LLMs as coherent persons. The author seems to think this idea is vaguely connected to the claim that LLMs are not experiencing pain when they say they do. I guess the reasoning goes something like this: If LLMs are not coherent personas, then we shouldn’t interpret statements like “I feel pain” as genuine indicators that they actually feel pain, because such statements are more akin to role-playing than honest representations of their internal states.
I think this makes sense but the way it’s argued for is not great.
1. The user is not interacting with a single dedicated system.
The argument here seems to be: If the user is not interacting with a single dedicated system, then the system shouldn’t be treated as a coherent person.
This is clearly incorrect. Imagine we had the ability to simulate a brain. You could run the same brain simulation across multiple systems. A more hypothetical scenario: you take a group of frozen, identical humans, connect them to a realistic VR simulation, and ensure their experiences are perfectly synchronized. From the user’s perspective, interacting with this setup would feel indistinguishable from interacting with a single coherent person. Furthermore, if the system is subjected to suffering, the suffering would multiply with each instance the experience is replayed. This shows that coherence doesn’t necessarily depend on being a “single” system.
2. An LLM doesn’t clearly distinguish the text it generates from the text the user inputs.
Firstly, this claim isn’t accurate. If you provide an LLM with the transcript of a conversation, it can often identify which parts are its responses and which parts are user inputs. This is an empirically testable claim. Moreover, statements about how LLMs process text don’t necessarily negate the possibility of them being coherent personas. For instance, it’s conceivable that an LLM could function exactly as described and still be a coherent persona.
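The test alluded to here could look roughly like the harness below. `query_llm` is a placeholder for a real model call (asking the model to label a quoted turn as "assistant" or "user"); it is stubbed with a trivial heuristic so the scoring code runs standalone. The transcript is invented for illustration.

```python
def query_llm(prompt):
    # Placeholder for a real LLM API call. The stub just pattern-matches on a
    # phrase typical of assistant turns so the harness is runnable as-is.
    return "assistant" if "As an AI" in prompt else "user"

def labeling_accuracy(transcript):
    """transcript: list of (text, true_role). Returns fraction labeled correctly."""
    correct = 0
    for text, role in transcript:
        guess = query_llm(f"Who wrote this turn, the assistant or the user?\n---\n{text}")
        correct += (guess == role)
    return correct / len(transcript)

# Invented two-turn transcript for illustration:
transcript = [
    ("What's the capital of France?", "user"),
    ("As an AI assistant, I can tell you it's Paris.", "assistant"),
]
print(labeling_accuracy(transcript))  # 1.0
```

With a real model behind `query_llm`, accuracy well above chance on held-out transcripts would support the empirical claim that LLMs can tell their own turns apart from user turns.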
There is an interesting connection between those techniques and “Trapped Priors”: the whole take on human cognition as Bayesian reasoning, with biases as a strong prior. Why would those techniques work (assuming they do)?
I guess something like “Try to speak truth” can make you consider a wide range of connected notions. E.g., you say something like “Climate change is fake” and you start to consider “what would make it true?” Or you just feel (because of your prior) that it is true and ignore any further considerations (in which case the technique doesn’t work).
Do you have any arguments for why this would be more important rather than working on evals of deceptive AI or evals of cybersecurity capabilities? Asking in general, I’m trying to figure out how one should think about prioritizing things like that.
This is an interesting article! I understand the main claim as follows:
There are a number of simple rationality techniques, such as “Don’t make irrelevant personal attacks,” that are both simpler and more effective than complex rationality techniques.
Irrationality regarding moral and political issues is often due to a failure to apply these simple techniques.
If there were a strong social norm towards applying these techniques, people would apply them more consistently.
Therefore, we should focus on creating a social norm that encourages the use of these simple techniques, rather than emphasizing complex rationality techniques, because (implicitly) we want more people to be rational about moral and political issues.
An additional claim is that we typically focus on the “fun” parts of rationality, like self-improvement, instead of the simple but important aspects because they are less enjoyable. For example, discipline and restraint are harder to practice than self-improvement.
I assume this extra claim refers to the rationality community or the EA community.
So, the main point is essentially that rationality is mundane and simple (though not easy!), and we shouldn’t try to make it more complex than it really is. This perspective is quite refreshing, and I’ve had some similar thoughts!
However, I’m concerned that, even though people might know about these techniques, the emotionally charged nature of political and moral topics can make it difficult to apply them. It’s not necessarily the other way around. Also, while I’m not sure if you would label these as complex or not, sometimes it takes time to figure out what you actually want in life, and this requires “complex” techniques.
I just want to flag that I raised the issue of inconsistent discount rates (if by “the discount rate in the GBD data” you mean the 3% or 4% discount rate in the standard inputs table) in an email sent a few days ago to one of the CE employees. Unfortunately, we failed to have a productive discussion; the conversation died when CE stopped responding. Here is one of the emails I sent:
Hi [name],
I might be wrong, but you are using a 1.4% rate in the CEA, while the value of life saved at various ages is copied from GiveWell’s standard inputs, which use a 4% discount rate to calculate that value. Isn’t this an inconsistency?
Mikolaj
I might have been too directive when writing this post. I lack the organizational context and knowledge of how CEAs are used to say definitively that this should be changed. I ultimately agree that this is a small change that might not affect the decisions made, and it’s up to you to decide whether to account for it. However, some of the points you raised against updating this are incorrect.
I might have focused too much on the 10% reduction, while the real issue, as Elliot mentioned, is that you ignore two variables in the formula for DALYs averted:
Missing out on three 10% reductions in error X compounds to a difference of 1 − 0.9^3 ≈ 27.1%, which could be significant. I generally view organizations as growing through small iterative changes and optimization rather than big leaps.
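The compounding arithmetic is worth making explicit: three independent 10% reductions leave 0.9^3 of the original value, not a 30% (or 0.1^3) change.

```python
# Three successive 10% reductions compound multiplicatively.
remaining = 0.9 ** 3            # fraction of the original value left: 0.729
total_reduction = 1 - remaining # total loss: ~0.271
print(round(total_reduction * 100, 1))  # 27.1
```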
My critique is only valid if you are trying to measure DALYs averted. If you choose to do something similar to GiveWell, which is more arbitrary, then it might not make sense to adjust for this anymore.
The three changes to the value of life saved come from different frameworks:
GiveWell values don’t represent DALYs averted but are mixed with other factors such as survey results.
HLI’s work is based on the assumption that death isn’t the worst possible state and that there is a baseline quality of life that must be met for a life to be worth living.
The change I’m suggesting is compatible with your current method of estimating the value of life saved. It doesn’t introduce any new assumptions; it simply makes some assumptions explicit. Unless you state something like, “We used those values initially but then detached them from their original formulas and now we will update them in another way,” my suggestion should fit within your existing framework.
EDIT:
I can’t say much about the GiveWell 1.5% rate. I’ve heard it comes from the Rethink Priorities review, but that review suggests a 4.3% discount rate. Can you direct me somewhere where I can read more about it?
I agree, this probably wouldn’t change much, but it is a change that applies to a lot of CEAs and is in some ways a straightforward and safe one.
This seems false to me. I agree that earning to give should be highly rewarded and so on, but I don’t think that, for example, launching an effective giving organization requires an incredible amount of talent. There have been many launched recently, either by CE or local groups (I was part of the team that launched one in Denmark). Recently, EAIF said that they are not funding-constrained, and there are a lot of projects being funded on Manifund. It looks more like funders are looking for new projects to fund. So either most of the funders are wrong in their assessment and should just grant to existing opportunities, or there is still room for new projects.
If anything, my experience was that the bar for direct work is way lower than I expected, and part of the reason I thought otherwise was comments like this one.