I personally have no stake in defending Conjecture (in fact, I have some questions about the CoEm agenda), but I do think there are a couple of points in your critique that feel misleading or wrong to me.
1. Confidence (meta point): I do not understand where the confidence with which you write the post (or at least how I read it) comes from. I’ve never worked at Conjecture (and presumably you didn’t either), but even I can see that some of your critique is outdated or feels like a misrepresentation of their work to me (see below). For example, making recommendations such as “freezing the hiring of all junior people” or “alignment people should not join Conjecture” requires an extremely high bar of evidence in my opinion. I think it is totally reasonable for people who believe in the CoEm agenda to join Conjecture, and while Connor has a personality that might not be a great fit for everyone, I could totally imagine working with him productively. Furthermore, making a claim about how and when to hire usually requires a lot of context and depends on many factors, most of which an outsider probably can’t judge. Given that you state early on that you are an experienced member of the alignment community, and given that your post suggests you did rigorous research to back up these claims, I think people will put a lot of weight on this post, and it does not feel like you use your power responsibly here. I can very well imagine a less experienced person who is currently looking for positions in the alignment space coming away from this post thinking “well, I shouldn’t apply to Conjecture then,” and that feels unjustified to me.
2. Output so far: My understanding of Conjecture’s research agenda so far was roughly: “They started with Polytopes as a big project and published it eventually. On reflection, they were unhappy with the speed and quality of their work (as stated in their reflection post) and decided to change their research strategy. Every two weeks or so, they started a new research sprint in search of a really promising agenda. Then they wrote up their results in a preliminary report and continued with another project if their findings weren’t sufficiently promising.” In most of their public posts, they stated that these are preliminary findings and should be treated with caution, etc. Therefore, I think it’s unfair to say that most of their posts do not meet the bar of a conference publication, because that wasn’t the intended goal. Furthermore, I think it’s actually really good that the alignment field is willing to break academic norms and publish preliminary findings. Usually, this makes it much easier to engage with and criticize work earlier and thus improves overall output quality. On a meta-level, I think it’s bad to criticize labs that pursue hits-based research approaches for their early output (I also think this applies to your critique of Redwood), because the entire point is that you don’t find a lot until you hit. These kinds of critiques make it more likely that people follow small incremental research agendas and alignment just becomes academia 2.0. When you make a critique like that, at least acknowledge that hits-based research might be the right approach.
3. Your statements about the VCs seem unjustified to me. How do you know they are not aligned? How do you know they wouldn’t support Conjecture doing mostly safety work? How do you know what the VCs were promised in their private conversations with the Conjecture leadership team? Have you talked to the VCs or asked them for a statement? Of course, you’re free to speculate from the outside but my understanding is that Conjecture actually managed to choose fairly aligned investors who do understand the mission of solving catastrophic risks. I haven’t talked to the VCs either, but I’ve at least asked people who work(ed) at Conjecture.
In conclusion: 1. I think writing critiques is good but really hard without insider knowledge and context. 2. I think this piece will actually (partially) misinform a large part of the community. You can see this already in the comments, where people without context say this is a good piece and thank you for “all the insights” (update: I misunderstood a comment and don’t think my original phrasing applies anymore). 3. The EA/LW community seems to be very eager to value critiques highly, and for good reason. But whenever people use critiques to spread (partially) misleading information, they should be called out. 4. That being said, I think your critique is partially warranted and things could have gone a lot better at Conjecture. It’s just important to distinguish between “could have gone a lot better” and “we recommend not to work for Conjecture,” or adding some half-truths to the warranted critiques. 5. I think your post on Redwood was better but suffered from some of the same problems. Especially the fact that you criticize them for not having enough tangible output while following a hits-based agenda just seems counterproductive to me.
It seems to me you have two points on the content of this critique. The first point:
I think it’s bad to criticize labs that do hits-based research approaches for their early output (I also think this applies to your critique of Redwood) because the entire point is that you don’t find a lot until you hit.
I’m pretty confused here. How exactly do you propose that funding decisions get made? If some random person says they are pursuing a hits-based approach to research, should EA funders be obligated to fund them?
Presumably you would want to say “the team will be good at hits-based research such that we can expect a future hit, for X, Y and Z reasons”. I think you should actually say those X, Y and Z reasons so that the authors of the critique can engage with them; I assume that the authors are implicitly endorsing a claim like “there aren’t any particularly strong reasons to expect Conjecture to do more impactful work in the future”.
The second point:
Your statements about the VCs seem unjustified to me. How do you know they are not aligned? [...] I haven’t talked to the VCs either, but I’ve at least asked people who work(ed) at Conjecture.
Hmm, it seems extremely reasonable to me to take as a baseline prior that the VCs are profit-motivated, and the authors explicitly say
We have heard credible complaints of this from their interactions with funders. One experienced technical AI safety researcher recalled Connor saying that he will tell investors that they are very interested in making products, whereas the predominant focus of the company is on AI safety.
The fact that people who work(ed) at Conjecture say otherwise means that (probably) someone is wrong, but I don’t see a strong reason to believe that it’s the OP who is wrong.
At the meta level you say:
I do not understand where the confidence with which you write the post (or at least how I read it) comes from.
And in your next comment:
I think we should really make sure that we say true things when we criticize people, quantify our uncertainty, differentiate between facts and feelings and do not throw our epistemics out of the window in the process.
But afaict, the only point where you actually disagree with a claim made in the OP (excluding recommendations) is in your assessment of VCs? (And in that case I feel very uncompelled by your argument.)
In what way has the OP failed to say true things? Where should they have had more uncertainty? What things did they present as facts which were actually feelings? What claim have they been confident about that they shouldn’t have been confident about?
(Perhaps you mean to say that the recommendations are overconfident. There I think I just disagree with you about the bar for evidence for making recommendations, including ones as strong as “alignment researchers shouldn’t work at organization X”. I’ve given recommendations like this to individual people who asked me for a recommendation in the past, on less evidence than collected in this post.)
Meta: maybe my comment on the critique reads stronger than intended (see comment with clarifications) and I do agree with some of the criticisms and some of the statements you made. I’ll reflect on where I should have phrased things differently and try to clarify below.
Hits-based research: Obviously results are one evaluation criterion for scientific research. However, especially for hits-based research, I think there are other factors that cannot be neglected. To give a concrete example, if I were asked whether to give a unit under your supervision $10M in grant funding or not, I would obviously look back at your history of results, but a lot of my judgment would be based on my belief in your ability to find meaningful research directions in the future. To a large extent, the funding would be a bet on you and the research process you introduce in a team, and much less on previous results. Obviously, your prior research output is a result of your previous process, but especially in early organizations these can diverge quite a bit. Therefore, I think it is fair to say that both a) the output of Conjecture so far has not been that impressive IMO, and b) their decision to iterate faster and look for more hits in response to early results is actually positive evidence about their expected future output.
Of course, VCs are interested in making money. However, especially if they are angel investors instead of institutional VCs, ideological considerations often play a large role in their investments. In this case, the VCs I’m aware of (not all of which are mentioned in the post, and I’m not sure I can share) actually seem fairly aligned by VC standards to me. Furthermore, the way I read the critique is something like “Connor didn’t tell the VCs about the alignment plans or neglects them in conversation.” However, my impression from conversations with (ex-)staff was that Connor was very direct about their motives to reduce x-risks. I think it’s clear that products are part of their way to address alignment, but to the best of my knowledge, every VC who invested was very aware of what they were getting into. At this point, it’s really hard for me to judge because a) on priors, VCs are profit-seeking, and b) different sources said different things, some of which are mutually exclusive. I don’t have enough insight to confidently say who is right here. I’m mainly saying the confidence of the OP surprised me given my previous discussions.
On recommendations: I have also recommended to people in private not to work at specific organizations. However, this was always conditional on their circumstances. For example, often people aren’t aware of what exactly different safety teams are working on, so conditional on their preferences they should probably not work for lab X. Secondly, I think there is a difference between saying something like this in private, even if it is unconditional, vs. in public. In public, the audience is much larger and has much less context, etc. So I feel like the burden of proof is much higher.
lmk if that makes my position and disagreements clearer.
On hits-based research: I certainly agree there are other factors to consider in making a funding decision. I’m just saying that you should talk about those directly instead of criticizing the OP for looking at whether their research was good or not.
(In your response to OP you talk about a positive case for the work on simulators, SVD, and sparse coding—that’s the sort of thing that I would want to see, so I’m glad to see that discussion starting.)
On VCs: Your position seems reasonable to me (though so does the OP’s position).
On recommendations: Fwiw I also make unconditional recommendations in private. I don’t think this is unusual, e.g. I think many people make unconditional recommendations not to go into academia (though I don’t).
I don’t really buy that the burden of proof should be much higher in public. Reversing the position, do you think the burden of proof should be very high for anyone to publicly recommend working at lab X? If not, what’s the difference between a recommendation to work at org X vs an anti-recommendation (i.e. recommendation not to work at org X)? I think the three main considerations I’d point to are:
(Pro-recommendations) It’s rare for people to do things (relative to not doing things), so we differentially want recommendations vs anti-recommendations, so that it is easier for orgs to start up and do things.
(Anti-recommendations) There are strong incentives to recommend working at org X (obviously org X itself will do this), but no incentives to make the opposite recommendation (and in fact usually anti-incentives). Similarly I expect that inaccuracies in the case for the not-working recommendation will be pointed out (by org X), whereas inaccuracies in the case for working will not be pointed out. So we differentially want to encourage the opposite recommendations in order to get both sides of the story by lowering our “burden of proof”.
(Pro-recommendations) Recommendations have a nice effect of getting people excited and positive about the work done by the community, which can make people more motivated, whereas the same is not true of anti-recommendations.
Overall I think point 2 feels most important, and so I end up thinking that the burden of proof on critiques / anti-recommendations should be lower than the burden of proof on recommendations—and the burden of proof on recommendations is approximately zero. (E.g. if someone wrote a public post recommending Conjecture without any concrete details of why—just something along the lines of “it’s a great place doing great work”—I don’t think anyone would say that they were using their power irresponsibly.)
I would actually prefer a higher burden of proof on recommendations, but given the status quo if I’m only allowed to affect the burden of proof on anti-recommendations I’d probably want it to go down to ~zero. Certainly I’d want it to be well below the level that this post meets.
Hmm, yeah. I actually think you changed my mind on the recommendations. My new position is something like: 1. There should not be a higher burden on anti-recommendations than pro-recommendations. 2. Both pro- and anti-recommendations should come with caveats and conditionals whenever they make a difference to the target audience. 3. I’m now more convinced that the anti-recommendation of OP was appropriate. 4. I’d probably still phrase it differently than they did but my overall belief went from “this was unjustified” to “they should have used different wording” which is a substantial change in position. 5. In general, the context in which you make a recommendation still matters. For example, if you make a public comment saying “I’d probably not recommend working for X” the severity feels different than “I collected a lot of evidence and wrote this entire post and now recommend against working for X”. But I guess that just changes the effect size and not really the content of the recommendation.
We appreciate your detailed reply outlining your concerns with the post.
Our understanding is that your key concern is that we are judging Conjecture based on their current output, whereas since they are pursuing a hits-based strategy we should expect in the median case for them to not have impressive output. In general, we are excited by hits-based approaches, but we echo Rohin’s point: how are we meant to evaluate organizations if not by their output? It seems healthy to give promising researchers sufficient runway to explore, but $10 million and a team of twenty seems on the higher end of what we would want to see supported purely on the basis of speculation. What would you suggest as the threshold where we should start to expect to see results from organizations?
We are unsure where else you disagree with our evaluation of their output. If we understand correctly, you agree that their existing output has not been that impressive, but think that it is positive they were willing to share preliminary findings and that we have too high a bar for evaluating such output. We’ve generally not found their preliminary findings to significantly update our views, whereas we would for example be excited by rigorous negative results that save future researchers from going down dead-ends. However, if you’ve found engaging with their output to be useful to your research then we’d certainly take that as a positive update.
Your second key concern is that we provide limited evidence for our claims regarding the VCs investing in Conjecture. Unfortunately for confidentiality reasons we are limited in what information we can disclose: it’s reasonable if you wish to consequently discount this view. As Rohin said, it is normal for VCs to be profit-seeking. We do not mean to imply these VCs are unusually bad for VCs, just that their primary focus will be the profitability of Conjecture, not safety impact. For example, Nat Friedman has expressed skepticism of safety (e.g. this Tweet) and is a strong open-source advocate, which seems at odds with Conjecture’s info-hazard policy.
We have heard from multiple sources that Conjecture has pitched VCs on a significantly more product-focused vision than they are pitching EAs. These sources have either spoken directly to VCs, or have spoken to Conjecture leadership who were part of negotiation with VCs. Given this, we are fairly confident on the point that Conjecture is representing themselves differently to separate groups.
We believe your third key concern is our recommendations are over-confident. We agree there is some uncertainty, but think it is important to make actionable recommendations, and based on the information we have our sincerely held belief is that most individuals should not work at Conjecture. We would certainly encourage individuals to consider alternative perspectives (including expressed in this comment) and to ultimately make up their own mind rather than deferring, especially to an anonymous group of individuals!
Separately, I think we might consider the opportunity cost of working at Conjecture higher than you do. In particular, we’d generally evaluate skill-building routes fairly highly: for example, being a research assistant or PhD student in academia, or working in an ML engineering position on an applied team at a major tech company. These are generally close to capabilities-neutral, and can make individuals vastly more productive. Given the limited information on CoEm it’s hard to assess whether it will or won’t work, but we think there’s ample evidence that there are better places to develop skills than Conjecture.
We wholeheartedly agree that it is important to maintain high epistemic standards during the critique. We have tried hard to differentiate between well-established facts, our observations from sources, and our opinion formed from those. For example, the About Conjecture section focuses on facts; the Criticisms and Suggestions section includes our observations and opinions; and Our Views on Conjecture are more strongly focused on our opinions. We’d welcome feedback on any areas where you feel we over-claimed.
Meta: Thanks for taking the time to respond. I think your questions are in good faith and address my concerns, I do not understand why the comment is downvoted so much by other people.
1. Obviously output is a relevant factor to judge an organization, among others. However, especially in hits-based approaches, the ultimate thing we want to judge is the process that generates the outputs, to make an estimate about the chance of finding a hit. For example, a cynic might say “what has ARC Theory achieved so far? They wrote some nice framings of the problem, e.g. with ELK and heuristic arguments, but what have they ACtUaLLy achieved?” To which my answer would be: I believe in them because I think the process they are following makes sense, and there is a chance they would find a really big-if-true result in the future. In the limit, process and results converge, but especially early on they might diverge. And I personally think that Conjecture did respond reasonably to their early results by iterating faster and looking for hits. 2. I actually think their output is better than you make it look. The entire Simulators framing made a huge difference for lots of people, and writing up things that are already “known” among a handful of LLM experts is still an important contribution, though I would argue most LLM experts did not think about the details as much as Janus did. I also think that their preliminary research outputs are pretty valuable. The stuff on SVDs and sparse coding actually influenced a number of independent researchers I know (so much that they changed their research direction to it), and I thus think it was a valuable contribution. I’d still say it was less influential than e.g. toy models of superposition or causal scrubbing, but neither of those was done by like 3 people in two weeks. 3. (copied from response to Rohin): Of course, VCs are interested in making money. However, especially if they are angel investors instead of institutional VCs, ideological considerations often play a large role in their investments.
In this case, the VCs I’m aware of (not all of which are mentioned in the post, and I’m not sure I can share) actually seem fairly aligned by VC standards to me. Furthermore, the way I read the critique is something like “Connor didn’t tell the VCs about the alignment plans or neglects them in conversation.” However, my impression from conversations with (ex-)staff was that Connor was very direct about their motives to reduce x-risks. I think it’s clear that products are part of their way to address alignment, but to the best of my knowledge, every VC who invested was very aware of what they were getting into. At this point, it’s really hard for me to judge because a) on priors, VCs are profit-seeking, and b) different sources said different things, some of which are mutually exclusive. I don’t have enough insight to confidently say who is right here. I’m mainly saying your confidence surprised me given my previous discussions with staff. 4. Regarding confidence: For example, I think saying “We think there are better places to work at than Conjecture” would feel much more appropriate than “we advise against...”. Maybe that’s just me. I just felt like many statements are presented with a lot of confidence given the amount of insight you seem to have, and I would have wanted them to be a bit more hedged and less confident. 5. Sure, for many people other opportunities might be a better fit. But I’m not sure I would e.g. support the statement that a general ML engineer would learn more in general industry than at Conjecture. I also don’t know a lot about CoEm, but that would lead me to make weaker statements than recommending against it.
Thanks for engaging with my arguments. I personally think many of your criticisms hit relevant points, and a more hedged and less confident version of your post would actually have had more impact on me if I were still looking for a job. As it is currently written, it loses some persuasive power with me because I feel like you’re making overly broad, unqualified statements, which intuitively made me a bit skeptical of your true intentions. Most of me thinks that you’re trying to point out important criticism, but there is a nagging feeling that it is a hit piece. Intuitively, I’m very averse to anything that looks like a click-bait hit piece by a journalist with a clear agenda. I’m not saying you should only consider me as your audience; I just want to describe the impression I got from the piece.
We appreciate you sharing your impression of the post. It’s definitely valuable for us to understand how the post was received, and we’ll be reflecting on it for future write-ups.
1) We agree it’s worth taking into account aspects of an organization other than their output. Part of our skepticism towards Conjecture – and we should have made this more explicit in our original post (and will be updating it) – is the limited research track record of their staff, including their leadership. By contrast, even if we accept for the sake of argument that ARC has produced limited output, Paul Christiano has a clear track record of producing useful conceptual insights (e.g. Iterated Distillation and Amplification) as well as practical advances (e.g. Deep RL From Human Preferences) prior to starting work at ARC. We’re not aware of any equally significant advances from Connor or other key staff members at Conjecture; we’d be interested to hear if you have examples of their pre-Conjecture output you find impressive.
We’re not particularly impressed by Conjecture’s process, although it’s possible we’d change our mind if we knew more about it. Maintaining high velocity in research is certainly a useful component, but hardly sufficient. The Builder/Breaker method proposed by ARC feels closer to a complete methodology. But this doesn’t feel like the crux for us: if Conjecture copied ARC’s process entirely, we’d still be much more excited about ARC (per-capita). Research productivity is a product of a large number of factors, and explicit process is an important but far from decisive one.
In terms of the explicit comparison with ARC, we would like to note that ARC Theory’s team size is an order of magnitude smaller than Conjecture’s. Based on ARC’s recent hiring post, our understanding is that the theory team consists of just three individuals: Paul Christiano, Mark Xu and Jacob Hilton. If ARC had a team ten times larger and had spent close to $10 million, then we would indeed be disappointed if there were not more concrete wins.
2) Thanks for the concrete examples, this really helps tease apart our disagreement.
We are overall glad that the Simulators post was written. Our view is that it could have been much stronger had it been clearer which claims were empirically supported versus hypotheses. Continuing the comparison with ARC, we found ELK to be substantially clearer and a deeper insight. Admittedly ELK is one of the outputs people in the TAIS community are most excited by so this is a high bar.
The stuff on SVDs and sparse coding [...] was a valuable contribution. I’d still say it was less influential than e.g. toy models of superposition or causal scrubbing but neither of these were done by like 3 people in two weeks.
This sounds similar to our internal evaluation. We’re a bit confused by why “3 people in two weeks” is the relevant reference class. We’d argue the costs of Conjecture’s “misses” need to be accounted for, not just their “hits”. Redwood’s team size and budget are comparable to those of Conjecture, so if you think that causal scrubbing is more impressive than Conjecture’s other outputs, then it sounds like you agree with us that Redwood was more impressive than Conjecture (unless you think the Simulators post is head and shoulders above Redwood’s other output)?
Thanks for sharing the data point this influenced independent researchers. That’s useful to know, and updates us positively. Are you excited by those independent researchers’ new directions? Is there any output from those researchers you’d suggest we review?
3) We remain confident in our sources regarding Conjecture’s discussions with VCs, although it’s certainly conceivable that Conjecture was more open with some VCs than others. To clarify, we are not claiming that Connor or others at Conjecture did not mention anything about their alignment plans or interest in x-risk to VCs (indeed, this would be a barely tenable position for them given their public discussion of these plans), simply that their pitch gave the impression that Conjecture was primarily focused on developing products. It’s reasonable for you to be skeptical of this if your sources at Conjecture disagree; we would be interested to know how close to the negotiations those staff were, although we understand this may not be something you can share.
4) We think your point is reasonable. We plan to reflect on this recommendation and will reply here when we have an update.
5) This certainly depends on what “general industry” refers to: a research engineer at Conjecture might well be better for ML skill-building than, say, being a software engineer at Walmart. But we would expect ML teams at top tech companies, or working with relevant professors, to be significantly better for skill-building. Generally we expect quality of mentorship to be one of the most important components of individuals developing as researchers and engineers. The Conjecture team is stretched thin as a result of rapid scaling, and had few experienced researchers or engineers on staff in the first place. By contrast, ML teams at top tech companies will typically have a much higher fraction of senior researchers and engineers, and professors at leading universities comprise some of the best researchers in the field. We’d be curious to hear your case for Conjecture as skill building; without that it’s hard to identify where our main disagreement lies.
I’ll only briefly reply because I feel like I’ve said most of what I wanted to say. 1) Mostly agree but that feels like part of the point I’m trying to make. Doing good research is really hard, so when you don’t have a decade of past experience it seems more important how you react to early failures than whether you make them. 2) My understanding is that only about 8 people were involved with the public research outputs and not all of them were working on these outputs all the time. So the 1 OOM in contrast to ARC feels more like a 2x-4x. 3) Can’t share. 4) Thank you. Hope my comments helped. 5) I just asked a bunch of people who work(ed) at Conjecture and they said they expect the skill building to be better for a career in alignment than e.g. working with a non-alignment team at Google.
Some clarifications on the comment: 1. I strongly endorse critique of organisations in general, and especially within the EA space. I think it’s good that we as a community have the norm of embracing critiques. 2. I personally have my criticisms of Conjecture, and my comment should not be seen as “everything’s great at Conjecture, nothing to see here!”. In fact, my main criticisms, of the leadership style and of CoEm not being the most effective thing they could do, are also represented prominently in this post. 3. I’d also be fine with the authors of this post saying something like “I have a strong feeling that something is fishy at Conjecture, here are the reasons for this feeling”. Or they could also clearly state which things are known and which things are mostly intuitions. 4. However, I think we should really make sure that we say true things when we criticize people, quantify our uncertainty, differentiate between facts and feelings, and do not throw our epistemics out of the window in the process. 5. My main problem with the post is that they make a list of specific claims with high confidence, and I think that is not warranted given the evidence I’m aware of. That’s all.
You can see this already in the comments where people without context say this is a good piece and thanking you for “all the insights”.
FWIW I Control-F’d for “all the insights” and did not see any other hit on this page other than your comment.
EDIT 2023/06/14: Hmm, so I’ve since read all the comments on this post on both the EAF and LW[1], and I don’t think your sentence was an accurate paraphrase for any of the comments on this post?
For context, the most positive comment on this post is probably mine, and astute readers might note that my comment was process-oriented rather than talking about quantity of insights.
The comment I was referring to was in fact yours. After re-reading your comment and my statement, I think I misunderstood your comment originally. I thought it was not only praising the process but also the content itself. Sorry about that.
I updated my comment accordingly to indicate my misunderstanding.
The “all the insights” was not meant as a literal quote but more as a cynical way of saying it. In hindsight, this was obviously bound to be misunderstood and I should have phrased it differently.
Thanks for the correction! I also appreciate the polite engagement.
As a quick clarification, I’m not a stickler for exact quotes (though e.g. the APA is), but I do think it’s important for paraphrases to be accurate.
I’ll also endeavor to make my own comments harder to misinterpret going forward, to minimize future misunderstandings.
To be clear, I also appreciate the content of this post, but more because it either brought new information to my attention or summarized information I was aware of in one place, rather than because it offered particularly novel insights.
Could you say a bit more about your statement that “making recommendations such as . . . . ‘alignment people should not join Conjecture’ require an extremely high bar of evidence in my opinion”?
The poster stated that there are “more impactful places to work” and listed a number of them; shouldn’t they say that if they believe it is more likely true than not? They have stated their reasons; the reader can decide whether they are well-supported. The statement that Conjecture seems “relatively weak for skill building” seems supported by reasonable grounds. And the author’s characterization of the likelihood that Conjecture is net-negative is merely “plausible”. That low bar seems hard to argue with; the base rate of for-profit companies without any known special governance safeguards acting like for-profit companies usually do (i.e., in a profit-maximizing manner) is not low.
Maybe we’re getting too much into semantics here, but I would have found a headline of “we believe there are better places to work at” much more appropriate for the kind of statement they are making.
1. A blanket unconditional statement like this seems unjustified. Like I said before, if you believe in CoEm, Conjecture probably is the right place to work.
2. Where does the “relatively weak for skill building” come from? A lot of their research isn’t public, a lot of engineering skills are not very visible from the outside, etc. Why didn’t they just ask the many EA-aligned employees at Conjecture what they thought of the skills they learned? That seems like such an easy way to correct for a potential mischaracterization.
3. Almost all AI alignment organizations are “plausibly” net negative. What if ARC Evals underestimates their gain-of-function research? What if Redwood’s advances in interpretability lead to massive capability gains? What if CAIS’s efforts with the letter had backfired and rallied everyone against AI safety? This bar is basically meaningless without expected values.
Does that clarify where my skepticism comes from? Also, once again, my arguments should not be seen as a recommendation for Conjecture. I do agree with many of the criticisms made in the post.
I personally have no stake in defending Conjecture (in fact, I have some questions about the CoEm agenda), but I do think there are a couple of points in your critique that feel misleading or wrong to me.
1. Confidence (meta point): I do not understand where the confidence with which you write the post (or at least how I read it) comes from. I’ve never worked at Conjecture (and presumably you didn’t either), but even I can see that some of your critique is outdated or feels like a misrepresentation of their work to me (see below). For example, making recommendations such as “freezing the hiring of all junior people” or “alignment people should not join Conjecture” requires an extremely high bar of evidence in my opinion. I think it is totally reasonable for people who believe in the CoEm agenda to join Conjecture, and while Connor has a personality that might not be a great fit for everyone, I could totally imagine working with him productively. Furthermore, making a claim about how and when to hire usually requires a lot of context and depends on many factors, most of which an outsider probably can’t judge.
Given that you state early on that you are an experienced member of the alignment community and your post suggests that you did rigorous research to back up these claims, I think people will put a lot of weight on this post and it does not feel like you use your power responsibly here.
I can very well imagine a less experienced person who is currently looking for positions in the alignment space to go away from this post thinking “well, I shouldn’t apply to Conjecture then” and that feels unjustified to me.
2. Output so far: My understanding of Conjecture’s research agenda so far was roughly: “They started with Polytopes as a big project and published it eventually. On reflection, they were unhappy with the speed and quality of their work (as stated in their reflection post) and decided to change their research strategy. Every two weeks or so, they started a new research sprint in search of a really promising agenda. Then, they wrote up their results in a preliminary report and continued with another project if their findings weren’t sufficiently promising.” In most of their public posts, they stated that these are preliminary findings and should be treated with caution, etc. Therefore, I think it’s unfair to say that most of their posts do not meet the bar of a conference publication, because that wasn’t the intended goal.
Furthermore, I think it’s actually really good that the alignment field is willing to break academic norms and publish preliminary findings. Usually, this makes it much easier to engage with and criticize work earlier and thus improves overall output quality.
On a meta-level, I think it’s bad to criticize labs that do hits-based research approaches for their early output (I also think this applies to your critique of Redwood) because the entire point is that you don’t find a lot until you hit. These kinds of critiques make it more likely that people follow small incremental research agendas and alignment just becomes academia 2.0. When you make a critique like that, at least acknowledge that hits-based research might be the right approach.
3. Your statements about the VCs seem unjustified to me. How do you know they are not aligned? How do you know they wouldn’t support Conjecture doing mostly safety work? How do you know what the VCs were promised in their private conversations with the Conjecture leadership team? Have you talked to the VCs or asked them for a statement?
Of course, you’re free to speculate from the outside, but my understanding is that Conjecture actually managed to choose fairly aligned investors who do understand the mission of solving catastrophic risks. I haven’t talked to the VCs either, but I’ve at least asked people who work(ed) at Conjecture.
In conclusion:
1. I think writing critiques is good but really hard without insider knowledge and context.
2. I think this piece will actually (partially) misinform a large part of the community. You can see this already in the comments, where people without context say this is a good piece and thank you for “all the insights” (update: I misunderstood a comment and don’t think my original phrasing applies anymore).
3. The EA/LW community seems to be very eager to value critiques highly, and for good reason. But whenever people use critiques to spread (partially) misleading information, they should be called out.
4. That being said, I think your critique is partially warranted and things could have gone a lot better at Conjecture. It’s just important to distinguish between “could have gone a lot better” and “we recommend not to work for Conjecture”, and not to add half-truths to the warranted critiques.
5. I think your post on Redwood was better but suffered from some of the same problems. Especially the fact that you criticize them for having not enough tangible output when following a hits-based agenda just seems counterproductive to me.
I’m not very compelled by this response.
It seems to me you have two points on the content of this critique. The first point:
I’m pretty confused here. How exactly do you propose that funding decisions get made? If some random person says they are pursuing a hits-based approach to research, should EA funders be obligated to fund them?
Presumably you would want to say “the team will be good at hits-based research such that we can expect a future hit, for X, Y and Z reasons”. I think you should actually say those X, Y and Z reasons so that the authors of the critique can engage with them; I assume that the authors are implicitly endorsing a claim like “there aren’t any particularly strong reasons to expect Conjecture to do more impactful work in the future”.
The second point:
Hmm, it seems extremely reasonable to me to take as a baseline prior that the VCs are profit-motivated, and the authors explicitly say
The fact that people who work(ed) at Conjecture say otherwise means that (probably) someone is wrong, but I don’t see a strong reason to believe that it’s the OP who is wrong.
At the meta level you say:
And in your next comment:
But afaict, the only point where you actually disagree with a claim made in the OP (excluding recommendations) is in your assessment of VCs? (And in that case I feel very uncompelled by your argument.)
In what way has the OP failed to say true things? Where should they have had more uncertainty? What things did they present as facts which were actually feelings? What claim have they been confident about that they shouldn’t have been confident about?
(Perhaps you mean to say that the recommendations are overconfident. There I think I just disagree with you about the bar for evidence for making recommendations, including ones as strong as “alignment researchers shouldn’t work at organization X”. I’ve given recommendations like this to individual people who asked me for a recommendation in the past, on less evidence than collected in this post.)
Good comment, consider cross-posting to LW?
Meta: maybe my comment on the critique reads stronger than intended (see comment with clarifications) and I do agree with some of the criticisms and some of the statements you made. I’ll reflect on where I should have phrased things differently and try to clarify below.
Hits-based research: Obviously results are one evaluation criterion for scientific research. However, especially for hits-based research, I think there are other factors that cannot be neglected. To give a concrete example, if I were asked whether to give a unit under your supervision $10M in grant funding or not, I would obviously look back at your history of results, but a lot of my judgment would be based on my belief in your ability to find meaningful research directions in the future. To a large extent, the funding would be a bet on you and the research process you introduce in a team, and much less on previous results. Obviously, your prior research output is a result of your previous process, but especially in early organizations the two can diverge quite a bit. Therefore, I think it is fair to say both that a) the output of Conjecture so far has not been that impressive IMO, and that b) their response to early results, iterating faster and looking for more hits, is actually positive evidence about their expected future output.
Of course, VCs are interested in making money. However, especially if they are angel investors instead of institutional VCs, ideological considerations often play a large role in their investments. In this case, the VCs I’m aware of (not all of which are mentioned in the post, and I’m not sure I can share the rest) actually seem fairly aligned for VC standards to me. Furthermore, the way I read the critique is something like “Connor didn’t tell the VCs about the alignment plans or neglects them in conversation”. However, my impression from conversations with (ex-)staff was that Connor was very direct about their motives to reduce x-risks. I think it’s clear that products are a part of their way to address alignment, but to the best of my knowledge, every VC who invested was very aware of what they were getting into. At this point, it’s really hard for me to judge, because I think that a) on priors, VCs are profit-seeking, and b) different sources said different things, some of which are mutually exclusive. I don’t have enough insight to confidently say who is right here. I’m mainly saying that the confidence of the OP surprised me given my previous discussions.
On recommendations: I have also recommended in private that people not work at specific organizations. However, this was always conditional on their circumstances. For example, people often aren’t aware of what exactly different safety teams are working on, so conditional on their preferences they should probably not work for lab X. Secondly, I think there is a difference between saying something like this in private, even if it is unconditional, and saying it in public. In public, the audience is much larger and has much less context, etc. So I feel like the burden of proof is much higher.
lmk if that makes my position and disagreements clearer.
On hits-based research: I certainly agree there are other factors to consider in making a funding decision. I’m just saying that you should talk about those directly instead of criticizing the OP for looking at whether their research was good or not.
(In your response to OP you talk about a positive case for the work on simulators, SVD, and sparse coding—that’s the sort of thing that I would want to see, so I’m glad to see that discussion starting.)
On VCs: Your position seems reasonable to me (though so does the OP’s position).
On recommendations: Fwiw I also make unconditional recommendations in private. I don’t think this is unusual, e.g. I think many people make unconditional recommendations not to go into academia (though I don’t).
I don’t really buy that the burden of proof should be much higher in public. Reversing the position, do you think the burden of proof should be very high for anyone to publicly recommend working at lab X? If not, what’s the difference between a recommendation to work at org X vs an anti-recommendation (i.e. recommendation not to work at org X)? I think the three main considerations I’d point to are:
1. (Pro-recommendations) It’s rare for people to do things (relative to not doing things), so we differentially want recommendations vs anti-recommendations, so that it is easier for orgs to start up and do things.
2. (Anti-recommendations) There are strong incentives to recommend working at org X (obviously org X itself will do this), but no incentives to make the opposite recommendation (and in fact usually anti-incentives). Similarly, I expect that inaccuracies in the case for the not-working recommendation will be pointed out (by org X), whereas inaccuracies in the case for working will not be pointed out. So we differentially want to encourage the opposite recommendations, in order to get both sides of the story, by lowering our “burden of proof”.
3. (Pro-recommendations) Recommendations have a nice effect of getting people excited and positive about the work done by the community, which can make people more motivated, whereas the same is not true of anti-recommendations.
Overall I think point 2 feels most important, and so I end up thinking that the burden of proof on critiques / anti-recommendations should be lower than the burden of proof on recommendations—and the burden of proof on recommendations is approximately zero. (E.g. if someone wrote a public post recommending Conjecture without any concrete details of why—just something along the lines of “it’s a great place doing great work”—I don’t think anyone would say that they were using their power irresponsibly.)
I would actually prefer a higher burden of proof on recommendations, but given the status quo if I’m only allowed to affect the burden of proof on anti-recommendations I’d probably want it to go down to ~zero. Certainly I’d want it to be well below the level that this post meets.
Hmm, yeah. I actually think you changed my mind on the recommendations. My new position is something like:
1. There should not be a higher burden on anti-recommendations than pro-recommendations.
2. Both pro- and anti-recommendations should come with caveats and conditionals whenever they make a difference to the target audience.
3. I’m now more convinced that the anti-recommendation of OP was appropriate.
4. I’d probably still phrase it differently than they did but my overall belief went from “this was unjustified” to “they should have used different wording” which is a substantial change in position.
5. In general, the context in which you make a recommendation still matters. For example, if you make a public comment saying “I’d probably not recommend working for X” the severity feels different than “I collected a lot of evidence and wrote this entire post and now recommend against working for X”. But I guess that just changes the effect size and not really the content of the recommendation.
:) I’m glad we got to agreement!
(Or at least significantly closer, I’m sure there are still some minor differences.)
We appreciate your detailed reply outlining your concerns with the post.
Our understanding is that your key concern is that we are judging Conjecture based on their current output, whereas since they are pursuing a hits-based strategy we should expect in the median case for them to not have impressive output. In general, we are excited by hits-based approaches, but we echo Rohin’s point: how are we meant to evaluate organizations if not by their output? It seems healthy to give promising researchers sufficient runway to explore, but $10 million and a team of twenty seems on the higher end of what we would want to see supported purely on the basis of speculation. What would you suggest as the threshold at which we should start to expect to see results from organizations?
We are unsure where else you disagree with our evaluation of their output. If we understand correctly, you agree that their existing output has not been that impressive, but think that it is positive they were willing to share preliminary findings and that we have too high a bar for evaluating such output. We’ve generally not found their preliminary findings to significantly update our views, whereas we would for example be excited by rigorous negative results that save future researchers from going down dead-ends. However, if you’ve found engaging with their output to be useful to your research then we’d certainly take that as a positive update.
Your second key concern is that we provide limited evidence for our claims regarding the VCs investing in Conjecture. Unfortunately for confidentiality reasons we are limited in what information we can disclose: it’s reasonable if you wish to consequently discount this view. As Rohin said, it is normal for VCs to be profit-seeking. We do not mean to imply these VCs are unusually bad for VCs, just that their primary focus will be the profitability of Conjecture, not safety impact. For example, Nat Friedman has expressed skepticism of safety (e.g. this Tweet) and is a strong open-source advocate, which seems at odds with Conjecture’s info-hazard policy.
We have heard from multiple sources that Conjecture has pitched VCs on a significantly more product-focused vision than they are pitching EAs. These sources have either spoken directly to VCs, or have spoken to Conjecture leadership who were part of negotiation with VCs. Given this, we are fairly confident on the point that Conjecture is representing themselves differently to separate groups.
We believe your third key concern is our recommendations are over-confident. We agree there is some uncertainty, but think it is important to make actionable recommendations, and based on the information we have our sincerely held belief is that most individuals should not work at Conjecture. We would certainly encourage individuals to consider alternative perspectives (including expressed in this comment) and to ultimately make up their own mind rather than deferring, especially to an anonymous group of individuals!
Separately, I think we might consider the opportunity cost of working at Conjecture higher than you do. In particular, we’d generally evaluate skill-building routes fairly highly: for example, being a research assistant or PhD student in academia, or working in an ML engineering position in an applied team at a major tech company. These are generally close to capabilities-neutral, and can make individuals vastly more productive. Given the limited information on CoEm it’s hard to assess whether it will or won’t work, but we think there’s ample evidence that there are better places to develop skills than Conjecture.
We wholeheartedly agree that it is important to maintain high epistemic standards during the critique. We have tried hard to differentiate between well-established facts, our observations from sources, and our opinion formed from those. For example, the About Conjecture section focuses on facts; the Criticisms and Suggestions section includes our observations and opinions; and Our Views on Conjecture are more strongly focused on our opinions. We’d welcome feedback on any areas where you feel we over-claimed.
Meta: Thanks for taking the time to respond. I think your questions are in good faith and address my concerns, I do not understand why the comment is downvoted so much by other people.
1. Obviously output is a relevant factor among others to judge an organization. However, especially in hits-based approaches, the ultimate thing we want to judge is the process that generates the outputs, to make an estimate of the chance of finding a hit. For example, a cynic might say “what has ARC Theory achieved so far? They wrote some nice framings of the problem, e.g. with ELK and heuristic arguments, but what have they ACtUaLLy achieved?” To which my answer would be: I believe in them because I think the process they are following makes sense and there is a chance that they will find a really big-if-true result in the future. In the limit, process and results converge, but especially early on they might diverge. And I personally think that Conjecture did respond reasonably to their early results by iterating faster and looking for hits.
2. I actually think their output is better than you make it look. The entire simulators framing made a huge difference for lots of people, and writing up things that are already “known” among a handful of LLM experts is still an important contribution, though I would argue most LLM experts did not think about the details as much as Janus did. I also think that their preliminary research outputs are pretty valuable. The stuff on SVDs and sparse coding actually influenced a number of independent researchers I know (so much so that they changed their research direction to it), and I thus think it was a valuable contribution. I’d still say it was less influential than e.g. toy models of superposition or causal scrubbing, but neither of those was done by like 3 people in two weeks.
3. (copied from response to Rohin): Of course, VCs are interested in making money. However, especially if they are angel investors instead of institutional VCs, ideological considerations often play a large role in their investments. In this case, the VCs I’m aware of (not all of which are mentioned in the post, and I’m not sure I can share the rest) actually seem fairly aligned for VC standards to me. Furthermore, the way I read the critique is something like “Connor didn’t tell the VCs about the alignment plans or neglects them in conversation”. However, my impression from conversations with (ex-)staff was that Connor was very direct about their motives to reduce x-risks. I think it’s clear that products are a part of their way to address alignment, but to the best of my knowledge, every VC who invested was very aware of what they were getting into. At this point, it’s really hard for me to judge, because I think that a) on priors, VCs are profit-seeking, and b) different sources said different things, some of which are mutually exclusive. I don’t have enough insight to confidently say who is right here. I’m mainly saying that your confidence surprised me given my previous discussions with staff.
4. Regarding confidence: For example, I think saying “We think there are better places to work at than Conjecture” would feel much more appropriate than “we advise against...”. Maybe that’s just me. I just felt like many statements are presented with a lot of confidence given the amount of insight you seem to have, and I would have wanted them to be a bit more hedged and less confident.
5. Sure, for many people other opportunities might be a better fit. But I’m not sure I would e.g. support the statement that a general ML engineer would learn more in general industry than at Conjecture. I also don’t know a lot about CoEm, but that lack of knowledge would lead me to make weaker statements than recommending against it.
Thanks for engaging with my arguments. I personally think many of your criticisms hit relevant points, and a more hedged and less confident version of your post would actually have had more impact on me if I were still looking for a job. As it is currently written, it loses some persuasive force with me, because I feel like you’re making overly broad, unqualified statements, which intuitively made me a bit skeptical of your true intentions. Most of me thinks that you’re trying to point out important criticism, but there is a nagging feeling that it is a hit piece. Intuitively, I’m very averse to anything that looks like a click-bait hit piece by a journalist with a clear agenda. I’m not saying you should only consider me as your audience; I just want to describe the impression I got from the piece.
We appreciate you sharing your impression of the post. It’s definitely valuable for us to understand how the post was received, and we’ll be reflecting on it for future write-ups.
1) We agree it’s worth taking into account aspects of an organization other than their output. Part of our skepticism towards Conjecture – and we should have made this more explicit in our original post (and will be updating it) – is the limited research track record of their staff, including their leadership. By contrast, even if we accept for the sake of argument that ARC has produced limited output, Paul Christiano has a clear track record of producing useful conceptual insights (e.g. Iterated Distillation and Amplification) as well as practical advances (e.g. Deep RL From Human Preferences) prior to starting work at ARC. We’re not aware of any equally significant advances from Connor or other key staff members at Conjecture; we’d be interested to hear if you have examples of their pre-Conjecture output you find impressive.
We’re not particularly impressed by Conjecture’s process, although it’s possible we’d change our mind if we knew more about it. Maintaining high velocity in research is certainly a useful component, but hardly sufficient. The Builder/Breaker method proposed by ARC feels closer to a complete methodology. But this doesn’t feel like the crux for us: if Conjecture copied ARC’s process entirely, we’d still be much more excited about ARC (per-capita). Research productivity is a product of a large number of factors, and explicit process is an important but far from decisive one.
In terms of the explicit comparison with ARC, we would like to note that ARC Theory’s team size is an order of magnitude smaller than Conjecture’s. Based on ARC’s recent hiring post, our understanding is the theory team consists of just three individuals: Paul Christiano, Mark Xu and Jacob Hilton. If ARC had a team ten times larger and had spent close to $10M, then we would indeed be disappointed if there were not more concrete wins.
2) Thanks for the concrete examples, this really helps tease apart our disagreement.
We are overall glad that the Simulators post was written. Our view is that it could have been much stronger had it been clearer which claims were empirically supported versus hypotheses. Continuing the comparison with ARC, we found ELK to be substantially clearer and a deeper insight. Admittedly ELK is one of the outputs people in the TAIS community are most excited by so this is a high bar.
This sounds similar to our internal evaluation. We’re a bit confused by why “3 people in two weeks” is the relevant reference class. We’d argue the costs of Conjecture’s “misses” need to be accounted for, not just their “hits”. Redwood’s team size and budget are comparable to those of Conjecture, so if you think that causal scrubbing is more impressive than Conjecture’s other outputs, then it sounds like you agree with us that Redwood was more impressive than Conjecture (unless you think the Simulators post is head and shoulders above Redwood’s other output)?
Thanks for sharing the data point that this influenced independent researchers. That’s useful to know, and it updates us positively. Are you excited by those independent researchers’ new directions? Is there any output from those researchers you’d suggest we review?
3) We remain confident in our sources regarding Conjecture’s discussions with VCs, although it’s certainly conceivable that Conjecture was more open with some VCs than others. To clarify, we are not claiming that Connor or others at Conjecture did not mention anything about their alignment plans or interest in x-risk to VCs (indeed, this would be a barely tenable position for them given their public discussion of these plans), simply that their pitch gave the impression that Conjecture was primarily focused on developing products. It’s reasonable for you to be skeptical of this if your sources at Conjecture disagree; we would be interested to know how close to the negotiations those staff were, although we understand this may not be something you can share.
4) We think your point is reasonable. We plan to revisit this recommendation and will reply here when we have an update.
5) This certainly depends on what “general industry” refers to: a research engineer at Conjecture might well be better for ML skill-building than, say, being a software engineer at Walmart. But we would expect ML teams at top tech companies, or working with relevant professors, to be significantly better for skill-building. Generally we expect quality of mentorship to be one of the most important components of individuals developing as researchers and engineers. The Conjecture team is stretched thin as a result of rapid scaling, and had few experienced researchers or engineers on staff in the first place. By contrast, ML teams at top tech companies will typically have a much higher fraction of senior researchers and engineers, and professors at leading universities comprise some of the best researchers in the field. We’d be curious to hear your case for Conjecture as skill building; without that it’s hard to identify where our main disagreement lies.
I’ll only briefly reply because I feel like I’ve said most of what I wanted to say.
1) Mostly agree but that feels like part of the point I’m trying to make. Doing good research is really hard, so when you don’t have a decade of past experience it seems more important how you react to early failures than whether you make them.
2) My understanding is that only about 8 people were involved with the public research outputs and not all of them were working on these outputs all the time. So the 1 OOM in contrast to ARC feels more like a 2x-4x.
3) Can’t share.
4) Thank you. Hope my comments helped.
5) I just asked a bunch of people who work(ed) at Conjecture and they said they expect the skill building to be better for a career in alignment than e.g. working with a non-alignment team at Google.
We’ve updated the recommendation about working at Conjecture.
Some clarifications on the comment:
1. I strongly endorse critique of organisations in general and especially within the EA space. I think it’s good that we as a community have the norm to embrace critiques.
2. I personally have my criticisms of Conjecture, and my comment should not be seen as “everything’s great at Conjecture, nothing to see here!”. In fact, my main criticisms, of the leadership style and of CoEm not being the most effective thing they could do, are also represented prominently in this post.
3. I’d also be fine with the authors of this post saying something like “I have a strong feeling that something is fishy at Conjecture, here are the reasons for this feeling”. Or they could also clearly state which things are known and which things are mostly intuitions.
4. However, I think we should really make sure that we say true things when we criticize people, quantify our uncertainty, differentiate between facts and feelings, and not throw our epistemics out the window in the process.
5. My main problem with the post is that they make a list of specific claims with high confidence, and I think that is not warranted given the evidence I’m aware of. That’s all.
FWIW I Control-F’d for “all the insights” and did not see any other hit on this page other than your comment.
EDIT 2023/06/14: Hmm, so I’ve since read all the comments on this post on both the EAF and LW[1], and I don’t think your sentence was an accurate paraphrase for any of the comments on this post?
For context, the most positive comment on this post is probably mine, and astute readers might note that my comment was process-oriented rather than talking about quantity of insights.
(yes, I have extremely poor time management. Why do you ask?)
The comment I was referring to was in fact yours. After re-reading your comment and my statement, I think I misunderstood your comment originally. I thought it was not only praising the process but also the content itself. Sorry about that.
I updated my comment accordingly to indicate my misunderstanding.
The “all the insights” was not meant as a literal quote but as a cynical paraphrase. In hindsight, this was obviously bound to be misunderstood, and I should have phrased it differently.
Thanks for the correction! I also appreciate the polite engagement.
As a quick clarification, I’m not a stickler for exact quotes (though e.g. the APA is), but I do think it’s important for paraphrases to be accurate.
I’ll also endeavor to make my own comments harder to misinterpret going forward, to minimize future misunderstandings.
To be clear, I also appreciate the content of this post, but more because it either brought new information to my attention, or summarized information I was aware of in the same place. (rather than because it offered particularly novel insights).
Could you say a bit more about your statement that “making recommendations such as . . . . ‘alignment people should not join Conjecture’ require an extremely high bar of evidence in my opinion”?
The poster stated that there are “more impactful places to work” and listed a number of them; shouldn’t they say that if they believe it is more likely than not true? They have stated their reasons; the reader can decide whether they are well-supported. The statement that Conjecture seems “relatively weak for skill building” seems supported by reasonable grounds. And the author’s characterization of the likelihood that Conjecture is net-negative is merely “plausible.” That low bar seems hard to argue with; the base rate of for-profit companies without any known special governance safeguards acting like for-profit companies usually do (i.e., in a profit-maximizing manner) is not low.
Maybe we’re getting too much into semantics here, but I would have found a headline like “we believe there are better places to work” much more appropriate for the kind of statement they are making.
1. A blanket unconditional statement like this seems unjustified. Like I said before, if you believe in CoEm, Conjecture probably is the right place to work for.
2. Where does the “relatively weak for skill building” come from? A lot of their research isn’t public, a lot of engineering skills are not very visible from the outside, etc. Why didn’t they just ask the many EA-aligned employees at Conjecture what they thought of the skills they learned? That seems like such an easy way to correct for a potential mischaracterization.
3. Almost all AI alignment organizations are “plausibly” net negative. What if ARC evals underestimates their gain-of-function research? What if Redwood’s advances in interpretability lead to massive capability gains? What if CAIS’s efforts with the letter had backfired and rallied everyone against AI safety? This bar is basically meaningless without expected values.
Does that clarify where my skepticism comes from? Also, once again, my arguments should not be seen as a recommendation for Conjecture. I do agree with many of the criticisms made in the post.