I worked at OpenAI for three years, from 2021 to 2024, on the Alignment team, which eventually became the Superalignment team. I worked on scalable oversight, as part of the team developing critiques as a technique for using language models to spot mistakes in other language models. I then worked to refine an idea from Nick Cammarata into a method for using language models to generate explanations for features in language models. I was then promoted to managing a team of four people working on understanding language model features in context, leading to the release of an open-source “transformer debugger” tool.
I resigned from OpenAI on February 15, 2024.
I wonder what you would get if you offered a cash prize to whoever wrote the “best” criticism of EA, according to some criterion such as the opinion of a panel of specific EAs, or online voting on a forum. Obviously, this has a large potential for selection effects, but it might produce something interesting (either in the winner, or in other submissions that don’t get selected because they are too good).
I would like to note (although I don’t quite know what to do with this information) that the proposed method of gathering feedback leaves out the at least 3 billion people who don’t have internet access. In practice, it’s probably also limited to gathering information from countries/in languages with at least some EA presence already (and mainly English-speaking ones). Now, from an “optimize the spread of EA ideas” perspective, it might be reasonable to focus on wealthier countries to reach people with more leverage (i.e., higher expected earnings), but there are reasons to pay attention to this:
1) It could be very useful to have a population of EAs with backgrounds/lived experience in developing countries, to aid in idea generation for new international development programs.
2) EA might end up not spreading very much to people living in countries like China and India, which will become more economically important in the future.
3) We might end up making a mistake on some philosophically important issue due to biases in the backgrounds of most people in the EA movement. (I don’t have a good example of what this looks like, but there might be, say, system 1 factors arising from the culture where you grow up that influence your position on issues like population ethics.)
I also don’t know how to go about this on the object level, or whether it’s the best place for marginal EA investment right now. (I also think that EA orgs involved in international development will have access to more of these diverse perspectives; my point is that those perspectives aren’t present in the meta-level discussions.)
Object-level suggestion for collecting diverse opinions (for a specific person to look through, to make it easier to see trends): have something like a Google Form where people can report the characteristics of an attempt to bring up EA ideas with a person or audience, along with comments on how the ideas were received. (This thread is a Schelling point now, but won’t remain so in the future.)
When considering a controversial political issue, an EA should also think about whether there are positions to take that differ from those typically presented in the mainstream media. There might be alternatives that EA reasoning opens up which people traditionally avoid because, for example, they stick to deontological reasoning, believe that an act is either right or wrong in all cases, and hold that these restrictions should be codified into law.
For the object-level example raised in the article, the traditional framing is “abortion should be legal” vs. “abortion should be illegal”. Alternatives might include performing other social interventions aimed at reducing the number of abortions within a framework where abortion is legal (e.g., increasing the social support offered to single mothers, so that fewer people choose to have an abortion).
I think if you want people to think about the meta-level, you would be better off with a post that says “suppose you have an argument for abortion” or “suppose you believe this simple argument X for abortion is correct” (where X is obviously a strawman, raised as a hypothetical), and asks “what ought you to do, assuming this belief is true?”. It may also be better to use a less controversial topic.
If you want to start an object-level discussion on abortion (which, if you believe this argument is true, it seems you ought to), it might be helpful to circulate the article you want to use to start the discussion to a few EAs with varying positions on the topic for feedback before posting, because it is on a topic likely to trigger political buttons.
While I don’t think I would actually write a whole post for this, I might have a couple of quick ideas to throw into a comments section. I’d suggest explicitly asking for comments and half-formed ideas in the summary post, and seeing if that produces anything interesting.
As a consideration in favour: there are behaviours in the founder-VC relationship that negatively impact founders (this comes up in http://paulgraham.com/fr.html), such as VCs trying to hold off on committing for as long as possible. EA VCs could try to avoid these to improve the odds of startup success.
As a consideration against: the halo effect might cloud EA investors’ judgement of the odds of success for EA entrepreneurs.
Working in developing-world entrepreneurship also puts you in a good position to spot opportunities for, and carry out, other developing-world entrepreneurship.
If this turns out to be something people find useful, it might also be useful to have people who watch the wiki and provide feedback/advice on the proposed study designs, or who can help people less familiar with study design and statistics to produce something useful. This would provide an additional service alongside the preregistration, so it isn’t just an extra onerous task. (I’d be willing to do this if it seems useful.)
I’m somewhat doubtful that this experiment registry will attract a lot of use, but +1 for setting it up to try it out.
I know someone who would be interested in looking through a list of organizations like this right now (hoping to find places to work).
A couple of examples I’ve run across: DataWind (http://en.wikipedia.org/wiki/DataWind), which is now at a more mature stage; I went to a talk by one of the founders recently. They made a really cheap tablet and internet services that work over 2G, which opens up a market among the large sections of India currently without internet access. I think they could end up being quite successful.
An early-stage example is EyeCheck (http://www.eyechecksolutions.com/), started by a couple of engineers out of undergrad. They’re developing a tool to improve the diagnosis of vision problems, to increase the efficiency of providing glasses (I think they’re starting by working with NGOs that run vision camps).
I have also had negative experiences with career search stuff (more around making decisions). My suggestion, which I’m also going to try, is to find someone who can support you through the career search process: someone you can talk over decisions with, who can look over applications, and who can maybe talk you through the time you spend feeling useless before applying. This could also help keep you from settling for an inferior job, since you’d have to justify the choice to someone else.
I would also suggest, from experience, avoiding committing to a job at a time when you feel really down about yourself. I’ve done that before, and it would have been better to just wait. At least try to wait a few days, talk to some people about it, etc.
(Also, there’s a Facebook group for EAs to help each other with personal issues; it’s the sort of place where you can post this stuff and get advice, and messages are only visible to group members. Message me if you’re interested and not already in it, and I can add you.)