I work primarily on AI Alignment. My main direction at the moment is to accelerate alignment work via language models and interpretability.
jacquesthibs
The Quillette founder seems to be planning to write an article regarding EA’s impact on tech:
“If anyone with insider knowledge wants to write about the impact of Effective Altruism in the technology industry please get in touch with me claire@quillette.com. We pay our writers and can protect authors’ anonymity if desired.”
It would probably be impactful if someone in the know provided a counterbalance to whoever will undoubtedly email her to disparage EA with half-truths/lies.
To share another perspective: As an independent alignment researcher, I also feel really conflicted. I could be making several multiples of my salary if my focus was to get a role on an alignment team at an AGI lab. My other option would be building startups trying to hit it big and providing more funding to what I think is needed.
Like, I could say, “well, I’m already working directly on something and taking a big pay cut, so I shouldn’t need to donate close to 10%”, but something about that doesn’t feel right… But then, to counterbalance that, I’m constantly worried that I just won’t get funding anymore at some point and will need money to pay for expenses during a transition.
I’ve also started working on a repo in order to make Community Notes more efficient by using LLMs.
Don’t forget that we train language models on the internet! The more truthful your dataset is, the more truthful the models will be! Let’s revamp the internet for truthfulness, and we’ll subsequently improve truthfulness in our AI systems!!
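To give a concrete (and purely illustrative) sense of the kind of thing I have in mind, here’s a minimal sketch in Python of an LLM drafting a candidate note for human raters to review. This is not what the repo actually does; the prompt, model name, and function are hypothetical, and it assumes the openai Python client with an API key in the environment.

```python
# Illustrative sketch only: have an LLM draft a candidate Community Note
# for a post, to be reviewed and rated by human contributors afterwards.
# Assumes the `openai` Python client and an OPENAI_API_KEY in the environment;
# the prompt and model choice are placeholders, not what the repo actually uses.
from openai import OpenAI

client = OpenAI()

def draft_candidate_note(post_text: str, source_snippets: list[str]) -> str:
    """Draft a short, sourced candidate note for human raters to evaluate."""
    sources = "\n".join(f"- {s}" for s in source_snippets)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {
                "role": "system",
                "content": (
                    "You draft concise, neutral Community Notes. "
                    "Only make claims supported by the provided sources, "
                    "and cite which source supports each claim."
                ),
            },
            {
                "role": "user",
                "content": f"Post:\n{post_text}\n\nSources:\n{sources}\n\nDraft a candidate note.",
            },
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(draft_candidate_note(
        "Drinking seawater is a safe way to stay hydrated.",
        ["Public health guidance: drinking seawater causes dehydration and can be fatal."],
    ))
```

The important design choice, in my view, is keeping humans in the loop: the model only drafts and cites; contributors still rate the notes and decide what ships.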
I shared a tweet about it here: https://x.com/JacquesThibs/status/1724492016254341208?s=20
Consider liking and retweeting it if you think this is impactful. I’d like it to get into the hands of the right people.
If you work at a social media website or YouTube (or know anyone who does), please read the text below:
Community Notes is one of the best features to come out on social media apps in a long time. The code is even open source. Why haven’t other social media websites picked it up yet? If they care about truth, this would be a considerable step forward. Notes like “this video is funded by x nation” or “this video talks about health info; go here to learn more” are simply not good enough.
If you work at companies like YouTube or know someone who does, let’s figure out who we need to talk to to make it happen. Naïvely, you could spend a weekend DMing a bunch of employees (PMs, engineers) at various social media websites in order to persuade them that this is worth their time and probably the biggest impact they could have in their entire career.
If you have any connections, let me know. We can also set up a doc of messages to send in order to come up with a persuasive DM.
An attempt to explain why I think AI systems are not the same thing as a library card when it comes to bio-risk.
To focus on a less extreme example, I’ll be ignoring the case where AI can create new, more powerful pathogens faster than we can create defences, though I think this is an important case (some people just don’t find it plausible because it relies on the assumption that AIs will be able to create new knowledge).
I think AI Safety people should make more of an effort to walk through the threat model, so I’ll give a quick first attempt:
1) Library. If I’m a terrorist and I want to build a bioweapon, I have to spend several months reading books at minimum to understand how it all works. I don’t have any experts on-hand to explain how to do it step-by-step. I have to figure out which books to read and in what sequence. I have to look up external sources to figure out where I can buy specific materials.
Then, I have to somehow find out how to gain access to those materials (this is the most difficult part in each case). Once I gain access to the materials, I still need to figure out how to make things work as a total noob at creating bioweapons. I will fail. Even experts fail. So, it will take many tries to get it right, and even then, there are tricks of the trade I’ll likely be unaware of no matter which books I read. Either it’s not in a book at all, or it’s so hard to find that I’ll basically never find it.
All this while needing a high enough degree of intelligence and competence.
2) AI agent system. You pull up your computer and ask for a synthesized, step-by-step plan on how to cause the most death or cripple your enemy. Many agents search through books and the internet while also using latent knowledge about the subject. It tells you everything you truly need to know in a concise 4-page document.
Relevant theory, practical steps (laid out with images and videos showing how to do it), what to buy and where/how to buy it, pre-empting any questions you may have, explaining the jargon in a way that is understandable to nearly anyone; it can even take actions on the web to automatically buy all the supplies you need, and so on.
You can even share photos of the entire process to your AI as it continues to guide you through the creation of the weapon because it’s multi-modal.
You can basically outsource all cognition to the AI system, allowing you to be the lazy human you are (we all know that humans will take the path of least resistance or abandon something altogether if there is enough friction).
That topic you always said you wanted to know more about but never got around to it? No worries, your AI system has lowered the bar sufficiently that the task doesn’t seem as daunting anymore and laziness won’t be in the way of you making progress.
Conclusion: a future AI system will have the power of efficiency (significantly faster) and capability (able to make more powerful weapons than any one person could make on their own). It has the interactivity that Google and libraries don’t have. It’s just not the same as information scattered across different sources.
I’m working on an ultimate productivity doc that I plan to share, aimed specifically at making things easy for alignment researchers.
Let me know if you have any comments or suggestions as I work on it.
Roam Research link for an easier reading experience.
Google Docs link in case you want to leave comments there.
From what I understand, Amazon does not get a board seat for this investment. Figured that should be highlighted. Seems like Amazon just gets to use Anthropic’s models and maybe make back their investment later on. Am I understanding this correctly?
As part of the investment, Amazon will take a minority stake in Anthropic. Our corporate governance structure remains unchanged, with the Long Term Benefit Trust continuing to guide Anthropic in accordance with our Responsible Scaling Policy. As outlined in this policy, we will conduct pre-deployment tests of new models to help us manage the risks of increasingly capable AI systems.
I would, however, not downplay their talent density.
Fantastic news. Note: don’t forget to share it on LessWrong too.
Thanks for sharing. I think the above are examples of things people often don’t think of when trying new ways to be more productive. Instead, the default is trying out new productivity tools and systems (which might also help!). Environment and being in a flux period can totally change your behaviour in the long term; sometimes, it’s the only way to create lasting change.
When I was first looking into being veg^n, I became irritated by the inflated reviews at veg^n restaurants. It didn’t take me long to apply a veg^n tax: I started assuming a restaurant’s food was one star below its listed average. It made me more distrustful of veg^ns too.
I think using virtue ethics is the right call here: just be truthful.
Perfect, thanks!
Is someone planning on doing an overview post of all the AI Pause discussion? I’m guessing some people would appreciate it if someone took the time to make an unbiased synthesis of the posts and discussions.
Are you or any other EA lawyer still doing this?
Either way, I’m seeking advice to figure out how I can save money on taxes once I move to the UK (I’m from Canada) and receive funding for my independent AI Safety research. I’ll be going to the UK on a Youth Mobility visa. I’m wondering if it’s possible for me to set something up so that I can save tax on ‘business’ expenses (office space, laptop, monitor, etc.).
I’m happy to pay if someone can help with this (otherwise I will reach out to non-EA lawyers).
Would newer people find it valuable to have some kind of 80,000 Hours career chatbot that had access to the career guide, podcast notes, EA Forum posts, job postings, etc., and then answered career questions? I’m curious whether it could be designed to be better than a raw read of the career guide, or at least a useful add-on to it.
Potential features:
It could collect your conversation and convert most of it into an application for a (human) 1-on-1 meeting.
You could have a speech-to-text option to ramble all the things you’ve been thinking of.
???
If anyone from 80k is reading this, I’d be happy to build this as a paid project.
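As a very rough illustration of what I have in mind, here’s a minimal retrieval-augmented sketch in Python: embed a handful of 80k-style documents, pull the most relevant ones for a question, and answer grounded in that context. All document text, model names, and functions here are placeholders I made up for the sketch, and it assumes the openai Python client plus numpy.

```python
# Illustrative sketch of a retrieval-augmented career chatbot: embed 80k-style
# documents (career guide sections, podcast notes, job postings), retrieve the
# most relevant ones for a question, and answer grounded in that context.
# Assumes the `openai` Python client and numpy; documents and model names are placeholders.
import numpy as np
from openai import OpenAI

client = OpenAI()

DOCS = [
    "Career guide: personal fit matters as much as the impact of the role itself.",
    "Podcast notes: many people overestimate how locked-in their first job is.",
    "Job board: an AI governance fellowship is open to early-career applicants.",
]

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

def answer(question: str, top_k: int = 2) -> str:
    # Embeddings are recomputed on every call purely for simplicity of the sketch;
    # a real version would chunk the corpus and cache/store the vectors.
    doc_vecs = embed(DOCS)
    q_vec = embed([question])[0]
    # Cosine similarity between the question and each document.
    sims = doc_vecs @ q_vec / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
    context = "\n".join(DOCS[i] for i in np.argsort(sims)[::-1][:top_k])
    chat = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder
        messages=[
            {"role": "system", "content": "Answer career questions using only the provided context; say so if the context is insufficient."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return chat.choices[0].message.content

if __name__ == "__main__":
    print(answer("How much should I weigh personal fit when choosing a role?"))
```

In practice you’d plug in the actual 80k content and add the conversation-to-application and speech-to-text features on top, but the shape of the system would be roughly this.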
It would be great to have someone who is exceptional at convincing high-net-worth individuals to donate to specific causes. I’m sure some people in the AI Safety community would find that valuable given the large funding gap despite the exceptional amount of attention the field is receiving. I’m sure other cause areas would also find it valuable.
EDIT: I’ve gotten a few disagree-votes, which is totally fine! Though, I’m curious why some people disagree. Is it because they wouldn’t find this interesting, because they don’t think it would be appropriate for the podcast, or something else?
Love this idea, thanks for organizing this.
My current speculation as to what is happening at OpenAI
How do we know this wasn’t their best opportunity to strike if Sam was indeed not being totally honest with the board?
Let’s say the rumours are true, that Sam is building out external orgs (NVIDIA competitor and iPhone-like competitor) to escape the power of the board and potentially go against the charter. Would this ‘conflict of interest’ be enough? If you take that story forward, it sounds more and more like he was setting up AGI to be run by external companies, using OpenAI as a fundraising bargaining chip, and having a significant financial interest in plugging AGI into those outside orgs.
So, if we think about this strategically, how long should they wait as board members who are trying to uphold the charter?
On top of this, it seems (according to Sam) that OpenAI has made a significant transformer-level breakthrough recently, which implies a significant capability jump. Long-term reasoning? Basically, anything short of ‘coming up with novel insights in physics’ is on the table, given that Sam recently used that as the bar we need to cross to get to AGI.
So, it could be a mix of, Ilya thinking they have achieved AGI while Sam places a higher bar (internal communication disagreements) + the board not being alerted (maybe more than once) about what Sam is doing, e.g. fundraising for both OpenAI and the orgs he wants to connect AGI to + new board members who are more willing to let Sam and GDB do what they want being added soon (another rumour I’ve heard) + ???. Basically, perhaps they saw this as their final opportunity to have any veto on actions like this.
Here’s what I currently believe:
There is a GPT-5-like model that already exists. It could be GPT-4.5 or something else, but another significant capability jump. Potentially even a system that can coherently pursue goals for months, capable of continual learning, and effectively able to automate like 10% of the workforce (if they wanted to).
As of 5 PM Sunday (PT), the board is in a terrible position: either they stay on the board and the company’s employees all move to a new company, or they leave the board and bring Sam back. If they leave, they need to say that Sam did nothing wrong and sweep everything under the rug (they could potentially face legal action for saying he did something wrong); otherwise, Sam won’t come back.
Sam is building companies externally; it is unclear if this goes against the charter, but he does now have a significant financial incentive to speed up AI development. Adam D’Angelo has said that, as part of his time on the board, he would like to prevent OpenAI from becoming a big tech company, because AGI is too important for humanity. They might have seen Sam’s actions as going in that direction.
A few people left the board in the past year. It’s possible that Sam and GDB planned to add new people (possibly even change current board members) to the board to dilute the voting power a bit or at least refill board seats. This meant that the current board had limited time until their voting power would become less important. They might have felt rushed.
The board is either not speaking publicly because 1) they can’t share information about GPT-5, 2) there is some legal reason that I don’t understand (more likely), or 3) they are incompetent (least likely by far IMO).
We will possibly never find out what happened, or it will become clearer by the month as new things come out (companies and models). However, it seems possible the board will never say or admit anything publicly at this point.
Lastly, we still don’t know why the board decided to fire Sam. It could be any of the reasons above, a mix of them, or something we just don’t know about.
Other possible things:
Ilya was mad that they wouldn’t actually get enough compute for Superalignment as promised due to GPTs and other products using up all the GPUs.
Ilya is frustrated that Sam is focused on things like GPTs rather than the ultimate goal of AGI.