This other Ryan Greenblatt is my old account[1]. Here is my LW account.
- ^
Account lost to the mists of time and expired university email addresses.
This is an old thread, but I’d like to confirm that a high fraction of my motivation for being vegan[1] is signaling to others and myself. (So, n=1 for this claim.) (A reasonable fraction of my motivation is more deontological.)
I eat fish rarely, as I was convinced that the case for this improving productivity is sufficiently strong.
I suppose the complement to the naive thing I said before is “80k needs a compelling reason to recruit people to EA, and needs EA to be compelling to the people it recruits; by doing an excellent job at some object-level work, you can grow the value of 80k’s recruiting, both by making it easier to do and by making the outcome more valuable. Perhaps this might be even better for recruiting than doing recruiting directly.”
I think there are a bunch of meta effects from working in an object level job:
The object level work makes people more likely to enter the field, as you note. (Though this doesn’t just route through 80k; it goes through a bunch of mechanisms.)
You’ll probably have some conversations with people considering entering the field from a slightly more credible position at least if the object level stuff goes well.
Part of the work will likely involve fleshing stuff out so people with less context can more easily join/contribute. (True for most / many jobs.)
I think people wouldn’t normally consider it Pascalian to enter a positive-total-returns lottery with a 1 / 20,000 (50 per million) chance of winning?
And people don’t consider it to be Pascalian to vote, to fight in a war, or to advocate for difficult-to-pass policy that might reduce the chance of nuclear war?
Maybe you have a different-than-typical perspective on what it means for something to be Pascalian?
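To make the lottery point concrete, here is a minimal sketch with made-up numbers (the payoff and cost are purely illustrative, not an estimate of any real intervention): a 1 / 20,000 chance of winning is a perfectly ordinary bet if the prize is large enough relative to the cost.

```python
# Illustrative only: the payoff and cost are made-up numbers.
def expected_value(p_win: float, payoff: float, cost: float) -> float:
    """Expected net value of entering a lottery with win probability p_win."""
    return p_win * payoff - cost

# A 1 / 20,000 chance at a prize worth 100,000x the entry cost is a clearly
# positive expected value bet, and most people wouldn't call it Pascalian.
print(expected_value(p_win=1 / 20_000, payoff=100_000, cost=1))  # 4.0
```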
I agree that it is a poor analogy for AI risk. However, I do think it is a semi-reasonable intuition pump for why AIs that are very superhuman would be an existential problem if misaligned (and without other serious countermeasures).
I think that the political activation of Silicon Valley is the sort of thing which could reshape American politics, and that twitter is a leading indicator.
I don’t disagree with this statement, but also think the original comment is reading into twitter way too much.
I haven’t seen those comments.
Scroll down to see comments.
Once again, if you disagree, I’d love to actually hear why.
I think you’re reading into twitter way too much.
absence of evidence of good arguments against it is evidence of the absence of said arguments. (tl;dr—AI Safety people, engage with 1a3orn more!)
There are many (edit: 2) comments responding and offering to talk. 1a3orn doesn’t appear to have replied to any of these comments. (To be clear, I’m not saying they’re under any obligation here, just that there isn’t an absence of attempted engagement, and thus you shouldn’t update in the direction you seem to be updating here.)
The limited duty exemption has been removed from the bill which probably makes compliance notably more expensive while not improving safety. (As far as I can tell.)
This seems unfortunate.
I think you should still be able to proceed in a somewhat reasonable way by making a safety case on the basis of insufficient capability, but there are still additional costs associated with not getting an exemption.
Further, you can’t just claim an exemption prior to starting training if you are behind the frontier, which will substantially increase the costs for some actors.
This makes me more uncertain about whether the bill is good, though I think it will probably still be net positive and basically reasonable on the object level. (Though we’ll see about further amendments, enforcement, and the response from society...)
(LW x-post)
I agree that these models assume something like “large discontinuous algorithmic breakthroughs aren’t needed to reach AGI”.
(But incremental advances which are ultimately quite large in aggregate and which broadly follow long running trends are consistent.)
However, I interpreted “current paradigm + scale” in the original post as “the current paradigm of scaling up LLMs and semi-supervised pretraining”. (E.g., not accounting for totally new RL schemes or wildly different architectures trained with different learning algorithms, which I think are accounted for in this model.)
Both AI doomers and accelerationists will come out looking silly, but will both argue that we are only an algorithmic improvement away from godlike AGI.
A common view is a median around 2035-2050 with substantial (e.g. 25%) mass in the next 6 years or so.
This view is consistent with both thinking:
LLM progress is likely (>50%) to stall out.
LLMs are plausibly going to quickly scale into very powerful AI.
(This is pretty similar to my view.)
I don’t think many people think “we are only an algorithmic improvement away from godlike AGI”. In fact, I can’t think of anyone who thinks this. Some people think that 1 substantial algorithmic advance + continued scaling/general algorithmic improvement might suffice, but the continuation of other improvements is key.
Yes, I meant central to me personally, edited the comment to clarify.
I basically agree with this with some caveats. (Despite writing a post discussing AI welfare interventions.)
I discuss related topics here, including what fraction of resources should go to AI welfare. (A section in the same post I link above.)
The main caveats to my agreement are:
From a deontology-style perspective, I think there is a pretty good case for trying to do something reasonable on AI welfare. Minimally, we should try to make sure that AIs consent to their current overall situation insofar as they are capable of consenting. I don’t put a huge amount of weight on deontology, but enough to care a bit.
As you discuss in the sibling comment, I think various interventions like paying AIs (and making sure AIs are happy with their situation) to reduce takeover risk are potentially compelling, and they are very similar to AI welfare interventions. I also think there is a weak decision theory case that blends in with the deontology case from the prior bullet.
I think that there is a non-trivial chance that AI welfare is a big and important field at the point when AIs are powerful, regardless of whether I push for such a field to exist. In general, I would prefer that important fields related to AI have better, more thoughtful views. (Not with any specific theory of change, just a general heuristic.)
My impression is these arguments are important to very few AI-welfare-prioritizers
FWIW, these motivations seem reasonably central to me personally, though not my only motivations.
You might also be interested in discussion here.
You might be interested in discussion here.
We know now that a) your results aren’t technically SOTA
I think my results are probably SOTA based on more recent updates.
It’s not an LLM solution, it’s an LLM + your scaffolding + program search, and I think that’s importantly not the same thing.
I feel like this is a pretty strange way to draw the line about what counts as an “LLM solution”.
Consider the following simplified dialogue as an example of why I don’t think this is a natural place to draw the line:
Human skeptic: Humans don’t exhibit real intelligence. You see, they’ll never do something as impressive as sending a human to the moon.
Humans-have-some-intelligence advocate: Didn’t humans go to the moon in 1969?
Human skeptic: That wasn’t humans sending someone to the moon; that was Humans + Culture + Organizations + Science sending someone to the moon! You see, humans don’t exhibit real intelligence!
Humans-have-some-intelligence advocate: … Ok, but do you agree that if we removed the Humans from the overall approach it wouldn’t work?
Human skeptic: Yes, but same with the culture and organization!
Humans-have-some-intelligence advocate: Sure, I guess. I’m happy to just call it humans+etc. Do you have any predictions for specific technical feats which are possible to do with a reasonable amount of intelligence that you’re confident can’t be accomplished by building some relatively straightforward organization on top of a bunch of smart humans within the next 15 years?
Human skeptic: No.
Of course, I think actual LLM skeptics often don’t answer “No” to the last question. They often do have something that they think is unlikely to occur with a relatively straightforward scaffold on top of an LLM (a model descended from the current LLM paradigm, perhaps trained with semi-supervised learning and RLHF).
I actually don’t know what in particular Chollet thinks is unlikely here. E.g., I don’t know if he has strong views about how my method would perform if run with the SOTA multimodal model in 2 years.
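For readers unfamiliar with what “LLM + scaffolding + program search” is pointing at here, a minimal sketch of the general approach (the helper `sample_program_from_llm` is a hypothetical placeholder for the prompting and parsing step, not my actual implementation): sample many candidate programs from the model, keep the ones that reproduce every training example, and apply a surviving program to the test input.

```python
# Minimal sketch of an "LLM + scaffolding + program search" loop for ARC-style tasks.
# sample_program_from_llm is a hypothetical placeholder, not the actual implementation.
from typing import Callable, List, Optional, Tuple

Grid = List[List[int]]

def sample_program_from_llm(train_pairs: List[Tuple[Grid, Grid]]) -> Callable[[Grid], Grid]:
    """Placeholder: prompt an LLM with the training pairs and parse out a Python transform."""
    raise NotImplementedError("stand-in for the LLM call and program parsing")

def solve_task(train_pairs: List[Tuple[Grid, Grid]], test_input: Grid,
               n_samples: int = 128) -> Optional[Grid]:
    """Sample candidate programs; keep those that reproduce every training pair."""
    survivors = []
    for _ in range(n_samples):
        try:
            program = sample_program_from_llm(train_pairs)
            if all(program(x) == y for x, y in train_pairs):
                survivors.append(program)
        except Exception:
            continue  # discard programs that crash or fail to parse
    return survivors[0](test_input) if survivors else None
```

The point of the dialogue above is that the LLM is doing the load-bearing work in a loop like this, in roughly the same way the humans are doing the load-bearing work in “Humans + Culture + Organizations + Science”.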
Tom Davidson’s model is often referred to in the Community, but it is entirely reliant on the current paradigm + scale reaching AGI.
This seems wrong.
It does use constants estimated from the history of the deep learning field to provide guesses for parameters, and it assumes that compute is an important driver of AI progress.
These are much weaker assumptions than you seem to be implying.
Note also that this work is based on earlier work like bio anchors which was done just as the current paradigm and scaling were being established. (It was published in the same year as Kaplan et al.)
But it won’t do anything until you ask it to generate a token. At least, that’s my intuition.
I think this seems like mostly a fallacy. (I feel like there should be a post explaining this somewhere.)
Here is an alternative version of what you said to indicate why I don’t think this is a very interesting claim:
Sure, you can have a very smart quadriplegic who is very knowledgeable. But they won’t do anything until you let them control some actuator.
If your view is that “prediction won’t result in intelligence”, fair enough, though it’s notable that the human brain seems to heavily utilize prediction objectives.
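As a toy illustration of why “it won’t do anything until you ask it to generate a token” isn’t much of a constraint: any trivial loop that feeds generated tokens to actuators turns a pure next-token predictor into something that acts. The helpers `query_model` and `execute_action` below are hypothetical placeholders, not any particular system.

```python
# Toy illustration: a next-token predictor becomes an actor once a simple loop
# routes its outputs to actuators. Both helpers are hypothetical placeholders.

def query_model(transcript: str) -> str:
    """Placeholder for sampling the model's next message given the transcript so far."""
    raise NotImplementedError

def execute_action(command: str) -> str:
    """Placeholder for running a parsed command (shell, API, robot, ...) and returning its output."""
    raise NotImplementedError

def agent_loop(goal: str, max_steps: int = 10) -> str:
    transcript = f"Goal: {goal}\n"
    for _ in range(max_steps):
        message = query_model(transcript)             # the model "only" generates tokens here...
        transcript += message + "\n"
        if message.startswith("DONE"):
            break
        transcript += execute_action(message) + "\n"  # ...but the loop turns those tokens into actions
    return transcript
```

The quadriplegic analogy is making the same point: the missing piece is an actuator hookup, not intelligence.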
I don’t think non-myopia is required to prevent jailbreaks. A model can in principle not care about the effects of training on it and not care about longer term outcomes while still implementing a policy that refuses harmful queries.
I think we should want models to be quite deontological about corrigibility.
This isn’t responding to the overall point, and I agree that by default there is some tradeoff (in current personas) unless you go out of your way to avoid it.
(And, I don’t think training your model to seem myopic and corrigible necessarily suffices as it could just be faked!)