Our laws are the end result of literally thousands of years of experimentation
The distribution of legal cases involving technology over the past 1,000 years is very different from the distribution over the past 10 years. “Law isn’t keeping up with tech” is a common observation nowadays.
a literal random change to the status quo
How about we revise that to “random viable legislation” or something like it? Any legislation pushed by artists will be in the same reference class as the “thousands of years of experimentation” you mention (except more recent, and thus better adapted to current reality).
AI is so radically outside the ordinary reference class of risks that it is truly nothing whatsoever like anything we have ever witnessed or come across before
Either AI will be transformative, in which case this is more or less true, or it won’t be transformative, in which case the regulations matter a lot less.
Suppose that as a result of neo-luddite sentiment, the people hired to oversee AI risks in the government concern themselves only with risks to employment, ignoring what we’d consider to be more pressing concerns.
If we’re involved in current efforts, maybe some of the people hired to oversee AI risks will be EAs. Or maybe we can convert some “neo-luddites” to our point of view.
simply hire right-minded people in the first place
Sounds to me like you’re letting the perfect be the enemy of the good. We don’t have perfect control over what legislation gets passed, including this particular legislation. Odds are decent that the artist lobby succeeds even with our opposition, or that current legislative momentum is better aligned with humanity’s future than any legislative momentum which occurs later. We have to think about the impact of our efforts on the margin, as opposed to thinking of a “President Matthew Barnett” scenario.
On the other hand, I’m quite convinced that, abstractly, it is highly implausible that arbitrarily limiting what data researchers have access to will be positive for alignment.
It could push researchers towards more robust schemes which work with less data.
I want a world where the only way for a company like OpenAI to make ChatGPT commercially useful is to pioneer alignment techniques that will actually work in principle. Throwing data & compute at ChatGPT until it seems aligned, the way OpenAI is doing, seems like a path to ruin.
As an intuition pump, it seems possible to me that a solution for adversarial examples would make GPT work well even when trained on less data. So by making it easy to train GPT on lots of data, we may be letting OpenAI neglect adversarial examples. We want an “alignment overhang” where our alignment techniques are so good that they work even with a small dataset, and become even better when used with a large dataset. (I guess this argument doesn’t work in the specific case of safety problems which only appear with a large dataset, but I’m not sure if there’s anything like that.)
Another note: I’ve had the experience of sharing alignment ideas with OpenAI staff. They responded by saying “what we’re doing seems good enough” / not trying my idea (to my knowledge). Now they’re running into problems which I believe the ideas I shared might’ve solved. I wish they’d focus more on finding a solid approach, and less on throwing data at techniques I view as subpar.