On a macro level, you could consider extreme AI Safety asks followed by moderate asks to be an example of the door-in-the-face technique (which has a psychological basis and seems to have replicated)
Arthur Conmy
CC https://www.lesswrong.com/posts/fqryrxnvpSr5w2dDJ/touch-reality-as-soon-as-possible-when-doing-machine, which expands on “hands-on” experience in alignment.
I don’t know of any writing that directly contradicts these claims. I think https://www.lesswrong.com/s/v55BhXbpJuaExkpcD/p/3pinFH3jerMzAvmza indirectly contradicts these claims as it broadly criticizes most empirical approaches and is more open to conceptual approaches.
For capabilities things, https://dblalock.substack.com/ is pretty good (though I find some of the things the author is very excited about underwhelming).
EDIT: it has weekly quick summaries of papers
There are some recent posts, for example this one, that consist of just the intro and outro (22 seconds long) and are missing the main post. It would be great if this bug could be fixed.
What were/are your basic and relevant questions? What were AIS folks missing?
I liked this post because I’ve been thinking about similar issues recently, but I find some of the conclusions strange. For example, isn’t there a “generalised trolley problem” for any deontologist who asserts that rule X should be followed? Something like:
Aha! So you follow rule X? Well, what if I told you that person over there will violate rule X twice unless you break rule X in the next 5 minutes?
Why is this relevant? I don’t think the deontologist, upon hearing any example of the above, throws up their hands and renounces their theory. I think they add another rule that allows them to violate their former rule*. So more needs to be done to show that the boundary cases for utilitarianism are wild: such cases are not out of the ordinary for deontological ethics either.
* and I see this as about as wild as when the utilitarian doesn’t voluntarily harvest organs because of “societal factors”, and has to add this to their utility function (here: https://www.utilitarianism.net/objections-to-utilitarianism/rights)
This is great and you should make a LW post; these are in a really nice format for shunting around.
As a small nit: any idea why the first few essays of the Codex (https://www.lesswrong.com/codex) are not here?
I think this post provides some pretty useful arguments about the downsides of pausing AI development. I feel noticeably more pessimistic about a pause going well having read this.
However, I don’t agree with some of the arguments about alignment optimism, and think they’re a fair bit weaker:
Sure, we can use RLHF and related techniques to steer AI behavior. And sure, unlike in most cases in biology, ANN updates do act on the whole model without noise, etc.
But the worries about what happens when AIs get predictably harder to evaluate, as they reach superhuman performance on more tasks, are still very real given all of this! You mention scalable oversight research, so it’s clear you are aware that this is an open problem, but I don’t think this post emphasises enough that most alignment work recognises a pretty big difference between aligning subhuman systems and aligning superhuman systems, which limits the optimism you can get from GPT-4 seeming basically aligned. I think it’s possible that with tons of compute and aligned weaker AIs (as you touch upon) we can generalize to an aligned GPT-5, GPT-6, etc. But this feels like a pretty different paradigm from the various analogies to the natural world and the current state of alignment!