"Level 3" is described as "Armed with this ability you can learn not just from your own experience, but from the experience of others—you can identify successful others and imitate them." This doesn't seem like something any ML model does.
This sounds like straightforward transfer learning (TL) or fine-tuning, both common by 2017.
So you could just write 15 lines of Python that shop around among a set of pretrained weights and see how each performs. TL is often many times (even 1000x) faster than starting from random weights and needs only a few examples.
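Something like this, as a rough sketch (the torchvision checkpoint names and the probe set are my own assumptions; in practice you'd probably fit a small linear head per candidate rather than scoring the frozen classifier directly):

```python
# Hypothetical "checkpoint shopping": score each pretrained model on a
# tiny probe set and keep the best. Names and dataset are illustrative.
import torch
import torchvision.models as models

CANDIDATES = ["resnet18", "resnet50", "mobilenet_v3_small"]

def few_shot_score(model, examples, labels):
    """Accuracy of the frozen model on a handful of labeled examples."""
    model.eval()
    with torch.no_grad():
        preds = model(examples).argmax(dim=1)
    return (preds == labels).float().mean().item()

def shop(examples, labels):
    best_name, best_score = None, -1.0
    for name in CANDIDATES:
        model = models.get_model(name, weights="DEFAULT")  # pretrained
        score = few_shot_score(model, examples, labels)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```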
As speculation: it seems like in one of the agent simulations you could just have agents grab other agents' weights or layers and try them out strategically (when they detect an impasse, a new environment, or something like that). There is an analogy to biology, where species alternate between asexual and sexual reproduction, and trading of genetic material occurs during periods of adversity. (This is trivial; I'm sure a second-year student has written a lot more.)
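A toy version of the grab-a-peer's-layer move might look like this (the Agent class, the impasse test, and the swap rule are all invented here for illustration; note it assumes the agents share an architecture, which matters below):

```python
# Speculative sketch: at an impasse, borrow one named layer from a peer
# and keep it only if validation improves. Assumes peers share an
# architecture so layer names line up.
import copy

class Agent:
    def __init__(self, model):
        self.model = model    # a torch.nn.Module with named sub-layers
        self.history = []     # recent validation losses, most recent last

    def at_impasse(self, window=5, eps=1e-3):
        """Crude impasse test: loss hasn't improved over the last few evals."""
        h = self.history[-window:]
        return len(h) == window and (h[0] - h[-1]) < eps

    def try_peer_layer(self, peer, layer_name, evaluate):
        """Swap in a peer's layer; revert the swap unless the loss drops."""
        mine = getattr(self.model, layer_name)
        backup = copy.deepcopy(mine.state_dict())
        mine.load_state_dict(getattr(peer.model, layer_name).state_dict())
        if evaluate(self.model) >= self.history[-1]:  # no improvement
            mine.load_state_dict(backup)
```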
This doesn't seem to fit any sort of agent framework or improve agency, though. It just makes you train faster.
Eh, there does seem to be a connection to interpretability.
For example, if the ML architecture were modular, categorized, or otherwise legible to the agents, they could swap weights or models more quickly and effectively.
So there might be some way for legibility to emerge by selection pressure, in an environment where, say, agents had limited capacity to store weights or data and had to constantly and extensively share weights with each other. You could imagine teams of agents surviving and proliferating because a shared architecture let them pass this data around fluently in the form of weights (a toy version below).
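The "shared architecture" convention could be as dumb as every agent's model being a dict with the same standardized keys, so transfer is just copying state by name (the schema and key names here are made up):

```python
# Toy "shared architecture": every agent's model is a ModuleDict with
# standardized keys, so passing capability around is just copying state
# dicts by name. The schema is invented for illustration.
import torch.nn as nn

SCHEMA = {
    "encoder": lambda: nn.Linear(32, 64),
    "policy":  lambda: nn.Linear(64, 4),
}

def make_model():
    return nn.ModuleDict({key: build() for key, build in SCHEMA.items()})

def share(sender, receiver, key):
    """Legible transfer: both sides agree on what `key` means."""
    receiver[key].load_state_dict(sender[key].state_dict())
```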
To keep the transmission mechanism itself from getting crazy baroque, you could regularize it, e.g., penalize the complexity of whatever gets shared.
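One crude way to do that, assuming an L1 penalty is a reasonable stand-in for "description length" (both the penalty choice and the `shared_encoder` key are my assumptions):

```python
# Hypothetical regularizer: penalize the magnitude of the shared module's
# weights so the transmitted interface stays simple/sparse. Expects a
# torch.nn.Module (e.g., one entry of the ModuleDict above).
def transmission_penalty(shared_module, lam=1e-4):
    """L1 penalty on shared-module weights, a cheap proxy for complexity."""
    return lam * sum(p.abs().sum() for p in shared_module.parameters())

# In a training loop, hypothetically:
# loss = task_loss + transmission_penalty(agent.model["shared_encoder"])
```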
I'm 90% sure this is a shower thought, but it can't be worse than "The Great Reflection."