What might an empirical investigation of alignment look like? Can you illustrate with an example?
Both Redwood and Anthropic have labs and do empirical work. This is also an example of experimental work: https://twitter.com/Karolis_Ram/status/1540301041769529346
Current theme: default
Less Wrong (text)
Less Wrong (link)
Arrow keys: Next/previous image
Escape or click: Hide zoomed image
Space bar: Reset image size & position
Scroll to zoom in/out
(When zoomed in, drag to pan; double-click to close)
Keys shown in yellow (e.g., ]) are accesskeys, and require a browser-specific modifier key (or keys).
]
Keys shown in grey (e.g., ?) do not require any modifier keys.
?
Esc
h
f
a
m
v
c
r
q
t
u
o
,
.
/
s
n
e
;
Enter
[
\
k
i
l
=
-
0
′
1
2
3
4
5
6
7
8
9
→
↓
←
↑
Space
x
z
`
g
What might an empirical investigation of alignment look like? Can you illustrate with an example?
Both Redwood and Anthropic have labs and do empirical work. This is also an example of experimental work: https://twitter.com/Karolis_Ram/status/1540301041769529346