Can you write about cross-pollination between technical safety and AI governance/policy? In the case of the new governance mechanisms role (zeroing in on proof-of-learning and other monitoring schemes), it seems like bridging or straddling the two teams is important.
Yes. In general I’d love to see him try harder to actually write out a tenure overhaul. He seems to stay in a pop-sci / blogger register instead of designing a new protocol / incentive scheme and shopping it around at metascience conferences.
I’m generally in favor of dismissing sneer/dunk culture (most of the time it comes up), but I think the conflict-of-interest concern about OpenPhil’s correlated investments, board seats, and marriages is a very reasonable thing to raise and is not sneer/dunk material. I get the sense from what’s been written publicly that OpenPhil has tried its best not to manipulate Horizon fellows toward parochial or selfish gains for senior OpenPhil staff, but I don’t think people who are less trusting than me about this are inherently acting in bad faith.
In an “isolated demand for rigor” sense it may turn back into opportunism or sneer/dunk. I rather doubt that any industry could do anything, ever, without a little corruption, or at least a considerable risk of corruption, especially a new industry. (My 70% hunch is that if someone made an honest attempt to learn about the reference class of corporations and foundations wining and dining people on the Hill and consulting on legislation, the risks from Horizon in particular would not look unusually dicey; I’d love for someone to update me in either direction.) But we already know that we can’t trust rhetorical strategies in environments like this.
Yeah, there’s an old Marxist take along the lines of “religion is a fake proxy battle for eminent domain and ‘separate but equal’ style segregation” that I always found compelling. I can’t imagine it’s 100% true, but Yovel implies it’s 100% false.
Thanks guys! I support what you were saying about application timeliness, expectation management, etc. This seems like a super reasonable set of norms.
I overall thought you crushed it: no notes, etc. My literal only grievance was that there were a bajillion forlorn nametags on the table for people I specifically had important world-saving business to check in with, so the no-shows definitely lowered my productivity at the conference. That said, I ended up being surprised by great discussions that popped up in spite of missing the ones I had hoped for, so I’m not complaining.
I was one of the “not supposed to be on that coast that weekend” people who had a bunch of stuff fall apart / come together at the last minute; it was literally the Wednesday or Thursday of the week itself before I was confident I would not be tied up in California. So I’m wondering: should I have applied on time with an annotated application saying “I’m 90% sure I can’t make it, but it’ll be easier for me to be in the system already if we end up in the 10% world”? At that point it becomes important for me to update the team so I don’t impose costs like the forlorn nametags or anything else discussed in the post, but those updates themselves increase the overall volume of comms / things to keep track of for both me and the staff, which is also a cost.
Even if my particular case is too extreme and unusual to apply to others, I hope norms or habits get formed in the territory of “trying to be thoughtful at all”, because it sounds like we’re at the stage where we only have to be directionally correct.
I’m inferring that the kinds of reasons Kurt had for not talking about this as much (within, say, earshot of me) before now are the exact same kinds of reasons people are intimidated overall / feel it’s too scary not to use burner accounts for things.
Kurt’s comment is shifting my thinking a lot.
I take it as a kind of “what do known incentives do and neglect to do?” question: when I say “default” I mean “without philanthropic pressure” or “well-aligned with making someone rich”. Of course, a lot of this depends on my background understanding of public-private partnerships through the history of innovation (something I’m liable to be wrong about).
The standard Venn diagram of focused research organizations (https://fas.org/publication/focused-research-organizations-a-new-model-for-scientific-research/) gives a more detailed, less clumsy view along the same lines, but the point is still “there are blindspots that we don’t know how to incentivize”.
It’s certainly true that many parts of almost every characterization/definition of “alignment” can simply be offloaded to capitalism, but I think there are a bajillion reasonable and defensible views about which parts those are, whether they’re hard, whether they may be discovered in an inconvenient order, etc.
Huh. When I was in Singapore I felt like I was getting a deeper view of Chinese cuisine than any knowledge I had acquired in the States, but I still didn’t get into game-changingly new ways of viewing tofu in particular.
OpenPhil did some lost-wages support after the FTXsplosion, but I think it was evaluated case by case and some people may have been left behind.
Slightly conflicted agree-vote: your model here offloads so much to judgment calls that fall on people who are vulnerable to perverse incentives (e.g., alignment/capabilities as a binary distinction is a bad frame, but it seems like anyone who’d be unusually well suited to thinking clearly about its alternatives would make more money and have a less stressful life if their beliefs fall certain ways rather than others).
Other than that, I’m aware that no one’s really happy about the way they trade off “you could Copenhagen-ethics your way out of literally any action in the limit” against “saying the counterfactual a-hole would do it worse if I didn’t is not a good argument”. It seems like a law-of-opposite-advice situation, maybe? As in, some people in the blasé / unilateral / power-hungry camp could stand to be nudged one way, and some people in the scrupulous camp could stand to be nudged the other.
It also matters that the “oppose carbon capture or nuclear energy because it might make people feel better without solving the ‘real problem’” environmentalists have very low standards even when you condition on them being environmentalists. That doesn’t mean they can’t be memetically adaptive and then influential, but it might matter tactically (i.e. you have a messaging problem instead of a more virtuous actually-trying-to-think-clearly problem).
There would be some UX ways to make community clout feel lower status than the other clout. I agree with you that having community clout signals more investment and should be preferred over a new account, which for all you know is a drive-by dunk/sneer after wandering in from Twitter.
I’ll cc this to my feature request in the proper thread.
Michael Lewis wouldn’t do it as a gotcha/sneer, but this is a reason I’ll be upset if Adam McKay ends up with the movie.
(I forgot to tell JP and Lizka at EAG in NY a few weeks ago, but now’s as good a time as any):
Can my profile karma total be two numbers, one for community posts and one for everything else? I don’t want a reader to think my actual work is valuable to people in proportion to my EA Forum karma; as far as I can tell, roughly 3-5x as much of my karma is community-sourced as comes from my object-level posts. People should look at my profile and think “this guy procrastinates through PVP on social media like everyone else, he should work harder on things that matter”.
Replaceability misses the point (of why EAs skew heavily toward not liking protests). It’s much more an epistemics issue: messaging and advocacy are just deeply corrosive under any reasonable way of thinking about uncertainty.
In my sordid past I did plenty of “finding the three people for nuanced, logical, mind-changing discussions amidst dozens of ‘hey hey, ho ho, outgroup has got to go’”, so I’ll do the same here (if I’m in town), but the selection effects seem deeply worrying. For example, you could go down to the soup kitchen or the punk music venue and recruit all the young volunteers who are constantly sneering about how gentrifying techbros are evil and who can’t coordinate on whether their “unabomber is actually based” argument is ironic or unironic, but you oughtn’t. The fact that this is even a question, that a “mass movement” theory of change constantly tempts you to lower your standards in this way, is so intrinsically risky that no one should be comfortable that ML safety or alignment is resorting to this sort of thing.
I don’t know why people over-index on loud, grumpy Twitter people. I haven’t seen evidence that most FAccT attendees are hostile and unsophisticated.
I would like the XPT posts to be assembled with the sequence functionality, please: https://forum.effectivealtruism.org/users/forecasting-research-institute. I like the sequence functionality for keeping track of my progress and of the order in which I read things.
It seems like a super quick habit-formation trick for a bunch of socioepistemic gains is just saying “that seems overconfident”. The old Sequences/Methods version is “just what do you think you know, and how do you think you know it?”
A friend was recently upset about his epistemic environment: he didn’t feel like the people around him were able to reason, and he didn’t feel comfortable defecting from their echo chamber. I found it odd that he felt he was the overconfident one for doubting the reams of overconfident people around him! So I told him: start small, try just asking people if they’re really as confident as they sound.
In my experience, it’s a gentle nudge that helps people be better versions of themselves. Though I said “it seems” because I don’t know how many different communities it would work reliably in; the case here is someone almost 30 at a nice college with very few grad students, in an isolated town.