How about reducing the number of catered meals while increasing support for meals outside the venue? Silly example: someone could fill a hotel room with Soylent so that everyone can grab liquid meals and go chat somewhere, sort of a "baguettes and hummus" vibe. Or as @Matt_Sharp pointed out, we could reserve nearby restaurants. No idea if these exact plans are feasible, but I can imagine similarly scrappy solutions going well if planned by actual logistics experts.
Thanks so much for your work and this information!
tae
All Tech is Human <-> EA
[Question] Intro to AI risk for AI grad students?
Journalistic Essay Contest for US High School Students
I'm having an ongoing discussion with a couple professors and a PhD candidate in AI about "The Alignment Problem from a Deep Learning Perspective" by @richard_ngo, @Lawrence Chan, and @SoerenMind. They are skeptical of "3.2 Planning Towards Internally-Represented Goals," "3.3 Learning Misaligned Goals," and "4.2 Goals Which Motivate Power-Seeking Would Be Reinforced During Training". Here's my understanding of some of their questions:
The argument for power-seeking during deployment depends on the model being able to detect the change from the training distribution to the deployment distribution. Wouldn't this require keeping track of the distribution seen so far, which would require memory of some sort, which is very difficult to implement in the SSL+RLHF paradigm? (See the toy sketch after these questions.)
What is the status of the model after the SSL stage of training?
How robust could its goals be?
Would a model be able to know:
what misbehavior during RLHF fine-tuning would look like?
that it would be able to better achieve its goals by avoiding misbehavior during fine-tuning?
Why would a model want to preserve its weights? (Sure, instrumental convergence and all, but what's the exact mechanism here?)
To what extent would all these phenomena (situationally-aware reward hacking, misaligned internally-represented goals, and power-seeking behaviors) show up in current LLMs (say, GPT-4) vs. current agentic LLM-based systems (say, AutoGPT) vs. different future systems?
Do we get any evidence for these arguments from the fact that existing LLMs can adopt goal-directed personas?
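Regarding the first question: here is a toy sketch (my own illustration, nothing from the paper; the class name, thresholds, and warmup value are all made up for the example) of why detecting distribution shift seems to require statefulness. The detector has to accumulate statistics across many inputs, which a single stateless forward pass does not get to do.

```python
import numpy as np

class RunningDriftDetector:
    """Toy drift detector: keeps running input statistics (Welford's algorithm)
    and flags inputs that deviate strongly from what it has seen so far.
    The accumulated mean/variance is exactly the kind of cross-input "memory"
    the question is about."""

    def __init__(self, dim: int, z_threshold: float = 4.0, warmup: int = 30):
        self.n = 0
        self.mean = np.zeros(dim)
        self.m2 = np.zeros(dim)         # running sum of squared deviations
        self.z_threshold = z_threshold  # how many standard deviations counts as "shifted"
        self.warmup = warmup            # don't flag anything until there's enough history

    def update(self, x: np.ndarray) -> bool:
        """Ingest one input vector; return True if it looks out-of-distribution."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        if self.n < self.warmup:
            return False
        std = np.sqrt(self.m2 / (self.n - 1)) + 1e-8
        z = np.abs(x - self.mean) / std
        return bool(z.mean() > self.z_threshold)
```

Whether a model trained with SSL+RLHF could implement something functionally similar using only its weights and context window, without explicit state like this, is exactly the crux my colleagues are pushing on.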
I'm guessing this has been discussed in the animal welfare movement somewhere.
Yep, The Sexual Politics of Meat by Carol J. Adams is the classic I'm aware of.
I recorded the rough audio and passed it along to the audio editor, but I haven't heard back since then :(
Hi! I relate so much to you. I'm seven years older than you and I'm pretty happy with how my life is going, so although I'm no wise old sage, I think I can share some good advice.
I've also been involved in EA, Buddhism, veganism, minimalism, sustainable fashion, etc. from a young age, plus I was part of an Orthodox Christian community as a teenager (as I assume you are in Greece).
So, here's my main advice.
The philosophies of EA, Buddhism, etc. are really, really morally demanding. Working from the basic principles of these philosophies, it is difficult to find reasons to prioritize your own wellbeing; there are only pragmatic reasons such as "devote time and money to your own health so that you can work more effectively to help others". Therefore, if you predominantly engage in these communities through the philosophy, you will be exhausted.
So, instead of going down internet rabbit holes and reading serious books, engage with the people in these communities. Actual EAs goof around at parties and write stories. Actual Buddhists have silly arguments at nice restaurants and go on long treks through the mountains. While good philosophies are optimized to be hard to argue with, good communities are optimized to be healthy and sustainable.
I'm guessing you don't have strong EA and Buddhist communities near you, though. Same here. In that case, primarily engage in other communities instead. When I was your age (ha, that sounds ridiculous), I was deeply involved in choir. Would highly recommend! Having fun is so important to balance out the philosophies that can consume your life if you let them.
In non-EA, non-Buddhist communities, it might feel like you're the only one who takes morality seriously, and that can be lonely. Personally, I gravitate toward devout religious friends, because they're also trying to confront selfishness. Just make sure that you don't go down depressing rabbit holes together.
Of course, there are nice virtual EA and Buddhist communities too. They can't fully replace in-person communities, though. Also, people in virtual communities are more likely to only show their morally intense side.
I hope this helps! You're very welcome to DM me about anything. I'll DM you first to get the conversation going.
P.S. You've got soooo much time to think about monasticism, so there's no reason to be concerned about the ethics of it for now, especially since the world could change so much by the time we retire! Still, just for the philosophical interest of it, I'm happy to chat about Buddhist monasticism if you like. Having lived at a monastery for several months and written my undergrad thesis on a monastic text, I've got some thoughts :)
General information about people in low-HDI countries to humanize them in the eyes of the viewer.
Similar for animals (except not "humanizing" per se!). Spreading awareness that e.g. pigs act like dogs may be a strong catalyst for caring about animal welfare. Would need to consult an animal welfare activism expert.
My premise here: it is valuable for EAs to viscerally care about others (in addition to cleverly working toward a future that sounds neat).
Yes, I am pretty amused about this
I'll just continue my anecdote! As it happens, the #1 concern that my friend has about EA is that EAs work sinisterly hard to convince people to accept the narrow-minded longtermist agenda. So, the frequency of ads itself increases his skepticism of the integrity of the movement. (Another manifestation of this pattern is that many AI safety researchers see AI ethics researchers as straight-up wrong about what matters in the broader field of AI, and therefore as people who need to be convinced rather than collaborated with.)
(Edit: the above paragraph is an anecdote, and I'm speaking generally in the following paragraphs)
I think it is quite fair for someone with EA tendencies, who is just hearing of EA for the first time through these ads, to form a skeptical first impression of a group that invests heavily in selling an unintuitive worldview.
I strongly agree that it's a good sign if a person investigates such things instead of writing them off immediately, indicating a willingness to take unusual ideas seriously. However, the mental habit of openness/curiosity is also unusual and is often developed through EA involvement; we can't expect everyone to come in with full-fledged EA virtues.
Sure! Thank you very much for your, ahem, forethought about this complicated task. Please pardon the naive post about a topic that you all have worked hard on already :)
"Call off the EAs": Too Much Advertising?
These are excellent answers, thanks so much!
As more and more students get interested in AI safety, and AI-safety-specific research positions fail to open up proportionally, I expect that many of them (like me) will end up as graduate students in mainstream ethical-AI research groups. Resources like these are helping me to get my bearings.
Thanks very much, that helps!
Adding more not to defend myself, but to keep the conversation going:
I think that many Enlightenment ideas are great and valid regardless of their creators' typical-for-their-time views.
Education increasingly includes rather radical components of critical race theory. Students are taught that if someone is racist, then all of their political and philosophical views are tainted. By extension, many people learn that the Enlightenment itself is tainted. Like Charles, I think that this "produces misguided perspectives".
I'm trying (apparently badly) to communicate the following. These students, who have been taught that the Enlightenment is tainted by association with racism, who (reasonably!) haven't bothered to thoroughly research this particular historical movement and come to their own conclusions, and who may totally make great EAs, would initially be turned off.
It's quite plausible that Enlightenment aesthetics shouldn't turn people off. But I think they do, and I argue that it's likely more important to make a good first impression than to take a stand in favor of a particular historical movement.
Hope that makes sense!
Could someone who downvoted please explain which of these premises you disagree with?
Short version: if we can avoid it, let's not filter potential EAs by the warmth of their feelings toward a specific group of historical figures (especially because history education is inevitably biased).
I actually wouldn't know where to find a liberal student who respects classics (let alone "our cultural heritage") at my large American university, after four years in the philosophy department!
Yes, these are great reasons to take inspiration from the Enlightenment!
The point I most want to get across is that, by using Enlightenment aesthetics, EAs could needlessly open themselves up to negative perception.
If EAs use Enlightenment aesthetics more, then EA will be associated with the Enlightenment more.
Regardless of their positive qualities, Enlightenment philosophers collectively racked up plenty of negative ones. Maybe there were 10x as many purely virtuous ones as problematic ones; maybe every problematic one made contributions that vastly outweighed their issues; nonetheless, there are some problems.
People who would otherwise engage with EA might have heard of enough problems that they'd be put off by Enlightenment associations entirely. (I suspect many of my social-justice-y friends would have this reaction.)
Here's the more nebulous point. I hinted in my original comment that I take issue with the "rational individualistic actor" view. This alone puts me off Enlightenment aesthetics, because I think that particular view is especially dangerous considering how innocent it looks. But that's a whole big discussion, and I respect the other side! The relevant part here is just that, anecdotally, at least one EA isn't a huge Enlightenment fan.
Please accept my delayed gratitude for the comprehensive response! The conversation with my colleagues continues. The original paper and this response have become pretty central to my thinking about alignment.