Paul Graham on getting good at technology (bold is mine):
How do you get good at technology? And how do you choose which technology to get good at? Both of those questions turn out to have the same answer: work on your own projects. Don’t try to guess whether gene editing or LLMs or rockets will turn out to be the most valuable technology to know about. No one can predict that. Just work on whatever interests you the most. You’ll work much harder on something you’re interested in than something you’re doing because you think you’re supposed to.
If you’re not sure what technology to get good at, get good at programming. That has been the source of the median startup for the last 30 years, and this is probably not going to change in the next 10.
From “HOW TO START GOOGLE”, March 2024. It’s a talk for ~15 year olds, and it has more about “how to get good at technology” in it.
In this “quick take”, I want to summarize some of my idiosyncratic views on AI risk.
My goal here is to list just a few ideas that cause me to approach the subject differently from how I perceive most other EAs view the topic. These ideas largely push me in the direction of making me more optimistic about AI, and less likely to support heavy regulations on AI.
(Note that I won’t spend a lot of time justifying each of these views here. I’m mostly stating these points without lengthy justifications, in case anyone is curious. These ideas can perhaps inform why I spend significant amounts of my time pushing back against AI risk arguments. Not all of these ideas are rare, and some of them may indeed be popular among EAs.)
Skepticism of the treacherous turn: The treacherous turn is the idea that (1) at some point there will be a very smart unaligned AI, (2) when weak, this AI will pretend to be nice, but (3) when sufficiently strong, this AI will turn on humanity by taking over the world by surprise, and then (4) optimize the universe without constraint, which would be very bad for humans.
By comparison, I find it more likely that no individual AI will ever be strong enough to take over the world, in the sense of overthrowing the world’s existing institutions and governments by surprise. Instead, I broadly expect unaligned AIs will integrate into society and try to accomplish their goals by advocating for their legal rights, rather than trying to overthrow our institutions by force. Upon attaining legal personhood, unaligned AIs can utilize their legal rights to achieve their objectives, for example by getting a job and trading their labor for property, within the already-existing institutions. Because the world is not zero sum, and there are economic benefits to scale and specialization, this argument implies that unaligned AIs may well have a net-positive effect on humans, as they could trade with us, producing value in exchange for our own property and services.
Note that my claim here is not that AIs will never become smarter than humans. One way of seeing how these two claims are distinguished is to compare my scenario to the case of genetically engineered humans. By assumption, if we genetically engineered humans, they would presumably eventually surpass ordinary humans in intelligence (along with social persuasion ability, and ability to deceive etc.). However, by itself, the fact that genetically engineered humans will become smarter than non-engineered humans does not imply that genetically engineered humans would try to overthrow the government. Instead, as in the case of AIs, I expect genetically engineered humans would largely try to work within existing institutions, rather than violently overthrow them.
AI alignment will probably be somewhat easy: The most direct and strongest current empirical evidence we have about the difficulty of AI alignment, in my view, comes from existing frontier LLMs, such as GPT-4. Having spent dozens of hours testing GPT-4’s abilities and moral reasoning, I think the system is already substantially more law-abiding, thoughtful and ethical than a large fraction of humans. Most importantly, this ethical reasoning extends (in my experience) to highly unusual thought experiments that almost certainly did not appear in its training data, demonstrating a fair degree of ethical generalization, beyond mere memorization.
It is conceivable that GPT-4’s apparently ethical nature is fake. Perhaps GPT-4 is lying about its motives to me and in fact desires something completely different than what it professes to care about. Maybe GPT-4 merely “understands” or “predicts” human morality without actually “caring” about human morality. But while these scenarios are logically possible, they seem less plausible to me than the simple alternative explanation that alignment—like many other properties of ML models—generalizes well, in the natural way that you might similarly expect from a human.
Of course, the fact that GPT-4 is easily alignable does not immediately imply that smarter-than-human AIs will be easy to align. However, I think this current evidence is still significant, and aligns well with prior theoretical arguments that alignment would be easy. In particular, I am persuaded by the argument that, because evaluation is usually easier than generation, it should be feasible to accurately evaluate whether a slightly-smarter-than-human AI is taking bad actions, allowing us to shape its rewards during training accordingly. After we’ve aligned a model that’s merely slightly smarter than humans, we can use it to help us align even smarter AIs, and so on, plausibly implying that alignment will scale to indefinitely higher levels of intelligence, without necessarily breaking down at any physically realistic point.
The default social response to AI will likely be strong: One reason to support heavy regulations on AI right now is if you think the natural “default” social response to AI will lean more heavily toward laissez faire than is optimal, i.e., by default, we will have too little regulation rather than too much. In this case, you could believe that, by advocating for regulations now, you’re making it more likely that we regulate AI a bit more than we otherwise would have, pushing us closer to the optimal level of regulation.
I’m quite skeptical of this argument because I think that the default response to AI (in the absence of intervention from the EA community) will already be quite strong. My view here is informed by the base rate of technologies being overregulated, which I think is quite high. In fact, it is difficult for me to name even a single technology that I think is currently clearly underregulated by society. By pushing for more regulation on AI, I think it’s likely that we will overshoot and over-constrain AI relative to the optimal level.
In other words, my personal bias is towards thinking that society will regulate technologies too heavily, rather than too loosely. And I don’t see a strong reason to think that AI will be any different from this general historical pattern. This makes me hesitant to push for more regulation on AI, since on my view, the marginal impact of my advocacy would likely be to push us even further in the direction of “too much regulation”, overshooting the optimal level by even more than what I’d expect in the absence of my advocacy.
I view unaligned AIs as having comparable moral value to humans: This idea was explored in one of my most recent posts. The basic idea is that, under various physicalist views of consciousness, you should expect AIs to be conscious, even if they do not share human preferences. Moreover, it seems likely that AIs — even ones that don’t share human preferences — will be pretrained on human data, and therefore largely share our social and moral concepts.
Since unaligned AIs will likely be both conscious and share human social and moral concepts, I don’t see much reason to think of them as less “deserving” of life and liberty, from a cosmopolitan moral perspective. They will likely think similarly to the way we do across a variety of relevant axes, even if their neural structures are quite different from our own. As a consequence, I am pretty happy to incorporate unaligned AIs into the legal system and grant them some control of the future, just as I’d be happy to grant some control of the future to human children, even if they don’t share my exact values.
Put another way, I view (what I perceive as) the EA attempt to privilege “human values” over “AI values” as being largely arbitrary and baseless, from an impartial moral perspective. There are many humans whose values I vehemently disagree with, but I nonetheless respect their autonomy, and do not wish to deny these humans their legal rights. Likewise, even if I strongly disagreed with the values of an advanced AI, I would still see value in their preferences being satisfied for their own sake, and I would try to respect the AI’s autonomy and legal rights. I don’t have a lot of faith in the inherent kindness of human nature relative to a “default unaligned” AI alternative.
I’m not fully committed to longtermism: I think AI has an enormous potential to benefit the lives of people who currently exist. I predict that AIs can eventually substitute for human researchers, and thereby accelerate technological progress, including in medicine. In combination with my other beliefs (such as my belief that AI alignment will probably be somewhat easy), this view leads me to think that AI development will likely be net-positive for people who exist at the time of alignment. In other words, if we allow AI development, it is likely that we can use AI to reduce human mortality, and dramatically raise human well-being for the people who already exist.
I think these benefits are large and important, and commensurate with the downside potential of existential risks. While a fully committed strong longtermist might scoff at the idea that curing aging might be important — as it would largely only have short-term effects, rather than long-term effects that reverberate for billions of years — by contrast, I think it’s really important to try to improve the lives of people who currently exist. Many people view this perspective as a form of moral partiality that we should discard for being arbitrary. However, I think morality is itself arbitrary: it can be anything we want it to be. And I choose to value currently existing humans, to a substantial (though not overwhelming) degree.
This doesn’t mean I’m a fully committed near-termist. I sympathize with many of the intuitions behind longtermism. For example, if curing aging required raising the probability of human extinction by 40 percentage points, or something like that, I don’t think I’d do it. But in more realistic scenarios that we are likely to actually encounter, I think it’s plausibly a lot better to accelerate AI, rather than delay AI, on current margins. This view simply makes sense to me given the enormously positive effects I expect AI will likely have on the people I currently know and love, if we allow development to continue.
I want to say thank you for holding the pole of these perspectives and keeping them in the dialogue. I think that they are important and it’s underappreciated in EA circles how plausible they are.
(I definitely don’t agree with everything you have here, but typically my view is somewhere between what you’ve expressed and what is commonly expressed in x-risk focused spaces. Often also I’m drawn to say “yeah, but …”—e.g. I agree that a treacherous turn is not so likely at global scale, but I don’t think it’s completely out of the question, and given that, I think it’s worth serious attention to safeguard against.)
In particular, I am persuaded by the argument that, because evaluation is usually easier than generation, it should be feasible to accurately evaluate whether a slightly-smarter-than-human AI is taking unethical actions, allowing us to shape its rewards during training accordingly. After we’ve aligned a model that’s merely slightly smarter than humans, we can use it to help us align even smarter AIs, and so on, plausibly implying that alignment will scale to indefinitely higher levels of intelligence, without necessarily breaking down at any physically realistic point.
This reasoning seems to imply that you could use GPT-2 to oversee GPT-4 by bootstrapping from a chain of models of scales between GPT-2 and GPT-4. However, this isn’t true: the weak-to-strong generalization paper finds that this doesn’t work, and indeed that bootstrapping like this doesn’t help at all for ChatGPT reward modeling (it helps on chess puzzles and, I believe, on nothing else they investigate).
I think this sort of bootstrapping argument might work if we could ensure that each model in the chain was sufficiently aligned and capable of reasoning that it would carefully reason about what humans would want if they were more knowledgeable, and then rate outputs based on this. However, I don’t think GPT-4 is either aligned enough or capable enough that we see this behavior. And I still think it’s unlikely it works under these generous assumptions (though I won’t argue for this here).
In fact, it is difficult for me to name even a single technology that I think is currently underregulated by society.
The obvious example would be synthetic biology, gain-of-function research, and similar.
I also think AI itself is currently massively underregulated even entirely ignoring alignment difficulties. I think the probability of the creation of AI capable of accelerating AI R&D by 10x this year is around 3%. It would be extremely bad for US national interests if such an AI was stolen by foreign actors. This suffices for regulation ensuring very high levels of security IMO. And this is setting aside ongoing IP theft and similar issues.
Given how bird flu is progressing (spread in many cows, virologists believing rumors that humans are getting infected but no human-to-human spread yet), this would be a good time to start a protest movement for biosafety/against factory farming in the US.
virologists believing rumors that humans are getting infected
What are you referring to here?
We already have confirmation of hundreds of cases of people getting infected with H5N1 through contact with animals (only 2 cases in the US so far, but one of them very recent). We can guess that there might be some percentage of unreported extra cases, but I’d expect that to be small because of the virus’s high mortality rate in its current form (and how much vigilance there is now).
So, I’m confused whether you’re referring to confirmed information with the word “rumors,” or whether there are rumors of some new development that’s meaningfully more concerning than what we already have confirmations of. (If so, I haven’t come across it – though “virus particles in milk” and things like that do seem concerning.)
Consider donating all or most of your Mana on Manifold to charity before May 1.
Manifold is making multiple changes to the way the site works.
You can read their announcement here.
The main reason for donating now is that Mana will be devalued from the current 1 USD:100 Mana to 1 USD:1000 Mana on May 1. Thankfully, the 10k USD/month charity cap will not be in place until then.
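To make the arithmetic concrete, here is a minimal sketch in Python (with a made-up example balance; the two rates are just the ones stated above) of what the change means for how much a given pile of Mana is worth to charity:

```python
# Hypothetical example balance; the two exchange rates are the ones described above.
MANA_PER_USD_NOW = 100     # 1 USD : 100 Mana (until May 1)
MANA_PER_USD_AFTER = 1000  # 1 USD : 1000 Mana (from May 1)

balance = 10_000  # example Mana balance

print(f"Donated before May 1: ${balance / MANA_PER_USD_NOW:.2f}")    # $100.00
print(f"Donated after May 1:  ${balance / MANA_PER_USD_AFTER:.2f}")  # $10.00
```

In other words, the same balance buys ten times less for charity after the change, which is the whole case for donating before the deadline.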
Also this part might be relevant for people with large positions they want to sell now:
One week may not be enough time for users with larger portfolios to liquidate and donate. We want to work individually with anyone who feels like they are stuck in this situation and honor their expected returns and agree on an amount they can donate at the original 100:1 rate past the one week deadline once the relevant markets have resolved.
Thanks for sharing this on the Forum! If you (the reader) have donated your mana because of this quick take, I’d love it if you put a react on this comment.
This is an extremely “EA” request from me but I feel like we need a word for people (i.e. me) who are Vegans but will eat animal products if they’re about to be thrown out. OpportuVegan? UtilaVegan?
If you predictably do this, you raise the odds that people around you will cook or buy some extra food so that it will be “thrown out”, or offer you food they haven’t quite finished (and that they’ll replace with a snack later). So I’d recommend going with “Vegan” as your label, for practical as well as signalling reasons.
I’m going to be leaving 80,000 Hours and joining Charity Entrepreneurship’s incubator programme this summer!
The summer 2023 incubator round is focused on biosecurity and scalable global health charities and I’m really excited to see what’s the best fit for me and hopefully launch a new charity. The ideas that the research team have written up look really exciting and I’m trepidatious about the challenge of being a founder but psyched for getting started. Watch this space! <3
I’ve been at 80,000 Hours for the last 3 years. I’m very proud of the 800+ advising calls I did and feel very privileged I got to talk to so many people and try and help them along their careers!
I’ve learned so much during my time at 80k. And the team at 80k has been wonderful to work with—so thoughtful, committed to working out what is the right thing to do, kind, and fun—I’ll for sure be sad to leave them.
There are a few main reasons why I’m leaving now:
New career challenge—I want to try out something that stretches my skills beyond what I’ve done before. I think I could be a good fit for being a founder and running something big and complicated and valuable that wouldn’t exist without me—I’d like to give it a try sooner rather than later.
Post-EA crises stepping away from EA community building a bit—Events over the last few months in EA made me re-evaluate how valuable I think the EA community and EA community building are as well as re-evaluate my personal relationship with EA. I haven’t gone to the last few EAGs and switched my work away from doing advising calls for the last few months, while processing all this. I have been somewhat sad that there hasn’t been more discussion and changes by now though I have been glad to see more EA leaders share things more recently (e.g. this from Ben Todd). I do still believe there are some really important ideas that EA prioritises but I’m more circumspect about some of the things I think we’re not doing as well as we could (e.g. Toby’s thoughts here and Holden’s caution about maximising here and things I’ve posted about myself). Overall, I’m personally keen to take a step away from EA meta at least for a bit and try and do something that helps people where the route to impact is more direct and doesn’t go via the EA community.
Less convinced of working on AI risk—Over the last year I’ve also become relatively less convinced about x-risk from AI—especially the case that agentic deceptive strategically-aware power-seeking AI is likely. I’m fairly convinced by the counterarguments e.g. this and this and I’m worried at the meta level about the quality of reasoning and discourse e.g. this. Though I’m still worried about a whole host of non-x-risk dangers from advanced AI. That makes me much more excited to work on something bio or global health related.
So overall it seems like it was good to move on to something new and it took me a little while to find something I was as excited about as CE’s incubator programme!
I’ll be at EAG London this weekend! And hopefully you’ll hear more from me later this year about the new thing I’m working on—so keep an eye out as no doubt I’ll be fundraising and/or hiring at some point! :)
I am really happy that you are sticking to your principles and what you think is right and enjoyable. I can’t wait to see whether you found an organisation and what it’s like, and thank you so much for all the work you’ve done.
An alternate stance on moderation (from @Habryka).
This is from this comment responding to this post about there being too many bans on LessWrong. Note how LessWrong is less moderated than here in that it (I guess) responds to individual posts less often, but more moderated in that (I guess) it rate-limits people more without giving a reason.
I found it thought provoking. I’d recommend reading it.
Thanks for making this post!
One of the reasons why I like rate-limits instead of bans is that it allows people to complain about the rate-limiting and to participate in discussion on their own posts (so seeing a harsh rate-limit of something like “1 comment per 3 days” is not equivalent to a general ban from LessWrong, but should be more interpreted as “please comment primarily on your own posts”, though of course it shares many important properties of a ban).
This is a pretty opposite approach to the EA forum which favours bans.
Things that seem most important to bring up in terms of moderation philosophy:
Moderation on LessWrong does not depend on effort
“Another thing I’ve noticed is that almost all the users are trying. They are trying to use rationality, trying to understand what’s been written here, trying to apply Bayes’ rule or understand AI. Even some of the users with negative karma are trying, just having more difficulty.”
Just because someone is genuinely trying to contribute to LessWrong, does not mean LessWrong is a good place for them. LessWrong has a particular culture, with particular standards and particular interests, and I think many people, even if they are genuinely trying, don’t fit well within that culture and those standards.
In making rate-limiting decisions like this I don’t pay much attention to whether the user in question is “genuinely trying ” to contribute to LW, I am mostly just evaluating the effects I see their actions having on the quality of the discussions happening on the site, and the quality of the ideas they are contributing.
Motivation and goals are of course a relevant component to model, but that mostly pushes in the opposite direction, in that if I have someone who seems to be making great contributions, and I learn they aren’t even trying, then that makes me more excited, since there is upside if they do become more motivated in the future.
I sense this is quite different to the EA forum too. I can’t imagine a mod saying I don’t pay much attention to whether the user in question is “genuinely trying”. I find this honesty pretty stark. Feels like a thing moderators aren’t allowed to say. “We don’t like the quality of your comments and we don’t think you can improve”.
Signal to Noise ratio is important
Thomas and Elizabeth pointed this out already, but just because someone’s comments don’t seem actively bad, doesn’t mean I don’t want to limit their ability to contribute. We do a lot of things on LW to improve the signal to noise ratio of content on the site, and one of those things is to reduce the amount of noise, even if the mean of what we remove looks not actively harmful.
We of course also do other things than to remove some of the lower signal content to improve the signal to noise ratio. Voting does a lot, how we sort the frontpage does a lot, subscriptions and notification systems do a lot. But rate-limiting is also a tool I use for the same purpose.
Old users are owed explanations, new users are (mostly) not
I think if you’ve been around for a while on LessWrong, and I decide to rate-limit you, then I think it makes sense for me to make some time to argue with you about that, and give you the opportunity to convince me that I am wrong. But if you are new, and haven’t invested a lot in the site, then I think I owe you relatively little.
I think in doing the above rate-limits, we did not do enough to give established users the affordance to push back and argue with us about them. I do think most of these users are relatively recent or are users we’ve been very straightforward with since shortly after they started commenting that we don’t think they are breaking even on their contributions to the site (like the OP Gerald Monroe, with whom we had 3 separate conversations over the past few months), and for those I don’t think we owe them much of an explanation. LessWrong is a walled garden.
You do not by default have the right to be here, and I don’t want to, and cannot, accept the burden of explaining to everyone who wants to be here but who I don’t want here, why I am making my decisions. As such a moderation principle that we’ve been aspiring to for quite a while is to let new users know as early as possible if we think them being on the site is unlikely to work out, so that if you have been around for a while you can feel stable, and also so that you don’t invest in something that will end up being taken away from you.
Feedback helps a bit, especially if you are young, but usually doesn’t
Maybe there are other people who are much better at giving feedback and helping people grow as commenters, but my personal experience is that giving users feedback, especially the second or third time, rarely tends to substantially improve things.
I think this sucks. I would much rather be in a world where the usual reasons why I think someone isn’t positively contributing to LessWrong were of the type that a short conversation could clear up and fix, but it alas does not appear so, and after having spent many hundreds of hours over the years giving people individualized feedback, I don’t really think “give people specific and detailed feedback” is a viable moderation strategy, at least more than once or twice per user. I recognize that this can feel unfair on the receiving end, and I also feel sad about it.
I do think the one exception here is if people are young or are non-native English speakers. Do let me know if you are in your teens or you are a non-native English speaker who is still learning the language. People really do get a lot better at communication between the ages of 14-22, and people’s English does get substantially better over time, and this helps with all kinds of communication issues.
Again this is very blunt but I’m not sure it’s wrong.
We consider legibility, but it’s only a relatively small input into our moderation decisions
It is valuable and a precious public good to make it easy to know which actions you take will cause you to end up being removed from a space. However, that legibility also comes at great cost, especially in social contexts. Every clear and bright-line rule you outline will have people butting right up against it, and de-facto, in my experience, moderation of social spaces like LessWrong is not the kind of thing you can do while being legible in the way that for example modern courts aim to be legible.
As such, we don’t have laws. If anything we have something like case-law which gets established as individual moderation disputes arise, which we then use as guidelines for future decisions, but also a huge fraction of our moderation decisions are downstream of complicated models we formed about what kind of conversations and interactions work on LessWrong, and what role we want LessWrong to play in the broader world, and those shift and change as new evidence comes in and the world changes.
I do ultimately still try pretty hard to give people guidelines and to draw lines that help people feel secure in their relationship to LessWrong, and I care a lot about this, but at the end of the day I will still make many from-the-outside-arbitrary-seeming-decisions in order to keep LessWrong the precious walled garden that it is.
I try really hard to not build an ideological echo chamber
When making moderation decisions, it’s always at the top of my mind whether I am tempted to make a decision one way or another because they disagree with me on some object-level issue. I try pretty hard to not have that affect my decisions, and as a result have what feels to me a subjectively substantially higher standard for rate-limiting or banning people who disagree with me, than for people who agree with me. I think this is reflected in the decisions above.
I do feel comfortable judging people on the methodologies and abstract principles that they seem to use to arrive at their conclusions. LessWrong has a specific epistemology, and I care about protecting that. If you are primarily trying to…
argue from authority,
don’t like speaking in probabilistic terms,
aren’t comfortable holding multiple conflicting models in your head at the same time,
or are averse to breaking things down into mechanistic and reductionist terms,
then LW is probably not for you, and I feel fine with that. I feel comfortable reducing the visibility or volume of content on the site that is in conflict with these epistemological principles (of course this list isn’t exhaustive, in-general the LW sequences are the best pointer towards the epistemological foundations of the site).
It feels cringe to read that basically if I don’t get the Sequences, LessWrong might rate-limit me. But it is good to be open about it. I don’t think the EA forum’s core philosophy is as easily expressed.
If you see me or other LW moderators fail to judge people on epistemological principles but instead see us directly rate-limiting or banning users on the basis of object-level opinions that even if they seem wrong seem to have been arrived at via relatively sane principles, then I do really think you should complain and push back at us. I see my mandate as head of LW to only extend towards enforcing what seems to me the shared epistemological foundation of LW, and to not have the mandate to enforce my own object-level beliefs on the participants of this site.
Now some more comments on the object-level:
I overall feel good about rate-limiting everyone on the above list. I think it will probably make the conversations on the site go better and make more people contribute to the site.
Us doing more extensive rate-limiting is an experiment, and we will see how it goes. As kave said in the other response to this post, the rule that suggested these specific rate-limits does not seem like it has an amazing track record, though I currently endorse it as something that calls things to my attention (among many other heuristics).
Also, if anyone reading this is worried about being rate-limited or banned in the future, feel free to reach out to me or other moderators on Intercom. I am generally happy to give people direct and frank feedback about their contributions to the site, as well as how likely I am to take future moderator actions. Uncertainty is costly, and I think it’s worth a lot of my time to help people understand to what degree investing in LessWrong makes sense for them.
Status note: This comment is written by me and reflects my views. I ran it past the other moderators, but they might have major disagreements with it.
I agree with a lot of Jason’s view here. The EA community is indeed much bigger than the EA Forum, and the Forum would serve its role as an online locus much less well if we used moderation action to police the epistemic practices of its participants.
I don’t actually think this is that bad. I think it is a strength of the EA community that it is large enough and has sufficiently many worldviews that any central discussion space is going to be a bit of a mishmash of epistemologies.[1]
Some corresponding ways this viewpoint causes me to be reluctant to apply Habryka’s philosophy:[2]
Something like a judicial process is much more important to me. We try much harder than my read of LessWrong to apply rules consistently. We have the Forum Norms doc and our public history of cases forms something much closer to a legal code + case law than LW has. Obviously we’re far away from what would meet a judicial standard, but I view much of my work through that lens. Also notable is that all nontrivial moderation decisions get one or two moderators to second the proposal.
Related both to the epistemic diversity, and the above, I am much more reluctant to rely on my personal judgement about whether someone is a positive contributor to the discussion. I still do have those opinions, but am much more likely to use my power as a regular user to karma-vote on the content.
Some points of agreement:
Old users are owed explanations, new users are (mostly) not
Agreed. We are much more likely to make judgement calls in cases of new users. And much less likely to invest time in explaining the decision. We are still much less likely to ban new users than LessWrong. (Which, to be clear, I don’t think would have been tenable on LessWrong when they instituted their current policies, which was after the launch of GPT-4 and a giant influx of low quality content.)
I try really hard to not build an ideological echo chamber
Most of the work I do as a moderator is reading reports and recommending no official action. I have the internal experience of mostly fighting others to keep the Forum an open platform. Obviously that is a compatible experience with overmoderating the Forum into an echo chamber, but I will at least bring this up as a strong point of philosophical agreement.
Final points:
I do think we could potentially give more “near-ban” rate limits, such as the 1 comment/3 days. The main benefit of this that I see is allowing the user to write content disagreeing with their ban.
Controversial point! Maybe if everyone adopted my own epistemic practices the community would be better off. It would certainly gain in the ability to communicate smoothly with itself, and would probably spend less effort pulling in opposite directions as a result, but I think the size constraints and/or deference to authority that would be required would not be worth it.
I do think we could potentially give more “near-ban” rate limits, such as the 1 comment/3 days. The main benefit of this that I see is allowing the user to write content disagreeing with their ban.
I think the banned individual should almost always get at least one final statement to disagree with the ban after its pronouncement. Even the Romulans allowed (will allow?) that. Absent unusual circumstances, I think they—and not the mods—should get the last word, so I would also allow a single reply if the mods responded to the final statement.
More generally, I’d be interested in ~“civility probation,” under which a problematic poster could be placed for ~three months as an option they could choose as an alternative to a 2-4 week outright ban. Under civility probation, any “probation officer” (a trusted non-mod user) would be empowered to remove content too close to the civility line and optionally temp-ban the user for a cooling-off period of 48 hours. The theory of impact comes from the criminology literature, which tells us that speed and certainty of sanction are more effective than severity. If the mods later determined after full deliberation that the second comment actually violated the rules in a way that crossed the action threshold, then they could activate the withheld 2-4 week ban for the first offense and/or impose a new suspension for the new one.
We are seeing more of this in the criminal system—swift but moderate “intermediate sanctions” for things like failing a drug test, as opposed to doing little about probation violations until things reach a certain threshold and then going to the judge to revoke probation and send the offender away for at least several months. As far as due process, the theory is that the offender received their due process (consideration by a judge, right to presumption of innocence overcome only by proof beyond a reasonable doubt) in the proceedings that led to the imposition of probation in the first place.
This is a pretty opposite approach to the EA forum which favours bans.
If you remove ones for site-integrity reasons (spamming DMs, ban evasion, vote manipulation), bans are fairly uncommon. In contrast, it sounds like LW does do some bans of early-stage users (cf. the disclaimer on this list), which could be cutting off users with a high risk of problematic behavior before it fully blossoms. Reading further, it seems like the stuff that triggers a rate limit at LW usually triggers no action, private counseling, or downvoting here.
As for more general moderation philosophy, I think the EA Forum has an unusual relationship to the broader EA community that makes the moderation approach outlined above a significantly worse fit for the Forum than for LW. As a practical matter, the Forum is the ~semi-official forum for the effective altruism movement. Organizations post official announcements here as a primary means of publishing them, but rarely on (say) the effectivealtruism subreddit. Posting certain content here is seen as a way of whistleblowing to the broader community as a whole. Major decisionmakers are known to read and even participate in the Forum.
In contrast (although I am not an LW user or a member of the broader rationality community), it seems to me that the LW forum doesn’t have this particular relationship to a real-world community. One could say that the LW forum is the official online instantiation of the LessWrong community (which is not limited to being an online community, but that’s a major part of it). In that case, we have something somewhat like the (made-up) Roman Catholic Forum (RCF) that is moderated by designees of the Pope. Since the Pope is the authoritative source on what makes something legitimately Roman Catholic, it’s appropriate for his designees to employ a heavier hand in deciding what posts and posters are in or out of bounds at the RCF. But CEA/EVF have—rightfully—mostly disowned any idea that they (or any other specific entity) decide what is or isn’t a valid or correct way to practice effective altruism.
One could also say that the LW forum is an online instantiation of the broader rationality community. That would be somewhat akin to John and Jane’s (made up) Baptist Forum (JJBF) that is moderated by John and Jane. One of the core tenets of Baptist polity is that there are no centralized, authoritative arbiters of faith and practice. So JJBF is just one of many places that Baptists and their critics can go to discuss Baptist topics. It’s appropriate for John and Jane to employ a heavier hand in deciding what posts and posters are in or out of bounds at the JJBF because there are plenty of other, similar places for them to go. JJBF isn’t anything special. But as noted above, that isn’t really true of the EA Forum because of its ~semi-official status in a real-world social movement.
It’s ironic that—in my mind—either a broader or narrower conception of what LW is would justify tighter content-based moderation practices, while those are harder to justify in the in-between place that the EA Forum occupies. I think the mods here do a good job handling this awkward place for the most part by enforcing viewpoint-neutral rules like civility and letting the community manage most things through the semi-democratic karma method (although I would be somewhat more willing to remove certain content than they are).
This also roughly matches my impression. I do think I would prefer the EA community to either go towards more centralized governance or less centralized governance in the relevant way, but I agree that given how things are, the EA Forum team has less leeway with moderation than the LW team.
Ben West recently mentioned that he would be excited about a common application. It got me thinking a little about it. I don’t have the technical/design skills to create such a system, but I want to let my mind wander a little bit on the topic. This is just musings and ‘thinking out loud,’ so don’t take any of this too seriously.
What would the benefits be for some type of common application? For the applicant: send an application to a wider variety of organizations with less effort. For the organization: get a wider variety of applicants.
Why not just have the job openings posted to LinkedIn and allow candidates to use the Easy Apply function? Well, that would probably result in lots of low quality applications. Maybe include a few questions to serve as a simple filter? Perhaps a question to reveal how familiar the candidate is with the ideas and principles of EA? Lots of low quality applications aren’t really an issue if you have an easy way to filter them out. As a simplistic example, if I am hiring for a job that requires fluent Spanish, and a dropdown prompt in the job application asks candidates to evaluate their Spanish, it is pretty easy to filter out people who selected “I don’t speak any Spanish” or “I speak a little Spanish, but not much.”
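As a rough illustration of how cheap such a filter is to implement (a sketch with made-up applicants and field names, not any real applicant-tracking system's API):

```python
# Hypothetical applicant records, e.g. as exported from a simple application form.
applicants = [
    {"name": "Ana", "spanish_level": "Fluent"},
    {"name": "Ben", "spanish_level": "I don't speak any Spanish"},
    {"name": "Cal", "spanish_level": "I speak a little Spanish, but not much"},
]

# Screen out anyone who picked a disqualifying dropdown option for this role.
disqualifying = {"I don't speak any Spanish", "I speak a little Spanish, but not much"}
shortlist = [a for a in applicants if a["spanish_level"] not in disqualifying]

print([a["name"] for a in shortlist])  # ['Ana']
```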
But the benefit of Easy Apply (from the candidate’s perspective) is the ease. John Doe candidate doesn’t have to fill in a dozen different text boxes with information that is already on his resume. And that ease can be gained in an organization’s own application form. An application form literally can be as simple as prompts for name, email address, and resume. That might be the most minimalistic that an application form could be while still being functional. And there are plenty of organizations that have these types of applications: companies that use Lever or Ashby often have very simple and easy job application forms (example 1, example 2).
Conversely, the more that organizations prompt candidates to explain “Why do you want to work for us” or “tell us about your most impressive accomplishment,” the more burdensome it is for candidates. Of course, maybe making it burdensome for candidates is intentional, and the organization believes that this will lead to higher quality candidates. There are some things that you can’t really get information about by prompting candidates to select an item from a list.
With the US presidential election coming up this year, some of y’all will probably want to discuss it.[1] I think it’s a good time to restate our politics policy. tl;dr Partisan politics content is allowed, but will be restricted to the Personal Blog category. On-topic policy discussions are still eligible as frontpage material.
I don’t think we have a good answer to what happens after we do auditing of an AI model and find something wrong.
Given that our current understanding of AI’s internal workings is at least a generation behind, it’s not exactly like we can isolate what mechanism is causing certain behaviours. (Would really appreciate any input here: I see very little to no discussion on this in governance papers; it’s almost as if policy folks are oblivious to the technical hurdles which await working groups.)
Dustin Moskovitz claims “Tesla has committed consumer fraud on a massive scale”, and “people are going to jail at the end”
https://www.threads.net/@moskov/post/C6KW_Odvky0/
Not super EA relevant, but I guess relevant inasmuch as Moskovitz funds us and Musk has in the past too. I think if this were just some random commentator I wouldn’t take it seriously at all, but I’m a bit more inclined to believe Dustin will take some concrete action. Not sure I’ve read everything he’s said about it; I’m not used to how Threads works.
The “non-tweet” feels vague and unsubstantiated (at this point anyway). I hope we’ll get a full article and explanation of what he means exactly, because obviously he’s making HUGE calls.
Very few of my peers are having kids. My husband and I are the youngest parents at the Princeton University daycare, at 31 years old. The next youngest parent is 3 years older than us, and his kid is a year younger than ours. Considering the median age of first birth at the national level is 30, it seems like a potential problem that the national median is the Princeton minimum.
I wonder what the birth rate is specifically among American parents with/doing STEM PhDs. I’m guessing it’s extremely low for people under the age of 45. Possibly low enough to raise concerns about how scientists are not procreating anymore.
Most birth rate statistics I’ve seen group doctorates in with any professional degree other than a masters, so it’s hard to tell what’s going on outside anecdotal evidence. For example: https://www.cdc.gov/nchs/data/nvsr/nvsr70/nvsr70-05-508.pdf
Princeton is raising annual stipends to about $45,000. Two graduate student parents now have a reasonable combined household income, especially if they can live in subsidized student housing. I wonder if this will make a big difference in Princeton fertility rates.
On the other hand, none of my NYC friends making way over $90,000 have kids, so this might be a deeper cultural problem.
To be clear, I don’t think people who don’t want to have kids should have them, or that they’re being “selfish” or whatever. But societies without children will literally die, so it’s concerning that American society has such strong anti-natal sentiment. Especially if it’s the part of American society with some of the smartest people who are more motivated by truth seeking than money.
Some of this seems to be inherent to a modern society (high birth rates in past societies were due to high mortality rates, women being treated as baby factories, etc.), but in my own experience the reason the birth rate is so low is that people simply can’t afford to have children.
In Japan and South Korea, the “salaryman culture” is such that employees are expected to devote their entire lives to their employers, to the extent of sleeping in the office at times. Needless to say, this makes it extremely difficult to have a relationship.
In short, wealth inequality and a society that’s entirely focused on the generation of profit will both cause catastrophically low birth rates. I may be biased here, but then again it’s exactly these situations that convinced me that our current economic system has outlived its usefulness.
I think we’re still the youngest parents at daycare, a year and a half after I initially posted this.
CNN reporting US fertility rates dropping to “lowest in a century”. Seems bad: https://www.cnn.com/2024/04/24/health/us-birth-rate-decline-2023-cdc/index.html
One (probably awful) idea I’ve been playing around with is scaling up parenting.
Say, find some good people (maybe couples) who care about education and love raising kids, and fund them to raise a lot of kids with strong genetic potential.
There may be ways to raise them to be great people (e.g. this Future Perfect piece) and with devoted parenting it might be possible to raise them to be “expert do-gooders” (thinking of the Polgar sisters).
A corporation exhibits emergent behavior, over which no individual employee has full control. Because the unregulated market selects for profit and nothing else, any successful corporation becomes a kind of “financial paperclip optimizer”. To prevent this, the economic system must change.
Everyone writing policy papers or doing technical work seems to be keeping generative AI at the back of their mind when framing their work or impact.
This narrow focus on gen AI may well be net-negative for us: we unknowingly or unintentionally ignore ripple effects of the gen AI boom in other fields (for example, robotics companies getting more funding leads to more capabilities, which leads to new types of risks).
And guess who benefits if we do end up getting good evals/standards in place for gen AI? It seems to me companies/investors are the clear winners, because we have to go back to the drawing board and advocate for the same kind of thing for robotics or a different kind of AI use case, all while the development/capability cycles keep maturing.
We seem to be in whack-a-mole territory now because of the Overton window shifting for investors.
This WHO press release was a good reminder of the power of immunization – a new study forthcoming in The Lancet reports that (liberally quoting/paraphrasing the release):
global immunization efforts have saved an estimated 154 million lives over the past 50 years, 146 million of them children under 5 and 101 million of them infants
for each life saved through immunization, an average of 66 years of full health were gained – with a total of 10.2 billion full health years gained over the five decades
measles vaccination accounted for 60% of the lives saved due to immunization, and will likely remain the top contributor in the future
vaccination against 14 diseases has directly contributed to reducing infant deaths by 40% globally, and by more than 50% in the African Region
the 14 diseases: diphtheria, Haemophilus influenzae type B, hepatitis B, Japanese encephalitis, measles, meningitis A, pertussis, invasive pneumococcal disease, polio, rotavirus, rubella, tetanus, tuberculosis, and yellow fever
fewer than 5% of infants globally had access to routine immunization when the Expanded Programme on Immunization (EPI) was launched 50 years ago in 1974 by the World Health Assembly; today 84% of infants are protected with 3 doses of the vaccine against diphtheria, tetanus and pertussis (DTP) – the global marker for immunization coverage
there’s still a lot to be done – for instance, 67 million children missed out on one or more vaccines during the pandemic years
American Philosophical Association (APA) announces two $10,000 AI2050 Prizes for philosophical work related to AI, with June 23, 2024 deadline:
https://dailynous.com/2024/04/25/apa-creates-new-prizes-for-philosophical-research-on-ai/
https://www.apaonline.org/page/ai2050
https://ai2050.schmidtsciences.org/hard-problems/
First in-ovo sexing in the US
Egg Innovations announced that they are “on track to adopt the technology in early 2025.” Approximately 300 million male chicks are ground up alive in the US each year (since only female chicks are valuable) and in-ovo sexing would prevent this.
UEP originally promised to eliminate male chick culling by 2020; needless to say, they didn’t keep that commitment. But better late than never!
Congrats to everyone working on this, including @Robert—Innovate Animal Ag, who founded an organization devoted to pushing this technology.[1]
Egg Innovations says they can’t disclose details about who they are working with for NDA reasons; if anyone has more information about who deserves credit for this, please comment!
How many chicks per year will Egg Innovations’ change save? (The announcement link is blocked for me.)
Wow this is wonderful news.
To communicate about malaria from a fundraising perspective, it would be amazing if there were a documentary about malaria: compelling personal stories that anyone can relate to, not about the science behind the disease, as that probably wouldn’t work. Something like “An Inconvenient Truth”, but about malaria. I am truly baffled that I can’t find anything close to what I was hoping would already exist. Does anyone know why this is? Or am I googling wrong?
In this “quick take”, I want to summarize some of my idiosyncratic views on AI risk.
My goal here is to list just a few ideas that cause me to approach the subject differently from how I perceive most other EAs view the topic. These ideas largely push me in the direction of making me more optimistic about AI, and less likely to support heavy regulations on AI.
(Note that I won’t spend a lot of time justifying each of these views here. I’m mostly stating these points without lengthy justifications, in case anyone is curious. These ideas can perhaps inform why I spend significant amounts of my time pushing back against AI risk arguments. Not all of these ideas are rare, and some of them may indeed be popular among EAs.)
Skepticism of the treacherous turn: The treacherous turn is the idea that (1) at some point there will be a very smart unaligned AI, (2) when weak, this AI will pretend to be nice, but (3) when sufficiently strong, this AI will turn on humanity by taking over the world by surprise, and then (4) optimize the universe without constraint, which would be very bad for humans.
By comparison, I find it more likely that no individual AI will ever be strong enough to take over the world, in the sense of overthrowing the world’s existing institutions and governments by surprise. Instead, I broadly expect unaligned AIs will integrate into society and try to accomplish their goals by advocating for their legal rights, rather than trying to overthrow our institutions by force. Upon attaining legal personhood, unaligned AIs can utilize their legal rights to achieve their objectives, for example by getting a job and trading their labor for property, within the already-existing institutions. Because the world is not zero sum, and there are economic benefits to scale and specialization, this argument implies that unaligned AIs may well have a net-positive effect on humans, as they could trade with us, producing value in exchange for our own property and services.
Note that my claim here is not that AIs will never become smarter than humans. One way of seeing how these two claims are distinguished is to compare my scenario to the case of genetically engineered humans. By assumption, if we genetically engineered humans, they would presumably eventually surpass ordinary humans in intelligence (along with social persuasion ability, and ability to deceive etc.). However, by itself, the fact that genetically engineered humans will become smarter than non-engineered humans does not imply that genetically engineered humans would try to overthrow the government. Instead, as in the case of AIs, I expect genetically engineered humans would largely try to work within existing institutions, rather than violently overthrow them.
AI alignment will probably be somewhat easy: The most direct and strongest current empirical evidence we have about the difficulty of AI alignment, in my view, comes from existing frontier LLMs, such as GPT-4. Having spent dozens of hours testing GPT-4’s abilities and moral reasoning, I think the system is already substantially more law-abiding, thoughtful and ethical than a large fraction of humans. Most importantly, this ethical reasoning extends (in my experience) to highly unusual thought experiments that almost certainly did not appear in its training data, demonstrating a fair degree of ethical generalization, beyond mere memorization.
It is conceivable that GPT-4’s apparently ethical nature is fake. Perhaps GPT-4 is lying about its motives to me and in fact desires something completely different than what it professes to care about. Maybe GPT-4 merely “understands” or “predicts” human morality without actually “caring” about human morality. But while these scenarios are logically possible, they seem less plausible to me than the simple alternative explanation that alignment—like many other properties of ML models—generalizes well, in the natural way that you might similarly expect from a human.
Of course, the fact that GPT-4 is easily alignable does not immediately imply that smarter-than-human AIs will be easy to align. However, I think this current evidence is still significant, and aligns well with prior theoretical arguments that alignment would be easy. In particular, I am persuaded by the argument that, because evaluation is usually easier than generation, it should be feasible to accurately evaluate whether a slightly-smarter-than-human AI is taking bad actions, allowing us to shape its rewards during training accordingly. After we’ve aligned a model that’s merely slightly smarter than humans, we can use it to help us align even smarter AIs, and so on, plausibly implying that alignment will scale to indefinitely higher levels of intelligence, without necessarily breaking down at any physically realistic point.
The default social response to AI will likely be strong: One reason to support heavy regulations on AI right now is if you think the natural “default” social response to AI will lean more heavily toward laissez faire than is optimal, i.e., by default, we will have too little regulation rather than too much. In this case, you could believe that, by advocating for regulations now, you’re making it more likely that we regulate AI a bit more than we otherwise would have, pushing us closer to the optimal level of regulation.
I’m quite skeptical of this argument because I think that the default response to AI (in the absence of intervention from the EA community) will already be quite strong. My view here is informed by the base rate of technologies being overregulated, which I think is quite high. In fact, it is difficult for me to name even a single technology that I think is currently clearly underregulated by society. By pushing for more regulation on AI, I think it’s likely that we will overshoot and over-constrain AI relative to the optimal level.
In other words, my personal bias is towards thinking that society will regulate technologies too heavily, rather than too loosely. And I don’t see a strong reason to think that AI will be any different from this general historical pattern. This makes me hesitant to push for more regulation on AI, since on my view, the marginal impact of my advocacy would likely be to push us even further in the direction of “too much regulation”, overshooting the optimal level by even more than what I’d expect in the absence of my advocacy.
I view unaligned AIs as having comparable moral value to humans: This idea was explored in one of my most recent posts. The basic idea is that, under various physicalist views of consciousness, you should expect AIs to be conscious, even if they do not share human preferences. Moreover, it seems likely that AIs — even ones that don’t share human preferences — will be pretrained on human data, and therefore largely share our social and moral concepts.
Since unaligned AIs will likely be both conscious and share human social and moral concepts, I don’t see much reason to think of them as less “deserving” of life and liberty, from a cosmopolitan moral perspective. They will likely think similarly to the way we do across a variety of relevant axes, even if their neural structures are quite different from our own. As a consequence, I am pretty happy to incorporate unaligned AIs into the legal system and grant them some control of the future, just as I’d be happy to grant some control of the future to human children, even if they don’t share my exact values.
Put another way, I view (what I perceive as) the EA attempt to privilege “human values” over “AI values” as being largely arbitrary and baseless, from an impartial moral perspective. There are many humans whose values I vehemently disagree with, but I nonetheless respect their autonomy, and do not wish to deny these humans their legal rights. Likewise, even if I strongly disagreed with the values of an advanced AI, I would still see value in their preferences being satisfied for their own sake, and I would try to respect the AI’s autonomy and legal rights. I don’t have a lot of faith in the inherent kindness of human nature relative to a “default unaligned” AI alternative.
I’m not fully committed to longtermism: I think AI has an enormous potential to benefit the lives of people who currently exist. I predict that AIs can eventually substitute for human researchers, and thereby accelerate technological progress, including in medicine. In combination with my other beliefs (such as my belief that AI alignment will probably be somewhat easy), this view leads me to think that AI development will likely be net-positive for people who exist at the time of alignment. In other words, if we allow AI development, it is likely that we can use AI to reduce human mortality, and dramatically raise human well-being for the people who already exist.
I think these benefits are large and important, and commensurate with the downside potential of existential risks. While a fully committed strong longtermist might scoff at the idea that curing aging might be important — as it would largely only have short-term effects, rather than long-term effects that reverberate for billions of years — by contrast, I think it’s really important to try to improve the lives of people who currently exist. Many people view this perspective as a form of moral partiality that we should discard for being arbitrary. However, I think morality is itself arbitrary: it can be anything we want it to be. And I choose to value currently existing humans, to a substantial (though not overwhelming) degree.
This doesn’t mean I’m a fully committed near-termist. I sympathize with many of the intuitions behind longtermism. For example, if curing aging required raising the probability of human extinction by 40 percentage points, or something like that, I don’t think I’d do it. But in more realistic scenarios that we are likely to actually encounter, I think it’s plausibly a lot better to accelerate AI, rather than delay AI, on current margins. This view simply makes sense to me given the enormously positive effects I expect AI will likely have on the people I currently know and love, if we allow development to continue.
I want to say thank you for holding the pole of these perspectives and keeping them in the dialogue. I think that they are important and it’s underappreciated in EA circles how plausible they are.
(I definitely don’t agree with everything you have here, but typically my view is somewhere between what you’ve expressed and what is commonly expressed in x-risk focused spaces. Often also I’m drawn to say “yeah, but …”—e.g. I agree that a treacherous turn is not so likely at global scale, but I don’t think it’s completely out of the question, and given that I think it’s worth serious attention safeguarding against.)
Explicit +1 to what Owen is saying here.
(Given that I commented with some counterarguments, I thought I would explicitly note my +1 here.)
This reasoning seems to imply that you could use GPT-2 to oversee GPT-4 by bootstrapping from a chain of models of scales between GPT-2 and GPT-4. However, this isn’t true: the weak-to-strong generalization paper finds that this doesn’t work, and indeed bootstrapping like this doesn’t help at all for ChatGPT reward modeling (it helps on chess puzzles and, I believe, on nothing else they investigate).
I think this sort of bootstrapping argument might work if we could ensure that each model in the chain was sufficiently aligned and capable that it would carefully reason about what humans would want if they were more knowledgeable, and then rate outputs based on this. However, I don’t think GPT-4 is either aligned enough or capable enough that we see this behavior. And I still think it’s unlikely to work even under these generous assumptions (though I won’t argue for this here).
The obvious example would be synthetic biology, gain-of-function research, and similar.
I also think AI itself is currently massively underregulated even entirely ignoring alignment difficulties. I think the probability of the creation of AI capable of accelerating AI R&D by 10x this year is around 3%. It would be extremely bad for US national interests if such an AI was stolen by foreign actors. This suffices for regulation ensuring very high levels of security IMO. And this is setting aside ongoing IP theft and similar issues.
Can you explain why you suspect these things should be more regulated than they currently are?
My recommended readings/resources for community builders/organisers
CEA’s groups resource centre, naturally
This handbook on community organising
High Output Management by Andrew Grove
How to Launch a High-Impact Nonprofit
LifeLabs’s coaching questions (great for 1-1s with organisers you’re supporting/career coachees)
The 2-Hour Cocktail Party
Centola’s work on social change, e.g., the book Change: How to Make Big Things Happen
Han’s work on organising, e.g., How Organizations Develop Activists (I wrote up some notes here)
This 80k article on community coordination
@Michael Noetel’s forum post - ‘We all teach: here’s how to do it better’
Theory of change in ten steps
Rumelt’s Good Strategy Bad Strategy
IDinsight’s Impact Measurement Guide
Given how bird flu is progressing (spread in many cows, virologists believing rumors that humans are getting infected but no human-to-human spread yet), this would be a good time to start a protest movement for biosafety/against factory farming in the US.
What are you referring to here?
We already have confirmation of hundreds of cases of people being infected with H5N1 through contact with animals (only 2 cases in the US so far, but one of them very recently). We can guess that there might be some percentage of unreported extra cases, but I’d expect that to be small because of the virus’s high mortality rate in its current form (and how much vigilance there is now).
So, I’m confused whether you’re referring to confirmed information with the word “rumors,” or whether there are rumors of some new development that’s meaningfully more concerning than what we already have confirmations of. (If so, I haven’t come across it – though “virus particles in milk” and things like that do seem concerning.)
Consider donating all or most of your Mana on Manifold to charity before May 1.
Manifold is making multiple changes to the way the platform works. You can read their announcement here. The main reason for donating now is that Mana will be devalued from the current 1 USD:100 Mana to 1 USD:1000 Mana on May 1. Thankfully, the 10k USD/month charity cap will not be in place until then.
Also this part might be relevant for people with large positions they want to sell now:
Forum post saying the same thing, with some discussion: https://forum.effectivealtruism.org/posts/SM3YzTsXmQ6BaFcsL/you-probably-want-to-donate-any-manifold-currency-this-week
Thanks for sharing this on the Forum!
If you (the reader) have donated your mana because of this quick take, I’d love it if you put a react on this comment.
I just donated $65 to Shrimp Welfare Project :)
Sadly, it’s even slightly worse than a 10x devaluation, because 1,000 mana will redeem for only $0.95, to cover “credit card fees and administrative work”.
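For concreteness, here is a rough back-of-the-envelope check of what that works out to (a minimal sketch; the rates are the ones quoted above, and treating the fee as a flat reduction to $0.95 per 1,000 mana is an assumption on my part):

```python
# Back-of-the-envelope check of the effective mana devaluation,
# using the figures quoted in this thread (not official Manifold numbers).
old_usd_per_mana = 1 / 100       # before May 1: 1 USD : 100 mana
new_usd_per_mana = 0.95 / 1000   # after May 1: 1,000 mana redeems for $0.95 after fees

devaluation_factor = old_usd_per_mana / new_usd_per_mana
print(f"Effective devaluation: ~{devaluation_factor:.1f}x")  # ~10.5x, i.e. slightly worse than 10x
```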
That Notion link doesn’t work for me FYI :) But this one did (from their website)
Vaccines saved 150M+ lives over the past 50 years, including 100M+ infants and nearly 100M lives from measles alone:
https://www.gavi.org/vaccineswork/new-data-shows-vaccines-have-saved-154-million-lives-past-50-years
https://www.who.int/news/item/24-04-2024-global-immunization-efforts-have-saved-at-least-154-million-lives-over-the-past-50-years
This is an extremely “EA” request from me, but I feel like we need a word for people (i.e. me) who are vegan but will eat animal products if they’re about to be thrown out. OpportuVegan? UtilaVegan?
Freegan
If you predictably do this, you raise the odds that people around you will cook or buy some extra food so that it will be “thrown out”, or offer you food they haven’t quite finished (and that they’ll replace with a snack later).
So I’d recommend going with “Vegan” as your label, for practical as well as signalling reasons.
Yeah this is a good point, which I’ve considered, which is why I basically only do it at home.
I think the term I’ve heard (from non-EAs) is ‘freegan’ (they’ll eat it if it didn’t cause more animal products to be purchased!)
This seems close enough that I might co-opt it :)
https://en.wikipedia.org/wiki/Freeganism
I’m going to be leaving 80,000 Hours and joining Charity Entrepreneurship’s incubator programme this summer!
The summer 2023 incubator round is focused on biosecurity and scalable global health charities, and I’m really excited to see what’s the best fit for me and hopefully launch a new charity. The ideas that the research team have written up look really exciting, and I’m trepidatious about the challenge of being a founder but psyched to get started. Watch this space! <3
I’ve been at 80,000 Hours for the last 3 years. I’m very proud of the 800+ advising calls I did and feel very privileged I got to talk to so many people and try and help them along their careers!
I’ve learned so much during my time at 80k. And the team at 80k has been wonderful to work with—so thoughtful, committed to working out what is the right thing to do, kind, and fun—I’ll for sure be sad to leave them.
There are a few main reasons why I’m leaving now:
New career challenge—I want to try out something that stretches my skills beyond what I’ve done before. I think I could be a good fit for being a founder and running something big and complicated and valuable that wouldn’t exist without me—I’d like to give it a try sooner rather than later.
Post-EA crises stepping away from EA community building a bit—Events over the last few months in EA made me re-evaluate how valuable I think the EA community and EA community building are as well as re-evaluate my personal relationship with EA. I haven’t gone to the last few EAGs and switched my work away from doing advising calls for the last few months, while processing all this. I have been somewhat sad that there hasn’t been more discussion and changes by now though I have been glad to see more EA leaders share things more recently (e.g. this from Ben Todd). I do still believe there are some really important ideas that EA prioritises but I’m more circumspect about some of the things I think we’re not doing as well as we could (e.g. Toby’s thoughts here and Holden’s caution about maximising here and things I’ve posted about myself). Overall, I’m personally keen to take a step away from EA meta at least for a bit and try and do something that helps people where the route to impact is more direct and doesn’t go via the EA community.
Less convinced of working on AI risk—Over the last year I’ve also become relatively less convinced about x-risk from AI—especially the case that agentic deceptive strategically-aware power-seeking AI is likely. I’m fairly convinced by the counterarguments e.g. this and this and I’m worried at the meta level about the quality of reasoning and discourse e.g. this. Though I’m still worried about a whole host of non-x-risk dangers from advanced AI. That makes me much more excited to work on something bio or global health related.
So overall it seems like it was good to move on to something new and it took me a little while to find something I was as excited about as CE’s incubator programme!
I’ll be at EAG London this weekend! And hopefully you’ll hear more from me later this year about the new thing I’m working on—so keep an eye out, as no doubt I’ll be fundraising and/or hiring at some point! :)
Congratulations to you for being accepted into the incubator program. I’m still waiting on mine as well.
I love this!
Best of luck with your new gig; excited to hear about it! Also, I really appreciate the honesty and specificity in this post.
Congrats for being accepted into the incubator program! Hope it goes well for you!
I’ve said it privately, but I say publicly also:
I am really happy that you are sticking to your principles and what you think is right and enjoyable. I can’t wait to see whether you found an organisation and what it’s like. Thank you so much for all the work you’ve done.
So excited for you!
An alternate stance on moderation (from @Habryka.)
This is from this comment responding to this post about there being too many bans on LessWrong. Note how LessWrong is less moderated than here in that (I guess) mods respond to individual posts less often, but more moderated in that (I guess) it rate limits people more, without giving a reason.
I found it thought provoking. I’d recommend reading it.
This is pretty much the opposite of the approach of the EA Forum, which favours bans.
I sense this is quite different to the EA Forum too. I can’t imagine a mod here saying “I don’t pay much attention to whether the user in question is ‘genuinely trying’”. I find this honesty pretty stark; it feels like a thing moderators aren’t allowed to say: “We don’t like the quality of your comments and we don’t think you can improve”.
Again this is very blunt but I’m not sure it’s wrong.
It feels cringe to read that, basically, if I don’t get the Sequences, LessWrong might rate limit me. But it is good to be open about it. I don’t think the EA Forum’s core philosophy is as easily expressed.
I want to throw in a bit of my philosophy here.
Status note: This comment is written by me and reflects my views. I ran it past the other moderators, but they might have major disagreements with it.
I agree with a lot of Jason’s view here. The EA community is indeed much bigger than the EA Forum, and the Forum would serve its role as an online locus much less well if we used moderation action to police the epistemic practices of its participants.
I don’t actually think this is that bad. I think it is a strength of the EA community that it is large enough and has sufficiently many worldviews that any central discussion space is going to be a bit of a mishmash of epistemologies.[1]
Some corresponding ways this viewpoint causes me to be reluctant to apply Habryka’s philosophy:[2]
Something like a judicial process is much more important to me. We try much harder than my read of LessWrong to apply rules consistently. We have the Forum Norms doc, and our public history of cases forms something much closer to a legal code + case law than LW has. Obviously we’re far away from what would meet a judicial standard, but I view much of my work through that lens. Also notable is that all nontrivial moderation decisions get one or two moderators to second the proposal.
Related both to the epistemic diversity, and the above, I am much more reluctant to rely on my personal judgement about whether someone is a positive contributor to the discussion. I still do have those opinions, but am much more likely to use my power as a regular user to karma-vote on the content.
Some points of agreement:
Agreed. We are much more likely to make judgement calls in cases of new users. And much less likely to invest time in explaining the decision. We are still much less likely to ban new users than LessWrong. (Which, to be clear, I don’t think would have been tenable on LessWrong when they instituted their current policies, which was after the launch of GPT-4 and a giant influx of low quality content.)
Most of the work I do as a moderator is reading reports and recommending no official action. I have the internal experience of mostly fighting others to keep the Forum an open platform. Obviously that is a compatible experience with overmoderating the Forum into an echo chamber, but I will at least bring this up as a strong point of philosophical agreement.
Final points:
I do think we could potentially give more “near-ban” rate limits, such as 1 comment per 3 days. The main benefit I see in this is that it allows the user to write content disagreeing with their ban.
Controversial point! Maybe if everyone adopted my own epistemic practices the community would be better off. It would certainly gain in the ability to communicate smoothly with itself, and would probably spend less effort pulling in opposite directions as a result, but I think the size constraints and/or deference to authority that would be required would not be worth it.
Note that Habryka has been a huge influence on me. These disagreements are what remains after his large influence on me.
I think the banned individual should almost always get at least one final statement to disagree with the ban after its pronouncement. Even the Romulans allowed (will allow?) that. Absent unusual circumstances, I think they—and not the mods—should get the last word, so I would also allow a single reply if the mods responded to the final statement.
More generally, I’d be interested in ~”civility probation,” under which a problematic poster could be placed for ~three months as an option they could choose as an alternative to a 2-4 week outright ban. Under civility probation, any “probation officer” (trusted non-mod users) would be empowered to remove content too close to the civility line and optionally temp-ban the user for a cooling-off period of 48 hours. The theory of impact comes from the criminology literature, which tells us that speed and certainty of sanction are more effective than severity. If the mods later determined after full deliberation that the second comment actually violated the rules in a way that crossed the action threshold, then they could activate the withheld 2-4 week ban for the first offense and/or impose a new suspension for the new one.
We are seeing more of this in the criminal system—swift but moderate “intermediate sanctions” for things like failing a drug test, as opposed to doing little about probation violations until things reach a certain threshold and then going to the judge to revoke probation and send the offender away for at least several months. As far as due process, the theory is that the offender received their due process (consideration by a judge, right to presumption of innocence overcome only by proof beyond a reasonable doubt) in the proceedings that led to the imposition of probation in the first place.
If you set aside bans issued for site-integrity reasons (spamming DMs, ban evasion, vote manipulation), bans here are fairly uncommon. In contrast, it sounds like LW does do some bans of early-stage users (cf. the disclaimer on this list), which could be cutting off users with a high risk of problematic behavior before it fully blossoms. Reading further, it seems like the stuff that triggers a rate limit at LW usually triggers no action, private counseling, or downvoting here.
As for more general moderation philosophy, I think the EA Forum has an unusual relationship to the broader EA community that makes the moderation approach outlined above a significantly worse fit for the Forum than for LW. As a practical matter, the Forum is the ~semi-official forum for the effective altruism movement. Organizations post official announcements here as a primary means of publishing them, but rarely on (say) the effectivealtruism subreddit. Posting certain content here is seen as a way of whistleblowing to the broader community as a whole. Major decisionmakers are known to read and even participate in the Forum.
In contrast (although I am not an LW user or a member of the broader rationality community), it seems to me that the LW forum doesn’t have this particular relationship to a real-world community. One could say that the LW forum is the official online instantiation of the LessWrong community (which is not limited to being an online community, but that’s a major part of it). In that case, we have something somewhat like the (made-up) Roman Catholic Forum (RCF) that is moderated by designees of the Pope. Since the Pope is the authoritative source on what makes something legitimately Roman Catholic, it’s appropriate for his designees to employ a heavier hand in deciding what posts and posters are in or out of bounds at the RCF. But CEA/EVF have—rightfully—mostly disowned any idea that they (or any other specific entity) decide what is or isn’t a valid or correct way to practice effective altruism.
One could also say that the LW forum is an online instantiation of the broader rationality community. That would be somewhat akin to John and Jane’s (made up) Baptist Forum (JJBF) that is moderated by John and Jane. One of the core tenets of Baptist polity is that there are no centralized, authoritative arbiters of faith and practice. So JJBF is just one of many places that Baptists and their critics can go to discuss Baptist topics. It’s appropriate for John and Jane to employ a heavier hand in deciding what posts and posters are in or out of bounds at the JJBF because there are plenty of other, similar places for them to go. JJBF isn’t anything special. But as noted above, that isn’t really true of the EA Forum because of its ~semi-official status in a real-world social movement.
It’s ironic that—in my mind—either a broader or narrower conception of what LW is would justify tighter content-based moderation practices, while those are harder to justify in the in-between place that the EA Forum occupies. I think the mods here do a good job handling this awkward place for the most part by enforcing viewpoint-neutral rules like civility and letting the community manage most things through the semi-democratic karma method (although I would be somewhat more willing to remove certain content than they are).
This also roughly matches my impression. I do think I would prefer the EA community to either go towards more centralized governance or less centralized governance in the relevant way, but I agree that given how things are, the EA Forum team has less leeway with moderation than the LW team.
Ben West recently mentioned that he would be excited about a common application. It got me thinking a little about it. I don’t have the technical/design skills to create such a system, but I want to let my mind wander a little bit on the topic. This is just musings and ‘thinking out loud,’ so don’t take any of this too seriously.
What would the benefits be for some type of common application? For the applicant: send an application to a wider variety of organizations with less effort. For the organization: get a wider variety of applicants.
Why not just have the job openings posted to LinkedIn and allow candidates to use the Easy Apply function? Well, that would probably result in lots of low quality applications. Maybe include a few questions to serve as a simple filter? Perhaps a question to reveal how familiar the candidate is with the ideas and principles of EA? Lots of low quality applications aren’t really an issue if you have an easy way to filter them out. As a simplistic example, if I am hiring for a job that requires fluent Spanish, and a dropdown prompt in the job application asks candidates to evaluate their Spanish, it is pretty easy to filter out people that selected “I don’t speak any Spanish” or “I speak a little Spanish, but not much.”
But the benefit of Easy Apply (from the candidate’s perspective) is the ease. John Doe candidate doesn’t have to fill in a dozen different text boxes with information that is already on his resume. And that ease can be gained in an organization’s own application form. An application form literally can be as simple as prompts for name, email address, and resume. That might be the most minimalistic that an application form could be while still being functional. And there are plenty of organizations that have these types of applications: companies that use Lever or Ashby often have very simple and easy job application forms (example 1, example 2).
Conversely, the more that organizations prompt candidates to explain “Why do you want to work for us?” or “Tell us about your most impressive accomplishment,” the more burdensome it is for candidates. Of course, maybe making it burdensome for candidates is intentional, and the organization believes that this will lead to higher quality candidates. There are some things that you can’t really get information about just by prompting candidates to select an item from a list.
With the US presidential election coming up this year, some of y’all will probably want to discuss it.[1] I think it’s a good time to restate our politics policy. tl;dr Partisan politics content is allowed, but will be restricted to the Personal Blog category. On-topic policy discussions are still eligible as frontpage material.
Or the expected UK elections.
I don’t think we have a good answer to what happens after we audit an AI model and find something wrong.
Given that our current understanding of AI’s internal workings is at least a generation behind, it’s not exactly like we can isolate which mechanism is causing certain behaviours. (Would really appreciate any input here; I see little to no discussion of this in governance papers. It’s almost as if policy folks are oblivious to the technical hurdles that await working groups.)