An eccentric dreamer in search of truth and happiness for all. I formerly posted on Felicifia back in the day under the name Darklight and still use that name on Less Wrong. I’ve been loosely involved in Effective Altruism to varying degrees since roughly 2013.
Joseph_Chu
The percentage of EAs earning to give is too low
I’m not very confident in this view, but I’m philosophically somewhat against encouraging Earning-To-Give, as it can justify working at what I see as unethical high-paying jobs (e.g. finance, the oil industry, AI capabilities) and pretending you can simply offset it with enough donations. I think actions like this condone the unethical, making it more socially acceptable and creating negative higher-order effects, and that we shouldn’t do this. It’s also a slippery slope and entails ends-justify-the-means thinking, like what SBF seems to have thought, and I think we should be cautious about potentially following such an example.
Separately, I also think we should respect the autonomy of people making decisions about their careers. Those who want to EtG and have the personal fit for it are likely already doing so, and suggesting that more people should is somewhat disrespectful of the ability of those who choose otherwise to make rational, moral decisions for themselves.
Quick question! What’s the best way to handle having long gaps on your resume?
So, I used to be a research scientist in AI/ML at Huawei Canada (circa 2017-2019), which on paper should make me a good candidate for AI technical safety work. However, in recent years I pivoted into game development, mostly because an EA friend and former moral philosophy lecturer pitched the idea of a Trolley Problem game to me, and my interviews with big tech had gone nowhere (I now have a visceral disdain for Leetcode). Unfortunately, the company’s burn rate now means I can’t be paid anymore, so I’m looking around at other things again.
Back in 2022, I went to EA Global Washington DC and got some interviews with AI safety startups like FAR and Generally Intelligent, but couldn’t get past the technical interviews. As such, I’m not sure I’m actually qualified to be an AI safety technical researcher. I also left Huawei in part due to mental health issues that made it difficult to work in such a high-stress environment.
I’ve also considered doing independent AI safety research, and have applied to the LTFF before and been rejected without feedback. I applied to 80,000 Hours a while back and was rejected as well.
Regularly reading the EA Forums and Less Wrong makes me continue to think AI safety work is the most important thing I could do, but at the same time, I worry that I’ll mess up and waste people’s time and money that could go to more capable people and projects. I also have a family now, so I can’t just move to the Bay Area/London and burn my life for the cause either.
What should I do?
I should point out that, historically, the natural tendency for civilizations to fall appears to apply to subsets of human civilization rather than to humanity as a whole. While locally catastrophic, these events were not existential, as humanity survived and recovered.
I’d also argue that the collapse of a civilization requires far more probabilities to go to zero, and has larger and more complex causal effects, than all time machines simply failing to work when tried.
And the reality is that, at this time, we don’t know whether the Non-Cancel Principle is true or false, or whether the universe will prevent time travel. Given this, we face a dilemma: if we precommit to not developing time travel and time travel turns out to be possible, then we have just limited ourselves and will probably be outcompeted by a civilization that develops time travel instead of us.
Ah, that makes sense! Thanks for the clarification.
Why would the only way to prevent timeline collapse be to prevent civilizations from achieving black hole-based time travel? Why not just have it so that whenever such time travel is attempted, any effort to actually change the timeline simply fails mysteriously and events end up unfolding as they did regardless?
Like, you could still go back as a tourist and find out if Jesus was real, or scan people’s brains before they die and upload them into the future, but you’d be unable to make any changes to history, and anything you did would actually end up bringing about the events as they originally occurred.
I also don’t see how precommitting to anything will escape the “curse”. The universe isn’t an agent we can do acausal trade with. Applying the Anthropic Principle, we either are not the type of civilization that will ever develop time travel, or there is no “curse” that prevents civilizations like ours from developing time travel. Otherwise, we already shouldn’t exist as a civilization.
So, it seems like most of the existential risks from time travel only arise if the Non-Cancel Principle you described is false? It also seems like the Non-Cancel Principle prevents most time paradoxes, which seems like strong evidence towards it being true?
It seems like the Non-Cancel Principle would leave only two possible ways time travel could work. Either everything “already happened”, and so time travel can only cause events to happen as they did (as in Tenet), meaning no actual changes or new timelines are possible (no free will); or, alternatively, time travel branches the timeline, creating new timelines in a multiverse of possible worlds (in which case, where did the energy for the new timeline come from if Conservation of Energy holds?).
I find the latter option more interesting for science fiction, but I think the former probably makes more sense from a physics perspective. I would really like to be wrong on this though, because useful time travel would be really cool and possibly the most important and valuable technology that one could have (that or ASI).
Anyway, interesting write-up! I’ve personally spent a lot of time thinking about time travel and its possible mechanics, as it’s a fascinating concept to me.
P.S. This is Darklight from Less Wrong.
I mean, that innate preference for oneself isn’t objective in the sense of being a neutral outsider’s view of things. If you don’t see the point of taking an objective “point of view of the universe” stance on things, then sure, there’s no reason to care about this version of morality. I’m not arguing that you need to care, only that it would be objective and possibly truth-tracking to do so, and that there exists a formulation of morality that can be objective in nature.
I guess the main intuitive leap that this formulation of morality takes is the idea that if you care about your own preferences, you should care about the preferences of others as well, because if your preferences matter objectively, theirs do as well. If your preferences don’t matter objectively, why should you care about anything at all?
The principle of indifference as applied here is the idea that given that we generally start with maximum uncertainty about the various sentients in the universe (no evidence in any direction about their worth or desert), we should assign equal value to each of them and their concerns. It is admittedly an unusual use of the principle.
You could argue that if moral realism is true, then even if our models of morality are probably wrong, we can be less wrong about them by acquiring knowledge about the world that contains the relevant moral facts. We would never be certain they are correct, but we could be more confident about them, in the same way we can be confident about a mathematical theory being valid.
I guess I should explain what my version of moral realism would entail.
Morality, to my understanding, is, for lack of a better phrase, subjectively objective. Given a universe without any subjects making subjective value judgments, nothing would matter (it’s just a bunch of space rocks colliding and stuff). However, as soon as you introduce subjects capable of experiencing the universe, having values, and making judgments about the value of different world states, we have the capacity to make “should” statements about the desirability of given possible world states. Some things are now “good” and some things are now “bad”, at least to a given subject. From an objective, neutral, impartial point of view, all subjects and their value judgments are equally important (following the Principle of Indifference, aka the Principle of Maximum Entropy).
Thus, as long as anyone anywhere cares about something enough to value or disvalue it, it matters objectively. The statement “Alice cares about not feeling pain”, and its hedonic equivalent “Alice experiences pain as bad”, are objective moral facts. Given that all subjects are equal (possibly weighted in proportion to degree of sentience, though I’m not sure about this), we can aggregate these values and select the world state that is most desirable overall (the greatest good for the greatest number).
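To make that aggregation step a bit more concrete, here’s a minimal sketch of the kind of thing I have in mind (my own formalization, nothing standard), where $u_i(w)$ stands for how much subject $i$ values world state $w$, and the equal weighting of all $N$ subjects comes from the Principle of Indifference:

$$ w^{*} = \arg\max_{w} \sum_{i=1}^{N} u_i(w) $$

If subjects were instead weighted by degree of sentience, each term would get a weight $s_i$ in front of it, but the equal-weights version is the default that the Principle of Indifference suggests.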
The rest of morality, things like universalizable rules that generally encourage the greatest good in the long run, is built on top of this foundation of treating the desires/concerns/interests/well-being/happiness/Eudaimonia of all sentient beings throughout spacetime equally and fairly. At least, that’s my theory of morality.
So, regarding the moral motivation thing: moral realism and motivational internalism are distinct philosophical concepts, and one can be true without the other also being true. Like, there could be moral facts, but they might not matter to some people. Or, maybe people who believe things are moral are motivated to act on their theory of morality, but the theory isn’t based on any moral facts, just deeply held beliefs.
The latter example could be true regardless of whether moral realism is true or not. For instance, the psychopath might -think- that egoism is the right thing to do because their folk morality is that everyone is in it for themselves and suckers deserve what they get. This isn’t morality as we might understand it, but it would function psychologically as a justification for their actions to them (so they sleep better at night and have a more positive self-image) and effectively be motivating in a sense.
Even -if- both moral realism and motivational internalism were true, this doesn’t mean that people will automatically discover moral facts and act on them reliably. You would basically need to have perfect information and be perfectly rational for that to happen, and no one has these traits in the real world (except maybe God, hypothetically).
Ah, good catch! Yeah, my flavour of moral realism is definitely naturalist, so that’s a clear distinction between myself and Bentham, assuming you are correct about what he thinks.
I’ll admit I kinda skimmed some of Bentham’s arguments, and some of them do sound a bit like rhetoric that relies on intuition or emotional appeal rather than deep philosophical argument.
If I wanted to give a succinct explanation of my reasons for endorsing moral realism, it would be that morality has to do with what subjects/sentients/experiencers value, and these things they value, while subjective in the sense that they come from the perceptions and judgments of the subjects, are objective in the sense that these perceptions, and in particular the emotions or feelings experienced because of them, are true facts about their internal state (i.e. happiness and suffering, desires and aversions, etc.). These can be objectively aggregated together as the sum of all value in the universe, from the perspective of an impartial observer of said universe.
Most of the galactic x-risks should be limited by the speed of light (because causality is limited by the speed of light), and would, if initiated, probably expand like a bubble from their source, again propagating outward at the speed of light. Thus, assuming a reasonably random distribution of alien civilizations, there should be regions of the universe that are currently unaffected, even if one or more alien civilizations have caused a galactic x-risk to occur. We are most probably in such a region, otherwise we would not exist. So, yes, the Anthropic Principle applies in the sense that we eliminate one possibility (x-risk-causing aliens nearby), but we don’t eliminate all the other possibilities (being alone in the region, or having non-x-risk-causing aliens nearby), which is what I mean. I should have explained that better.
Also, the reality is that our long-term future is limited by the eventual heat death of the universe anyway (we will eventually run out of usable energy), so there is no way for our civilization to last forever (short of some hypothetical time travel shenanigans). We can at best delay the inevitable, and maximize the flourishing that occurs over spacetime.
Morality is Objective
I’ve been a moral realist for a very long time and generally agree with this post.
I will caveat, though, that there is a difference between moral realism (there are moral truths) and motivational internalism (people will always act according to those truths when they know them). I think the latter is much less clearly true, and conflating the two is one of the primary confusions that occurs when people argue about moral realism and AI safety.
I also think that moral truths are knowledge, and we can never know things with 100% certainty. This means that even if there are moral truths in the world (out there), it is very possible to still be wrong about what they are, and even a superintelligence may not necessarily figure them out. Like most things, we can develop models, but they will generally not be complete.
I’m not sure I agree that the Anthropic Principle applies here. It would if ALL alien civilizations were guaranteed to be hostile and expansionist (i.e. grabby aliens), but I think there’s room in the universe for many possible kinds of alien civilizations, and so if we allow that some but not all aliens are hostile expansionists, then there might be pockets of the universe where an advanced alien civilization quietly stewards its region. You could call them the “Gardeners”. It’s possible that even if we can’t exist in a region with Grabby Aliens, we could still exist either in an empty region with no aliens, or in a region with Gardeners.
Also, realistically, if you assume that the reach of an alien civilization spreads at the speed of light, but its effective expansion rate is much slower because it doesn’t need new space until what it has is already filled up with population and megastructures, it’s very possible that we might be within the reach of advanced aliens who just haven’t expanded that far yet. Naturally occurring life might be rare enough that they would see value in not destroying or colonizing such planets, say, seeing us as a scientifically valuable natural experiment, like the Galapagos were to Darwin.
So, I think there are reasons why advanced aliens aren’t necessarily mutually exclusive with our survival, as the Anthropic Principle would require.
Granted, I don’t know which of empty space, Gardeners, or late expanders is more likely, and would hesitate to assign probabilities to them.
Thanks for the thoughts!
I do think the second one has more potential impact if it works out, but I also worry that it’s too “out there” and speculative, and also dependent on the AGI being persuaded by an argument (which it could just reject), rather than something that more concretely ensures alignment. I also noticed that almost no one is working on the Game Theory angle, so maybe it’s neglected, or maybe the smart people all agree it’s not going to work.
The first project is probably more concrete and actually uses my prior skills as an AI/ML practitioner, but there are also a lot of people already working on Mech Interp stuff. In comparison, my knowledge of Game Theory is self-taught and not very rigorous.
I’m tempted to explore both to an extent. For the first one, I can probably do some exploratory experiments to test the basic idea and rule it out quickly if it doesn’t work.
I more or less agree. It’s not really a complaint from me. I probably was too provocative in my choice of wording earlier.
I want to clarify that I don’t think ideas like the Orthogonality Thesis or Instrumental Convergence are wrong. They’re strong predictive hypotheses that follow logically from very reasonable assumptions, and even the possibility that they could be correct is more than enough justification for treating AI safety work as critical.
I was more just pointing out some examples of ideas that are very strongly held by the community, that happen to have been named and popularized by people like Bostrom and Yudkowsky, both of whom might be considered elites among us.
P.S. I’m always a bit surprised that the Neel Nanda of Google DeepMind has the time and desire to post so much on the EA Forums (and also Less Wrong). That probably says very good things about us, and also gives me some more hope that the folks at Google are actually serious about alignment. I really like your work, so it’s an honour to be able to engage with you here (hope I’m not fanboying too much).
As an EA and a Christian… I find that Thiel’s apparent views and actions resemble what the Bible says an Antichrist is far more than EA does. He hypocritically calls EA totalitarian while simultaneously and deeply supporting what amounts to technofascism in the U.S.
It is bizarre to me how unchristian his version of libertarianism is, with what seems like complete indifference, if not utter disdain, towards the poor and downtrodden whom Jesus sought to help. Thiel seems to be so far from the spirit of Christian values (at least as I understand them) that I have a hard time imagining what could be further from it.
I could go on, but people like this, who call themselves Christian and yet appear to be the polar opposite of what a good Christian ought to be (again, in my opinion) infuriate me to the point that I have trouble expressing things without getting angry, so I’ll stop here.