Background
Hi there, I'm from Tasmania, Australia, currently living in NSW. I have mostly been in academia studying medical science, neuroscience, philosophy of mind, and most recently public health, where I published on the barriers to a sugary drinks tax in Australia. I have also worked as a wilderness guide, a science educator, and most recently in health promotion at the Department of Health.
Engagement in EA
In 2015 I attended a talk by Peter Singer and was introduced to EA ideas, which mostly remained dormant until around 2023, when I tried to apply for some EA-related work with little success. Since 2025 I have been more seriously involved and have taken various courses, including CEA's intro and in-depth programs. More specifically, I am currently learning more about AI safety and how I could contribute. I have recently finished the Center for AI Safety's AI Safety, Ethics, and Society course, including producing a <2000-word article exploring the tension between AI ethics and AI safety. I am currently enrolled in BlueDot's AGI Strategy course. I attended EAGxVirtual 2024 and will be attending EAGxAustralasia later this year.
Tristan D
My understanding of their claim is more something like:
There will never be an AI system that is generally intelligent in the way a human is. Put differently, we can never have synoptic models of the behaviour of human beings, where a synoptic model is a model that can be used to engineer the system or to emulate its behaviour.
This is because, in order for deep neural networks to work, they need to be trained on data with the same variance as the data to which the trained algorithm will be applied, i.e. representative training data. General human intelligence is a complex system whose behaviour does not have such a distribution, and there are mathematical limits on the ability to predict the behaviour of a system that lacks one. Therefore, whilst we can have gradually more satisfactory simple models of human behaviour (as ChatGPT is for written language), they will never reach the same level as humans. To put it simply: we can't create AGI by training algorithms on datasets, because human intelligence does not have a representative dataset.
As such I think your response “It is possible to create new things without fundamentally understanding how they work internally” misses the mark. The claim is not that “we can’t understand how to model complex systems therefore they can’t be adequately modelled”. It’s more something like “there are fundamental limits to the possibility of adequately emulating complex systems (including humans), regardless of whether we understand them more or not”.
My personal take is that I'm unsure how important it is to be able to accurately model human intelligence. Perhaps modelling some approximation of human intelligence (in the way that ChatGPT is approximately good enough as a written chatbot) is sufficient to stimulate the creation of something that approximates intelligence more closely still, and so on, in the same way that ChatGPT can answer PhD-level questions that the majority of humans cannot.
Note: My understanding of Barry’s argument is limited to this lecture (https://www.youtube.com/watch?v=GV1Ma2ehpxo) and this article (http://www.hunfi.hu/nyiri/AI/BS_paper.pdf).
I think I might be missing what's distinctive here. A lot of the traits listed — strong knowledge of the field, engagement with the community, epistemic humility, broad curiosity — seem like general predictors of success in many fields.
Are you pointing at something that’s unusually clustered in EA, or is your claim more about how trainable and highly predictive this combination is within the EA context?
Have you had a chat with the 80,000 Hours team?
Yes I agree it’s bad to support a race, but it’s not as simple as that.
Is OpenAI's move to for-profit, in order to attract more investment and talent, a good or bad thing for AI safety?
On the one hand, people want American companies to win the AGI race, and this could contribute to that. On the other hand, OpenAI would then be more tied to making a profit, which could conflict with AI safety goals.
It seems to me that the limiting step here is the ability to act like an agent. If we already have AI that can reason and answer questions at a PhD level, why would we need reasoning and question answering to be any better?
The point is, there are 8.7 million species alive today, so there is a possibility that a significant number of them play important, high-impact roles.
I have the opposite intuition for biodiversity. People have been studying ecosystem services for decades, and higher biodiversity is associated with increased ecosystem services such as clean water, air purification, and waste management. Higher biodiversity is also associated with reduced transmission of infectious diseases, because more complex ecosystems limit pathogen spread. Then there is the actual and potential discovery of medicinal compounds, and the links between biodiversity and mental health. These are high-level examples of the benefits. The linked article illustrates the possibility of impact by considering two effects, from bats and vultures. Multiply that effect by 1,000+ other species, include all the other impacts mentioned above, and I can see how this could be high impact.
There are a variety of views on the potential moral status of AI/robots/machines into the future.
From a quick search, it seems there are arguments for moral agency if functionality is equivalent to that of humans, or when/if machines become capable of moral reasoning and decision-making. Others argue that consciousness is essential for moral agency and that the current AI paradigm is insufficient to generate consciousness.
I was also interested to follow this up. For the source of this claim he cites another article he has written ‘Is it time for robot rights? Moral status in artificial entities’ (https://link.springer.com/content/pdf/10.1007/s10676-021-09596-w.pdf).
The Bottleneck in AI Policy Isn’t Ethics—It’s Implementation
The Rise of AI Agents: Consequences and Challenges Ahead
This is fantastic!
Do you know if anything like this exists for other cause areas, or the EA world more broadly?
I have been compiling and exploring the resources available for people interested in EA and different cause areas. There are a lot of organisations and opportunities to get career advice, undertake courses, or get involved in projects, but it is all scattered, and there is no central repository or guide for navigating the EA world that I know of.
We are talking about making decisions whose outcome is one of the best things we can do for the far future.
An option can be the best thing you can do because it averts a terrible outcome, as opposed to achieving the best possible outcome.
This is probably a semantic disagreement, but averting a terrible outcome could be viewed as one of the best things we can do for the far future. The part I was disagreeing with was when you said “I’m just saying one attractor state is better than the other in expectation, not that one of them is so great.” This gives the impression that longtermism is satisfied with prioritising one option over another, regardless of the context of other options which, if considered, would produce outcomes that are “near-best overall”. And as such it's a somewhat strange claim that one of the best things you could do for the far future is in actuality “not so great”.
I don’t understand some of what you’re saying including on ambiguity.
My point could ultimately be summarised by asking: how do you know that freedom (or any other value) will even make sense in the far future, let alone be valued? You don't. You're just assuming it makes sense and will be valued because it makes sense and is valued now. While that may be sufficient for an argument about the near future, I think it's a very weak argument for its relevance to the far future.
At its heart, the "inability to predict" argument holds strongly onto the sense that the far future is likely to be radically different, and that you are therefore making a claim to knowledge of what is 'good' in this radically different future.
Could I be wrong, sure, but we are doing things based on expectation.
I feel like "expectation" is doing far too much work in these arguments. It's not convincing to simply claim that something is likely or expected; that just begs the question of why it is likely or expected.
Nevertheless, I think the focus on non-existential-risk examples, like the US having dominance over China, is a red herring for defending longtermism. I think the strongest claims are those for taking action to prevent existential risk. But even there, the actions are still subject to the same criticisms regarding the inability to predict how they will actually positively influence the far future.
For example, take reducing existential risk by developing some sort of asteroid defense system. While in the short term developing an asteroid defense system might seem to contribute adequately to the goal of reducing existential risk, it's unclear how asteroid defense systems or other mitigation policies might interact with other technologies or societal developments in the far future. For example, advanced asteroid deflection technologies could have dual-use potential (like space weaponisation) that could create new risks or unforeseen consequences. Thus, while reducing the risk associated with asteroid impacts has immediate positive effects, the net effect on the far future is more ambiguous.
There is also an accounting issue that distorts estimates of the impact of particular actions on the far future. Calculating the expected value of reducing the existential risk from an asteroid impact, for example, doesn't take into account changes in expected value over time. As a simple example, as soon as humans start living comfortably beyond Earth as well as on it (for example on Mars), the existential risk from an asteroid impact declines dramatically, and it declines further still as we extend out through the solar system and beyond. Yet the expected value is calculated on a time horizon whereby the value of this action, reducing risk from asteroid impact, endures for the rest of time, when in reality the value of the action, as originally calculated, will probably only endure for less than 50 years.
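To make the accounting point concrete, here is a minimal sketch with entirely made-up numbers; the per-year benefit and both horizon lengths are my own illustrative assumptions, not figures from the article or the paper:

```python
# Minimal sketch of the accounting point above, using made-up numbers.
# Assumption: reducing asteroid risk delivers a constant expected benefit per year
# for as long as asteroid impacts remain an existential threat.

annual_benefit = 1.0  # arbitrary units of expected value per year of protection

# Standard accounting: assume the benefit endures for an astronomically long time.
horizon_rest_of_time = 1e9  # e.g. a billion years (illustrative only)

# Alternative accounting: assume the benefit only endures until humanity is
# comfortably multi-planetary, after which asteroid impacts stop being existential.
horizon_until_multiplanetary = 50  # years (illustrative only)

ev_long = annual_benefit * horizon_rest_of_time
ev_short = annual_benefit * horizon_until_multiplanetary

print(f"EV if the benefit endures indefinitely: {ev_long:.0e}")
print(f"EV if the benefit endures ~50 years:    {ev_short:.0e}")
# The two estimates differ by about seven orders of magnitude. The point is that
# the expected value hinges on how long the risk reduction stays relevant, which
# the original calculation treats as 'the rest of time'.
```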
How did the book club on Deep Utopia go? Is there an online discussion of the book somewhere?
I’d be happy to have a call sometime if you want as that might be more efficient than this back and forth.
For now I'm finding the gaps in between useful for reflecting, thanks though. Perhaps in the future!
Not necessarily. To use the superintelligence example, the world will look radically different under either the US or China having superintelligence than it does now.
The world will be radically different, yet you feel confident in predicting that some element of this radically different world will remain constant for a very long time, and that, this being so, moving towards this state is one of the best options for the far future.
I’m just saying one attractor state is better than the other in expectation, not that one of them is so great.
I think you may be departing from strong longtermism. The first proposition for ASL is “Every option that is near-best overall is near-best for the far future.” We are talking about making decisions whose outcome is one of the best things we can do for the far future. It’s not merely something that is better than something deemed terrible.
Yeah I guess in this case I’m talking about the US having dominance over the world as opposed to China having dominance over the world.
Perhaps I didn’t explain the point about ambiguity well enough. Of all possible states, S, there is some possible state X, that is ‘near-best’, ‘best-possible’, ‘close to best’, what have you, for the far future. Call the ‘near-best’ state for the far future n-bX. There are microstates of n-bX that make it such that it is this ‘near-best’ state. Presumably you need to have some idea of what these microstates are, in order to make predictions regarding what we can do today that will lead towards them.
Therefore, there must be something about the state of the US having dominance over the world as opposed to China, that will presumably lead to the instantiation of some of these microstates of n-bX. Presumably, these beneficial microstates of n-bX don’t involve a country called “the US” and a country called “China”, and arguably lack the property of “dominance”.
So there must be some other thing, state, or property, call it n-bP, whose long-term instantiation in the near-present world is linked to n-bX. So the questions are: what is n-bP, how is n-bP hypothesised to be linked to “US dominance..”, how is it hypothesised to be instantiated for a very long time, and how is it hypothesised to be linked to n-bX? It's ambiguous on all of these questions.
I think it’s fair to say I’d rather the US control the world than China control the world given the different values the two countries hold.
We are not talking about what you would rather, we’re talking about what the far future would rather. I get the sense that what you are really defending are ways to incrementally improve the world that are currently under-appreciated. I don’t have an issue with that. What I am unconvinced by, is how reference to the lives of beings quadrillions of years into the future, can meaningfully guide our decisions.
It appears you are saying:
a) We should take actions such that we will enter into a state in the near future that will endure and last a very long time.
b) These near future states that will endure for a long time will be the best states for the beings in the far future.
If so, then by design actions now need to lead, relatively quickly, into such an 'attractor state', otherwise these actions are subject to the same 'washing out' criticism that they were designed to avoid. That means that the state that is hypothesised to last for an extremely long time is a state that is close to the present state. But then we are left with the somewhat surprising claim that a state we can establish in the near future, and which lasts an extremely long time, is one of the best things we can do for the far future. On the face of it I find the idea very implausible.
You can argue the states aren’t actually that persistent e.g. you don’t think superintelligence is that powerful or even realistic in the first place. Or you can argue one isn’t clearly better than the other. Or you can argue that there’s not much we can do to achieve one state over other.
From the perspective of the linked article unfortunately these points suffer from the same limitation of our inability to make predictions about the effect of our actions on the far future. Therefore, we lack the ability to predict which states would persist into the far future and which wouldn’t. And we lack the ability to predict which persistent states would be better than others for the far future.
For example, how exactly does the US winning the race to superintelligence lead to one of the best possible futures for quadrillions of people in the far future? How long is this state expected to last? What will happen once we are no longer in this state?
There is a related issue of ambiguity about what is included in "the state of things that lasts an extremely long time". To adapt the terminology from The Case for Strong Longtermism paper, let S be the space of all possible microstates that the world could be in at a single moment of time. Call the state that we are in at any given moment, a subset of S, X. Call the subset of X that persists for an extremely long period of time A.
The world is continually moving through different states of X and the claim from the ‘attractor’ argument is that we should move towards versions of A that will positively influence the far future. However, obviously there are other subsets of X that continue to change, i.e. there are microstates of the world that continue to change, otherwise the world is frozen in time. The issue of ambiguity is, what set of microstates makes up A?
One version of A, you claim, is the US winning the race to superintelligence. When we are in the state of the US having won the race to superintelligence, what exactly is in A? What exactly is claimed to be persisting for a very long time? The US having dominance over the world? Liberty? Democracy? Wellbeing? And whatever it is, how is it influencing the quadrillions of lives in the far future, given that there is still a large subset of X which is changing?
Saving a life through bed nets just doesn’t seem to me to put the world in a better attractor state which makes it vulnerable to washing out. Medical research doesn’t either.
My comments in the previous post were not claiming that bed nets and medical research are in fact better focus areas for influencing the far future, but rather that in order to say they are not better focus areas you need to show this. In order to show this you need to be able to predict how focus on these areas impacts the far future. We aren't in a position to predict how focus on these areas affects the far future. Therefore we aren't in a position to say that medical research/bed nets/'insert other' are better or worse focus areas for influencing the far future.
The washing-out hypothesis is a different concern to what we are talking about here. The idea I have been discussing is not that an intervention might become less significant as time goes on. An intervention could be extremely significant for the far future, or not significant at all; the problem is that predicting the impact of that intervention on the far future is beyond our ability.
From the article:
The far-future effects of one’s actions are usually harder to predict than their near-future effects. Might it be that the expected instantaneous value differences between available actions decay with time from the point of action, and decay sufficiently fast that in fact the near-future effects tend to be the most important contributor to expected value?
Or perhaps the difficulty lies in the high number of causal possibilities the further we reach into the future.
As a result, although precise estimates of the relevant numbers are difficult, the far-future benefits of some such interventions seem to compare very favourably, by total utilitarian lights, to the highest available near-future benefits
In the article they compare the impact of an intervention (malaria bed nets) on the near future with the impact of an intervention (reducing x-risk from asteroids, global pandemics, AI risk) on the far future. As I said earlier, not an adequate comparison.
If we compare the positive impact of an intervention on quadrillions of people to a positive impact of an intervention on only billions of people, should we be surprised that the intervention that considers the impact on more people has a greater effect? Put another way, should we be surprised the bed net intervention has a smaller impact when we reduce the time horizon of its impact to the near future?
To this you might say, well interventions focused on malaria might have this ‘washing out’ effect. But so might interventions for reducing existential risk. For example, the intervention discussed in the paper to reduce extinction-level pandemics is to spend money on strengthening the healthcare system. Something that could easily be subject to the ‘washing out’ effect.
Nevertheless, the bed net intervention is only one intervention, and there are other interventions that could have more plausible effects on the far future which would be more adequate comparisons (if such comparisons were feasible in the first place), for example, medical research.
One way is to find interventions that steer between “attractor states”.
If extinction and non-extinction are "attractor states" (from what I gather, states that are expected to last an extremely long time), then what exactly isn't an attractor state?
Increasing the probability of achieving the better attractor state (probably non-extinction by a large margin, if we make certain foundational assumptions) has high expected value that stretches into the far future.
Let me translate that sentence: focusing on existential risk is more beneficial for the far future than other cause areas because it increases the probability of humans being alive for an extremely long time. If it's more beneficial, we need the relevant comparison, and, as per the above, the relevant comparison is lacking.
Great piece. I really connected with the part about the vastness of the possibility of conscious experience.
That said, I’m inclined to think that Utopia, however weird, would also be, in a certain sense, recognizable — that if we really understood and experienced it, we would see in it the same thing that made us sit bolt upright, long ago, when we first touched love, joy, beauty; that we would feel, in front of the bonfire, the heat of the ember from which it was lit. There would be, I think, a kind of remembering. As Lewis puts it: “The gods are strange to mortal eyes, and yet they are not strange.” Utopia would be weird and alien and incomprehensible, yes; but it would still, I think, be our Utopia; still the Utopia that gives the fullest available expression to what we would actually seek, if we really understood.
It sounds a little bit like you're saying that utopia would be recognisable to modern-day humans. If you are saying that, I'm not sure I would agree. Can a great ape have the revelatory experience that a human can have when taking in a piece of art? There exists art that can create the relevant experience in a human, but I highly doubt that, if you showed every piece of art to any great ape, it would have such an experience. So how can we expect the experiences available in utopia to be recognisable to a modern-day human?
I won't be going to EAGxBerlin, but I do have some questions. I attended a workshop with Bob Brown, the former leader of the Australian Greens. The workshop was about building up a network of grassroots lobbyists. Lobbying was framed primarily in terms of having one-on-one meetings with politicians and communicating your concerns (in this case environmental concerns, of which stopping native forest logging is among the highest priorities) in ways that connected with their interests.
My question, therefore, is: how much of lobbying is about these one-on-one appointments with politicians?
You also talk about organising workshops for civil servants. I'd be interested to hear more about how that came about. In these one-on-one conversations, did you communicate the need for these workshops and then eventually they agreed? It seems quite impressive that you could have a top-down effect like that across government.
Can talk virtually if that’s easier.