Paul_Christiano

Karma: 3,521

Thoughts on responsible scaling policies and regulation

Paul_Christiano 24 Oct 2023 22:25 UTC

177 points

5 comments · 6 min read · EA link

Integrity for consequentialists

Paul_Christiano 14 Nov 2016 20:56 UTC

172 points

18 comments · 8 min read · EA link

Paul_Christiano 6 Jan 2024 18:48 UTC
118 points
20 ∶ 5
in reply to: lilly’s comment on: Survey of 2,778 AI authors: six parts in pictures
Quantitatively how large do you think the non-response bias might be? Do you have some experience or evidence in this area that would help estimate the effect size? I don’t have much to go on, so I’d definitely welcome pointers.
Let’s consider the 40% of people who put a 10% probability on extinction or similarly bad outcomes (which seems like what you are focusing on). Perhaps you are worried about something like: researchers concerned about risk might be 3x more likely to answer the survey than those who aren’t concerned about risk, and so in fact only 20% of people assign a 10% probability, not the 40% suggested by the survey.
Changing from 40% to 20% would be a significant revision of the results, but honestly that’s probably comparable to other sources of error and I’m not sure you should be trying to make that precise an inference.
But more importantly a 3x selection effect seems implausibly large to me. The survey was presented as being about “progress in AI” and there’s not an obvious mechanism for huge selection effects on these questions. I haven’t seen literature that would help estimate the effect size, but based on a general sense of correlation sizes in other domains I’d be pretty surprised by getting a 3x or even 2x selection effect based on this kind of indirect association. (A 2x effect on response rate based on views about risks seems to imply a very serious piranha problem)
The largest demographic selection effects were that some groups (e.g. academia vs industry, junior vs senior authors) were about 1.5x more likely to fill out the survey. Those small selection effects seem more like what I’d expect and are around where I’d set the prior (so: 40% being concerned might really be 30% or 50%).
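The arithmetic behind these numbers can be sketched as follows. This is a minimal illustration, assuming the only distortion is a single multiplicative difference in response rates between concerned and unconcerned researchers; the function name `true_share` and its parameters are my own labels, not anything from the survey.

```python
def true_share(observed_share: float, response_ratio: float) -> float:
    """Back out the population share of 'concerned' researchers.

    observed_share: share of survey respondents who are concerned.
    response_ratio: how many times more likely a concerned researcher is
        to respond than an unconcerned one (values below 1 mean less likely).
    """
    # Selection multiplies the odds of being concerned by response_ratio,
    # so dividing the observed odds by it recovers the population odds.
    observed_odds = observed_share / (1.0 - observed_share)
    population_odds = observed_odds / response_ratio
    return population_odds / (1.0 + population_odds)


for ratio in (3.0, 2.0, 1.5, 1.0, 1 / 1.5):
    share = true_share(0.40, ratio)
    print(f"response ratio {ratio:.2f}x -> true share {share:.1%}")
```

Under this simple model a 3x effect turns the observed 40% into roughly 18%, while a 1.5x effect in either direction gives roughly 31% or 50%, which matches the rough "20%" and "30% or 50%" figures above.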
many AI researchers just don’t seem too concerned about the risks posed by AI, so may not have opened the survey … the loaded nature of the content of the survey (meaning bias is especially likely),
I think the survey was described as about “progress in AI” (and mostly concerned progress in AI), and this seems like all people saw when deciding to take it. Once people started taking the survey it looks like there was negligible non-response at the question level. You can see the first page of the survey here, which I assume is representative of what people saw when deciding to take the survey.
I’m not sure if this was just a misunderstanding of the way the survey was framed. Or perhaps you think people have seen reporting on the survey in previous years and are aware that the question on risks attracted a lot of public attention, and therefore are much more likely to fill out the survey if they think risk is large? (But I think the mechanism and sign here are kind of unclear.)
especially when you account for the fact that it’s extremely unlikely other large surveys are compensating participants anywhere close to this well
If compensation is a significant part of why participants take the survey, then I think it lowers the scope for selection bias based on views (though increases the chances that e.g. academics or junior employees are more likely to respond).
I can see how other researchers citing these kinds of results (as I have!) may serve a useful rhetorical function, given readers of work that cites this work are unlikely to review the references closely
I think it’s dishonest to cite work that you think doesn’t provide evidence. That’s even more true if you think readers won’t review the citations for themselves. In my view the 15% response rate doesn’t undermine the bottom line conclusions very seriously, but if your views about non-response mean the survey isn’t evidence then I think you definitely shouldn’t cite it.
the fact that such a broad group of people were surveyed that it’s hard to imagine they’re all actually “experts” (let alone have relevant expertise),
I think the goal was to survey researchers in machine learning, and so it was sent to researchers who publish in the top venues in machine learning. I don’t think “expert” was meant to imply that these respondents had e.g. some kind of particular expertise about risk. In fact the preprint emphasizes that very few of the respondents have thought at length about the long-term impacts of AI.
Given my aforementioned concerns, I wonder whether the cost of this survey can be justified
I think it can easily be justified. This survey covers a set of extremely important questions, where policy decisions have trillions of dollars of value at stake and the views of the community of experts are frequently cited in policy discussions.
You didn’t make your concerns about selection bias quantitative, but I’m skeptical about quantitatively how much they decrease the value of information. And even if we think non-response is fatal for some purposes, it doesn’t interfere as much with comparisons across questions (e.g. what tasks do people expect to be accomplished sooner or later, what risks do they take more or less seriously) or for observing how the views of the community change with time.
I think there are many ways in which the survey could be improved, and it would be worth spending additional labor to make those improvements. I agree that sending a survey to a smaller group of recipients with larger compensation could be a good way to measure the effects of non-response bias (and might be more respectful of the research community’s time).
I am not inclined to update very much on what AI researchers in general think about AI risk on the basis of this survey
I think the main takeaway w.r.t. risk is that typical researchers in ML (like most of the public) have not thought about impacts of AI very seriously but their intuitive reaction is that a range of negative outcomes are plausible. They are particularly concerned about some impacts (like misinformation), particularly unconcerned about others (like loss of meaning), and are more ambivalent about others (like loss of control).
I think this kind of “haven’t thought about it” is a much larger complication for interpreting the results of the survey, although I think it’s fine as long as you bear it in mind. (I think ML researchers who have thought about the issue in detail tend if anything to be somewhat more concerned than the survey respondents.)
many AI researchers just don’t seem too concerned about the risks posed by AI
My impressions of academic opinion have been broadly consistent with these survey results. I agree there is large variation and that many AI researchers are extremely skeptical about risk.

Hiring engineers and researchers to help align GPT-3

Paul_Christiano 1 Oct 2020 18:52 UTC

107 points

19 comments · 3 min read · EA link

Paul_Christiano 12 Jul 2019 16:37 UTC
95 points
1 ∶ 0
on: Age-Weighted Voting
I like the goal of politically empowering future people. Here’s another policy with the same goal:
- Run periodic surveys with retrospective evaluations of policy. For example, each year I can pick some policy decisions from {10, 20, 30} years ago and ask “Was this policy a mistake?”, “Did we do too much, or too little?”, and so on.
- Subsidize liquid prediction markets about the results of these surveys in all future years. For example, we can bet on how people in 2045 will answer “Did we do too much or too little about climate change in 2015-2025?”
- We will get to see market odds on what people in 10, 20, or 30 years will say about our current policy decisions. For example, people arguing against a policy can cite facts like “The market expects that in 20 years we will consider this policy to have been a mistake.”
This seems particularly politically feasible; a philanthropist can unilaterally set this up for a few million dollars of surveys and prediction market subsidies. You could start by running this kind of poll a few times; then opening a prediction market on next year’s poll about policy decisions from a few decades ago; then lengthening the time horizon.
(I’d personally expect this to have a larger impact on future-orientation of policy, if we imagine it getting a fraction of the public buy-in that would be required for changing voting weights.)
What links here?
- MichaelA's comment on Representing future generations in the political process by Tobias_Baumann (27 Jun 2020 5:40 UTC; 9 points)

Paul_Christiano 14 Nov 2022 16:57 UTC
93 points
18 ∶ 3
in reply to: Aaron C’s comment on: NY Times on the FTX implosion’s impact on EA
I think the point of most non-profit boards is to ensure that donor funds are used effectively to advance the organization’s charitable mission. If that’s the case, then having donor representation on the board seems appropriate. Why would this represent a conflict of interest? My impression is that this is quite common amongst non-profits and is not considered problematic. (Note that Holden is on ARC’s board.)
I’m also not sure this is what the NYT author is objecting to. I think they would be equally unhappy with SBF claiming to have donated a lot, but it secretly went to a DAF he controlled that he could potentially use to have influence later. The problem is more like trying to claim credit for good works despite not having actually given up the influence yet, not a COI issue.
(I don’t think it’s plausible to call “I gave my money to a foundation or DAF, and then I make 100% of the calls about how the foundation donates” a COI issue.)
What links here?
- Dancer's comment on CEA/EV + OP + RP should engage an independent investigator to determine whether key figures in EA knew about the (likely) fraud at FTX by Tyrone-Jay Barugh (14 Nov 2022 22:00 UTC; 4 points)

Paul_Christiano 6 Jun 2023 18:09 UTC
91 points
17 ∶ 1
on: Transformative AGI by 2043 is <1% likely
I don’t think I understand the structure of this estimate, or else I might understand and just be skeptical of it. Here are some quick questions and points of skepticism.
Starting from the top, you say:
We estimate optimistically that there is a 60% chance that all the fundamental algorithmic improvements needed for AGI will be developed on a suitable timeline.
This section appears to be an estimate of all-things-considered feasibility of transformative AI, and draws extensively on evidence about how lots of things go wrong in practice when implementing complicated projects. But then in subsequent sections you talk about how even if we “succeed” at this step there is still a significant probability of failing because the algorithms don’t work in a realistic amount of time.
Can you say what exactly you are assigning a 60% probability to, and why it’s getting multiplied with ten other factors? Are you saying that there is a 40% chance that by 2043 AI algorithms couldn’t yield AGI no matter how much serial time and compute they had available? (It seems surprising to claim that even by 2023!) Presumably not that, but what exactly are you giving a 60% chance?
(ETA: after reading later sections more carefully I think you might be saying 60% chance that our software is about as good as nature’s, and maybe implicitly assuming there is a ~0% chance of being significantly better than that or building TAI without that? I’m not sure if that’s right though, if so it’s a huge point of methodological disagreement. I’ll return to this point later.)
In section 2 you say:
Transformative AGI by 2043 depends critically on the development of non-sequential reinforcement learning training methods with no real human analogue.
And give this a 40% probability. I don’t think I understand this claim or its justification. (This is related to my uncertainty about what your “60%” in the last section was referring to.)
It seems to me that if you had human-like learning you would be able to produce transformative AGI by 2043:
1. In fact it looks like human-like learning would enable AI to learn human-level physical skills:
  1. 10 years is sufficient for humans to learn most physical skills from scratch, and you are talking about 20 year timelines. So why is the serial time for learning even a candidate blocker?
  2. Humans learn new physical skills (including e.g. operating unfamiliar machinery) within tens of hours. This requires transfer from other things humans have learned, but those tasks are not always closely related (e.g. I learn to drive a car based on experience walking) and AI systems will have access to transfer from tasks that seem if anything more similar (e.g. prediction of the relevant physical environments, predictions of expert behavior in similar domains, closed-loop behavior in a wide range of simulated environments, closed-loop behavior on physical tasks with shorter timescales, behavior in virtual environments...).
  3. We can easily run tens of thousands of copies of AI systems in parallel. Existing RL is massively parallelizable. Human evolution gives no evidence about the difficulty of parallelizing learning in this way. Based on observations of human learning it seems extremely likely to me that parallelization 10,000 fold can reduce serial time by at least 10x (which is all that is needed). Extrapolations of existing RL algorithms seem to suggest serial requirements more like 10,000 episodes, with almost all of the compute used to run a massive number of episodes in parallel, which would be 1 year even for a 1-hour task. It seems hard to construct physical tasks that don’t provide rich feedback after even shorter horizons than 1 hour (and therefore suitable for a gradient descent step given enough parallel samples) so this seems pretty conservative.
2. Regardless of learning physical tasks, humans are able to learn to do R&D after 20 years of experience. AI systems operate at 10x speed and most environments relevant to hardware and software R&D can be sped up by at least 10x. So it seems like AI systems could be human-level at a wide range of tasks, sufficient to accelerate further AI progress, even if they just used non-parallelized human learning over 2 years. If you really thought physical tasks were somehow impossibly difficult (which I don’t think is justified) then this becomes the dominant path to AGI. This is particularly important because multiple of your later points also seem to rest on the distinctive difficulty of automating physical tasks, which should just shift your probability further and further to an explosion of automated R&D which drives automation of physical labor.
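The serial-time figures in the list above are simple arithmetic. Here is a minimal sketch, using only the numbers quoted in the list; the helper `serial_years` and the variable names are illustrative labels of mine.

```python
HOURS_PER_YEAR = 24 * 365


def serial_years(episodes: int, hours_per_episode: float) -> float:
    """Wall-clock years needed to run `episodes` strictly one after another."""
    return episodes * hours_per_episode / HOURS_PER_YEAR


# Point 1.3: about 10,000 serial episodes of a 1-hour task.
rl_years = serial_years(10_000, 1.0)

# Points 1.1 and 1.3: 10 years of human-style learning, with parallelism
# cutting serial time by the 10x that the argument says is needed.
parallel_years = 10 / 10

# Point 2: 20 years of human experience replayed at 10x speed.
rd_years = 20 / 10

print(f"10,000 serial 1-hour episodes: {rl_years:.2f} years")
print(f"10 human-years with a 10x serial reduction: {parallel_years:.1f} years")
print(f"20 human-years at 10x speed: {rd_years:.1f} years")
```

The first figure comes out a little over one year (10,000 hours is about 1.14 years of continuous running), consistent with the "1 year even for a 1-hour task" estimate.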
I think you are disagreeing with these claims, but I’m not sure about that. For example, you mention parallelizable learning but seem to give it <10% probability despite the fact that it is the overwhelmingly dominant paradigm in current practice and you don’t say anything about why it might not work.
(This isn’t super relevant to my mainline view, since in fact I think AI is much worse at learning quickly than humans and will likely be transformative way before reaching parity. This is related to the general point about being unnecessarily conjunctive, but here I’m just trying to understand and express disagreement with the particular path you lay out and the probabilities you assign.)
In section 3 you say:
Software and hardware efficiencies combine to surpass current computation cost efficiency, and/or the efficiency of the human brain, by at least five orders of magnitude.
I think you claim that each synapse firing event requires about 1-10 million floating point operations (with some error bars), and that there is only a 16% chance that computers will be able to do enough compute for $25/hour.
This is probably the part of the report I am most skeptical of:
- How do you square this with our experience in AI so far? Overall you seem to think it is possible that AI will be as effective as brains but unlikely to be much better. But if a biological neuron is probably ten million times more efficient than an artificial neuron, then aren’t we already much better than biology in tons of domains? Is there any task for which performance can be quantified and where you think this estimate provides a sane guideline to the inference-time compute required to solve the task? Shouldn’t you be putting significant probability on our algorithms being radically better than biology in many important ways?
  - Replicating the human visual cortex should take millions of times more compute than we have ever used, yet we can match human performance on a range of quantifiable perceptual tasks and are making rapid progress, and I’m actually not aware of tasks where it’s even plausible that we are 6 orders of magnitude away.
  - Learned policies for robotic control using only hundreds of thousands of neurons already seem to reach comparable competence to insects, but you should expect it to be significantly worse than a nematode. Aren’t you surprised to observe successful grasping and walking?
  - Traditional control systems like those used by Boston Dynamics seem to produce more competent motor control than small animals despite using amounts of compute close to 1 flop per synapse. You focus on ML, but I don’t know why—isn’t classical control a more reasonable point of comparison to small animals that have algorithms designed directly by evolution rather than learned in a lifetime, and doesn’t your argument very strongly predict that it should be impossible?
  - Qualitatively it’s hard to compare GPT-3 to humans, but just to be clear you are saying that it should behave like a brain with ~1000 neurons. This is at least surprising (e.g. I think would have led to big misses if it had been used to make any qualitative predictions), and to me casts doubt on a story where you can’t get transformative AI using less than the analog of a hundred billion neurons.
- Your biological analysis seems to hinge on the assertion that precise simulation of neurons is necessary to get similar levels of computational utility (and even from there the analysis is pretty conservative, e.g. by assuming that you need to perform a very expensive computation thousands of times a second). I don’t personally consider this plausible and I think the main argument given for it is that “if not, why would we have all these proteins?” which I don’t find persuasive (since synapses are under a huge number of important constraints and serve many important functions beyond implementing computationally complex functions at inference time). I’ve seen zero candidates for useful purposes for such an incredible amount of local computation with negligible quantities of long-distance communication, and there are very few examples of human-designed computations structured in this way / it seems to involve an extremely implausible model of what neurons are doing (apparently some nearly-embarrassingly parallelizable task with work concentrated in individual neurons?). I don’t really want to argue with this at length, but want to flag that you are very confident about it and it drives a large part of your estimate whereas something like 50-50 seems more appropriate even before updating on the empirical success of ML.
- In general you seem to be making the case very unnecessarily conjunctive—you are asking how likely it is that we will find algorithms as good as the brain, and then also build computers that operate at the Landauer limit (as you are apparently confident the brain does), and then also deploy AI in a way that is competitive at a $25/hour price point, and so on. But in fact one of these areas can outperform your benchmark (and if you are right in this section, then it’s definitely the case that we are radically more efficient than biology on many tasks already!), and it seems like you are dropping a lot of probability by ignoring that possibility. It’s like asking about the probability that a sum of 5 normal distributions will be above the mean, and estimating it’s 1/2^5 because each of 5 normal distributions needs to be above its mean.
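The normal-distribution analogy in the last bullet can be checked with a quick simulation. This is a sketch under assumptions I am adding for concreteness: the five terms are independent standard normals (mean zero), and the function name `simulate` is hypothetical.

```python
import random


def simulate(trials: int = 200_000, parts: int = 5, seed: int = 0):
    """Estimate P(sum > its mean) and P(every term > its mean)."""
    rng = random.Random(seed)
    sum_above = 0
    all_above = 0
    for _ in range(trials):
        # Each term has mean 0, so the sum also has mean 0.
        draws = [rng.gauss(0.0, 1.0) for _ in range(parts)]
        if sum(draws) > 0.0:
            sum_above += 1
        if all(d > 0.0 for d in draws):
            all_above += 1
    return sum_above / trials, all_above / trials


p_sum, p_all = simulate()
print(f"P(sum above its mean)        ~ {p_sum:.3f} (exact: 0.5)")
print(f"P(every term above its mean) ~ {p_all:.3f} (exact: 1/32 = {1 / 32:.3f})")
```

The sum clears its mean about half the time, while requiring every term to clear its own mean happens only about 3% of the time, because a shortfall in one term can be offset by a surplus in another. Treating the outcome as a conjunction understates the probability by a factor of 16 in this toy case.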
(ETA: this criticism of section 3 is unfair: you do discuss the prospect of much better than human performance in the 2-page section “On the computational intensity of AGI,” and indeed this plays a completely central role in your bottom line estimate. But I’m still left wondering what the earlier 60% and 40% (and all the other numbers!) are supposed to represent, given that you are apparently putting all the work of “maybe humans will design efficient algorithms that are as good as the brain” in this section. You also don’t really discuss existing experience, where your estimates already appear to be many orders of magnitude off in domains where it is easiest to make comparisons between biology and ML (like vision or classical control) and where I don’t see how to argue we aren’t already 1000x better than biology using your 10 million flops per synapse number. Aside from me disagreeing with your mean, you describe these as conservative error bars since they put 20% probability on 1000x improvements over biology, but I think that’s really not the case given that it includes uncertainty about the useful compute done by the brain (where you already disagree by >>3 OOMs with plausible estimates) as well as algorithmic progress (where 1000x improvements over 20 years seem common both within software and ML).)
I’ll stop here rather than going on to sections 4+, though I think I have a lot to object to along similar lines (primarily that the story is being made unreasonably conjunctive).
Overall your estimation strategy looks crazy to me and I’m skeptical of the implicit claim that this kind of methodology would perform well in historical examples. That said, if this sort of methodology actually does work well in practice then I think that trumps some a priori speculation and would be an important thing for me to really absorb. Your personal forecasting successes seem like a big part of the evidence for that, so it might be helpful to understand what kinds of predictions were involved and how methodologically analogous they are. Superficially it looks like the SciCast technology forecasting tournament is by far the most relevant; is there a pointer to the list of questions (other info like participants and list of predictions would also be awesome if available)? Or do you think one of the other items is more relevant?

ARC is hiring alignment theory researchers

Paul_Christiano 14 Dec 2021 20:17 UTC

89 points

4 comments · 2 min read · EA link

Altruistic equity allocation

Paul_Christiano 16 Oct 2019 5:54 UTC

85 points

5 comments · 7 min read · EA link

Paul_Christiano 8 Apr 2023 20:38 UTC
78 points
22 ∶ 1
on: GPTs are Predictors, not Imitators
I agree that it’s best to think of GPT as a predictor, to expect it to think in ways very unlike humans, and to expect it to become much smarter than a human in the limit.
That said, there’s an important further question that isn’t determined by the loss function alone—does the model do its most useful cognition in order to predict what a human would say, or via predicting what a human would say?
To illustrate, we can imagine asking the model to either (i) predict the outcome of a news story, (ii) predict a human thinking step-by-step about what will happen next in a news story. To the extent that (ii) is smarter than (i), it indicates that some significant part of the model’s cognitive ability is causally downstream of “predict what a human would say next,” rather than being causally upstream of it. The model has learned to copy useful cognitive steps performed by humans, which produce correct conclusions when executed by the model for the same reasons they produce correct conclusions when executed by humans.
(In fact (i) is smarter than (ii) in some ways, because the model has a lot of tacit knowledge about news stories that humans lack, but (ii) is smarter than (i) in other ways, and in general having models imitate human cognitive steps seems like the most useful way to apply them to most economically relevant tasks.)
Of course in the limit it’s overdetermined that the model will be smart in order to predict what a human would say, and will have no use for copying along with the human’s steps except insofar as this gives it (a tiny bit of) additional compute. But I would expect AI to be transformative well before approaching that limit, so that this will remain an empirical question.
GPT-4 is still not as smart as a human in many ways, but it’s naked mathematical truth that the task GPTs are being trained on is harder than being an actual human.
I don’t think this is totally meaningful. Getting perfect loss on the task of being GPT-4 is obviously much harder than being a human, and so gradient descent on its loss could produce wildly superhuman systems. But:
- Given that you can just keep doing better and better essentially indefinitely, and that GPT is not anywhere near the upper limit, talking about the difficulty of the task isn’t super meaningful.
- To the extent that GPT-4 and humans are both optimizing a loss function, getting a nearly perfect genetic fitness is probably harder than getting a nearly perfect log loss.
- Getting a GPT-4 level loss on GPT-4’s task is probably much easier than getting a human-level loss on the human task.
What links here?
- The case for ensuring that powerful AIs are controlled by ryan_greenblatt (LessWrong; 24 Jan 2024 16:11 UTC; 238 points)
- AI #7: Free Agency by Zvi (LessWrong; 13 Apr 2023 16:20 UTC; 33 points)

Paul_Christiano 15 Nov 2022 21:54 UTC
77 points
10 ∶ 3
on: We must be very clear: fraud in the service of effective altruism is unacceptable
Earlier this year ARC received a grant for $1.25M from the FTX foundation. We now believe that this money morally (if not legally) belongs to FTX customers or creditors, so we intend to return $1.25M to them.
It may not be clear how to do this responsibly for some time depending on how bankruptcy proceedings evolve, and if unexpected revelations change the situation (e.g. if customers and creditors are unexpectedly made whole) then we may change our decision. We’ll post an update here when we have a more concrete picture; in the meantime we will set aside the money and not spend it.
We feel this is a particularly straightforward decision for ARC because we haven’t spent most of the money and have other supporters happy to fill our funding gap. I think the moral question is more complex for organizations that have already spent the money, especially on projects that they wouldn’t have done if not for FTX, and who have less clear prospects for fundraising.
(Also posted on our website.)
What links here?
- Erich_Grunewald's comment on If you received FTX grant money you should return it by Thurgood (28 Nov 2022 21:51 UTC; 3 points)

Paul_Christiano 7 Sep 2020 16:27 UTC
69 points
0 ∶ 0
on: Does Economic History Point Toward a Singularity?
This would be an important update for me, so I’m excited to see people looking into it and to spend more time thinking about it myself.
High-level summary of my current take on your document:
- I agree that the 1AD-1500AD population data seems super noisy.
- Removing that data removes one of the datapoints supporting continuous acceleration (the acceleration between 10kBC − 1AD and 1AD-1500AD) and should make us more uncertain in general.
- It doesn’t have much net effect on my attitude towards continuous acceleration vs discontinuous jumps; it mostly pushes us back towards our prior.
- I’m not very moved by the other evidence/arguments in your doc.
Here’s how I would summarize the evidence in your document:
- Much historical data is made up (often informed by the author’s models of population dynamics), so we can’t use it to estimate historical growth. This seems like the key point.
- In particular, although standard estimates of growth from 1AD to 1500AD are significantly faster than growth between 10kBC and 1AD, those estimates are sensitive to factor-of-1.5 error in estimates of 1AD population, and real errors could easily be much larger than that.
- Population levels are very noisy (in addition to population measurement being noisy) making it even harder to estimate rates.
- Radiocarbon data often displays isolated periods of rapid growth from 10,000BC to 1AD and it’s possible that average growth rates were something like 2000 year doubling. So even if 500-2000 year doubling times are accurate from 1AD to 1500, those may not be a deviation from the preceding period.
- You haven’t looked into the claims people have made about growth from 100kya to 10kya, but given what we know about measurement error from 10kya to now, it seems like the 100kya-10kya data is likely to be way too noisy to say anything about.
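To see why a factor-of-1.5 error in the 1AD population matters so much, here is a minimal sketch. The population figures are hypothetical round numbers I picked purely for illustration (they are not taken from the document or from any particular dataset), and `doubling_time` is an illustrative helper that assumes constant exponential growth within each period.

```python
import math


def doubling_time(pop_start: float, pop_end: float, years: float) -> float:
    """Years per doubling, assuming constant exponential growth."""
    return years * math.log(2) / math.log(pop_end / pop_start)


# Hypothetical populations in millions, for illustration only.
POP_10K_BC = 4.0
POP_1500_AD = 450.0
BASELINE_1AD = 200.0

results = {}
for label, pop_1ad in (("baseline", BASELINE_1AD), ("1.5x higher", BASELINE_1AD * 1.5)):
    early = doubling_time(POP_10K_BC, pop_1ad, 10_000)  # 10kBC to 1AD
    late = doubling_time(pop_1ad, POP_1500_AD, 1_500)   # 1AD to 1500AD
    results[label] = (early, late)
    print(f"{label}: doubling every {early:.0f} yr before 1AD, {late:.0f} yr after")
```

With these made-up inputs, the baseline shows faster growth after 1AD (doubling roughly every 1,280 years versus 1,770 before), but raising the 1AD estimate by 1.5x reverses the comparison (roughly 2,560 versus 1,610). The sign of the apparent acceleration depends on an error of a size the summary above treats as plausible.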
Here’s my take in more detail:
- You are basically comparing “Series of 3 exponentials” to a hyperbolic growth model. I think our default simple hyperbolic growth model should be the one in David Roodman’s report (blog post), so I’m going to think about this argument as comparing Roodman’s model to a series of 3 noisy exponentials. In your doc you often dunk on an extremely low-noise version of hyperbolic growth but I’m mostly ignoring that because I absolutely agree that population dynamics are very noisy.
- It feels like you think 3 exponentials is the higher prior model. But this model has many more parameters to fit the data, and even ignoring that, “X changes in 2 discontinuous jumps” doesn’t seem like it has a higher prior than “X goes up continuously but stochastically.” I think the only reason we are taking 3 exponentials seriously is because of the same kind of guesswork you are dismissive of, namely that people have a folk sense that the industrial and agricultural revolutions were discrete changes. If we think those folk senses are unreliable, I think that continuous acceleration has the better prior. And at the very least we need to be careful about using all the extra parameters in the 3-exponentials model, since a model with 2x more parameters should fit the data much better.
- On top of that, the post-1500 data is fit terribly by the “3 exponentials” model. Given that continuous acceleration very clearly applies in the only regime where we have data you consider reliable, and given that it already seemed simpler and more motivated, it seems pretty clear to me that it should have the higher prior, and the only reason to doubt that is because of growth folklore. You can’t have it both ways in using growth folklore to promote this hypothesis to attention and then dismissing the evidence from growth folklore because it’s folklore.
- On the acceleration model, the periods from 1500-2000, 10kBC-1500, and “the beginning of history to 10kBC” are roughly equally important data (and if that hypothesis has higher prior I don’t think you can reject that framing). Changes within 10kBC − 1500 are maybe 1/6 of the evidence, and 1/3 of the relevant evidence for comparing “continuous acceleration” to “3 exponentials.” I still think it’s great to dig into one of these periods, but I don’t think it’s misleading to present this period as only 1/3 of the data on a graph.
- (Enough about priors, onto the data.)
- I think that the key claim is that the 1AD-1500AD data is mostly unreliable. Without this data, we have very little information about acceleration from 10kBC − 1500AD, since the main thing we actually knew was that 1AD-1500AD must have been faster than the preceding 10k years. I’d like to look into that more, but it looks super plausible to me that the noise is 2x or more for 1AD which is enough to totally kill any inference about growth rates. So provisionally I’m inclined to accept your view there.
- That basically removes 1 datapoint for the continuous acceleration story and I totally agree it should leave us more uncertain about what’s going on. That said, throwing out all the numbers from that period also removes one of the main quantitative datapoints against continuous acceleration [ETA: the other big one being the modern “great stagnation,” both of these are in the tails of the continuous acceleration story and are just in the middle of the constant exponentials in the 3-exponential story, though see Robin Hanson’s writeup to get a sense for what the series of exponentials view actually ends up looking like—it’s still surprised by the great stagnation], and comes much closer to leaving us with our priors + the obvious acceleration over longer periods + the obvious acceleration during the shorter period where we actually have data, which seem to all basically point in the same direction.
- Even taking the radiocarbon data as given I don’t agree with the conclusions you are drawing from that data. It feels like in each case you are saying “a 2-exponential model fits fine” but the 2 exponentials are always different. The actual events (either technological developments or climate change or population dynamics) that are being pointed to as pivotal aren’t the same across the different time series and so I think we should just be analyzing these without reference to those events (no suggestive dotted lines :) ). I spent some time doing this kind of curve fitting to various stochastic growth models and this basically looks to me like what individual realizations look like from such models—the extra parameters in “splice together two unrelated curves” let you get fine-looking fits even when we know that the underlying dynamics are continuous+stochastic.
- I currently don’t trust the population data coming from the radiocarbon dating. My current expectation is that after a deep dive I would not end up trusting the radiocarbon dating at all for tracking changes in the rate of population growth when the populations in question are changing how they live and what kinds of artifacts they make (from my perspective, that’s what happened with the genetics data, which wasn’t caveated so aggressively in the initial draft I reviewed). I’d love to hear from someone who actually knows about these techniques or has done a deep dive on these papers though.
- I think the only dataset that you should expect to provide evidence on its own is the China population time series. But even there if you just take rolling averages and allow for a reasonable level of noise I think the continuous acceleration story looks fine. E.g. I think if you compare David Roodman’s model with the piecewise exponential model (both augmented with measurement noise, and allowing you to choose noisy dynamics however you want for the exponential model), Roodman’s model is going to fit the data better despite having fewer free parameters. If that’s the case, I don’t think this time series can be construed as evidence against that model.
- I agree with the point that if growth is 0 before the agricultural revolution, rather than “small,” then that would undermine the continuous acceleration story. I think prior growth was probably slow but non-zero, and this document didn’t really update my view on that question.
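The curve-fitting point above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the models used in the comment or in Roodman's work: the growth law `a * y**b`, the noise level, and every parameter value are hypothetical choices made for illustration. It generates one realization of noisy, continuously accelerating growth, then compares a single exponential (one straight line in log space) against the best splice of two unrelated exponentials.

```python
import math
import random


def simulate_log_output(steps=200, a=0.004, b=0.5, noise=0.02, seed=0):
    """One realization of log-output under noisy continuous acceleration.

    The per-step growth rate is a * y**b, so it rises smoothly with output y;
    Gaussian noise is added in log space. All parameter values are
    illustrative assumptions, not estimates from any dataset.
    """
    rng = random.Random(seed)
    y = 1.0
    logs = []
    for _ in range(steps):
        logs.append(math.log(y))
        y *= math.exp(a * y ** b + noise * rng.gauss(0.0, 1.0))
    return logs


def line_sse(xs, ys):
    """Sum of squared residuals of the least-squares line through (xs, ys)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


def best_two_piece(xs, ys, min_len=5):
    """Return (sse, breakpoint) for the best splice of two independent lines."""
    return min(
        (line_sse(xs[:k], ys[:k]) + line_sse(xs[k:], ys[k:]), k)
        for k in range(min_len, len(xs) - min_len + 1)
    )


if __name__ == "__main__":
    for seed in range(3):
        ys = simulate_log_output(seed=seed)
        xs = list(range(len(ys)))
        one = line_sse(xs, ys)
        two, k = best_two_piece(xs, ys)
        print(f"seed={seed}  one exponential SSE={one:.3f}  "
              f"two exponentials SSE={two:.3f}  break at t={k}")
```

Because the two spliced lines are fit independently, their combined error can never exceed that of the single line, whatever process generated the data. Running several seeds shows how much the fit improves and where the chosen breakpoint lands for each realization, even though the simulated dynamics contain no discrete transition.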

Paul_Christiano 26 Nov 2023 17:19 UTC
66 points
7 ∶ 0
on: Paper out now on creatine and cognitive performance
My understanding of the results: for the preregistered tasks you measured effects of 1 IQ point (for RAPM) and 2.5 IQ points (for BDS), with a standard error of ~2 IQ points. This gives weak evidence in favor of a small effect, and strong evidence against a large effect.
You weren’t able to measure a difference between vegetarians and omnivores. For the exploratory cognitive tasks you found no effect. (I don’t know if you’d expect those tests to be sensitive enough to notice such a small effect.)
At this point it seems a bit unlikely to me that there is a clinically significant effect; maybe I’d bet at 4:1 against the effect being >0.05 SD. That said I still think it would be worthwhile for someone to do a larger study that could detect a 0.1 SD effect, since that would be clinically significant and is very weakly suggested by this data (and would make supplementation worthwhile given how cheap it is).
(See also gwern’s meta-analysis.)
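The "weak evidence for a small effect, strong evidence against a large effect" reading can be checked with a quick calculation. This is a rough sketch, assuming normally distributed estimation error and a flat prior (neither is stated in the comment), and treating 7.5 IQ points (0.5 SD on a 15-point scale) as a hypothetical threshold for a "large" effect.

```python
from statistics import NormalDist

STD_ERR = 2.0  # approximate standard error in IQ points, as quoted above
ESTIMATES = {"RAPM": 1.0, "BDS": 2.5}  # point estimates in IQ points
LARGE_EFFECT = 7.5  # assumed threshold: 0.5 SD on a 15-point IQ scale


def prob_effect_exceeds(estimate, std_err, threshold):
    """P(true effect > threshold), assuming normal error and a flat prior."""
    return 1.0 - NormalDist(mu=estimate, sigma=std_err).cdf(threshold)


if __name__ == "__main__":
    for task, est in ESTIMATES.items():
        z = est / STD_ERR  # distance of the estimate from zero, in standard errors
        p_positive = prob_effect_exceeds(est, STD_ERR, 0.0)
        p_large = prob_effect_exceeds(est, STD_ERR, LARGE_EFFECT)
        print(f"{task}: z={z:.2f}  P(effect>0)={p_positive:.3f}  "
              f"P(effect>{LARGE_EFFECT})={p_large:.4f}")
```

Both estimates sit within about one standard error of zero, so they favor a positive effect only mildly, while the assumed large-effect threshold lies several standard errors above either estimate.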

Paul_Christiano 13 Feb 2021 16:57 UTC
64 points
0 ∶ 0
on: [Link post] Are we approaching the singularity?
The relevant section is VII. Summarizing the six empirical tests:
1. You’d expect productivity growth to accelerate as you approach the singularity, but it is slowing.
2. The capital share should approach 100% as you approach the singularity. The share is growing, but at the slow rate of ~0.5%/year. At that rate it would take roughly 100 years to approach 100%.
3. Capital should get very cheap as you approach the singularity. But capital costs (outside of computers) are falling relatively slowly.
4. The total stock of capital should get large as you approach the singularity. In fact the stock of capital is slowly falling relative to output.
5. Information should become an increasingly important part of the capital stock as you approach the singularity. This share is increasing, but will also take >100 years to become dominant.
6. Wage growth should accelerate as you approach the singularity, but it is slowing.
I would group these into two basic classes of evidence:
- We aren’t getting much more productive, but that’s what a singularity is supposed to be all about.
- Capital and IT extrapolations are potentially compatible with a singularity, but only on a timescale of 100+ years.
I’d agree that these seem like two points of evidence against singularity-soon, and I think that if I were going on outside-view economic arguments I’d probably be <50% singularity by 2100. (Though I’d still have a meaningful probability soon, and even at 100 years the prospect of a singularity would be one of the most important facts about the basic shape of the future.)
There are some more detailed aspects of the model that I don’t buy, e.g. the very high share of information capital and persistent slow growth of physical capital. But I don’t think they really affect the bottom line.

On Progress and Prosperity

Paul_Christiano 15 Oct 2014 7:03 UTC

59 points

32 comments 9 min read EA link

Certificates of impact

Paul_Christiano 11 Nov 2014 5:22 UTC

54 points

41 comments 8 min read EA link

Paul_Christiano 15 Sep 2019 22:46 UTC
53 points
0 ∶ 0
on: Are we living at the most influential time in history?
I think the outside view argument for acceleration deserves more weight. Namely:
- Many measures of “output” track each other reasonably closely: how much energy we can harness, how many people we can feed, GDP in modern times, etc.
- Output has grown 7-8 orders of magnitude over human history.
- The rate of growth has itself accelerated by 3-4 orders of magnitude. (And even early human populations would have seemed to grow very fast to an observer watching the prior billion years of life.)
- It’s pretty likely that growth will accelerate by another order of magnitude at some point, given that it’s happened 3-4 times before and faster growth seems possible.
- If growth accelerated by another order of magnitude, a hundred years would be enough time for 9 orders of magnitude of growth (more than has occurred in all of human history).
- Periods of time with more growth seem to have more economic or technological milestones, even if they are less calendar time.
- Heuristics like “the next X years are very short relative to history, so probably not much will happen” seem to have a very bad historical track record when X is enough time for lots of growth to occur, and so it seems like a mistake to call them the “outside view.”
- If we go a century without a doubling of growth rates, it will be (by far) the most that output has ever grown without significant acceleration.
- Data is noisy and data modeling is hard, but it is difficult to construct a model of historical growth that doesn’t have a significant probability of massive growth within a century.
- I think the models that are most conservative about future growth are those where stable growth is punctuated by rapid acceleration during “revolutions” (with the agricultural acceleration around 10,000 years ago and the industrial revolution causing continuous acceleration from 1600-1900).
- On that model human history has had two revolutions, with about two orders of magnitude of growth between them, each of which led to >10x speedup of growth. It seems like we should have a significant probability (certainly >10%) of another revolution occurring within the next order of magnitude of growth, i.e. within the next century.
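The compounding arithmetic behind the "9 orders of magnitude in a hundred years" bullet can be sketched as follows. The comment does not state a baseline growth rate, so the 2% and 3% figures below are assumptions chosen only to bracket plausible modern values.

```python
import math


def orders_of_magnitude(annual_growth, years):
    """Orders of magnitude of total growth from compounding annual_growth."""
    return years * math.log10(1.0 + annual_growth)


def growth_needed(orders, years):
    """Annual growth rate that yields `orders` orders of magnitude in `years`."""
    return 10 ** (orders / years) - 1.0


if __name__ == "__main__":
    # Rate required for 9 orders of magnitude in a century (roughly 23% per year).
    print(f"needed rate: {growth_needed(9, 100):.3f}")

    # Assumed baselines (not from the comment), each accelerated tenfold.
    for baseline in (0.02, 0.03):
        fast = 10 * baseline
        print(f"baseline {baseline:.0%} -> {fast:.0%}: "
              f"{orders_of_magnitude(fast, 100):.1f} orders in 100 years")
```

Under these assumed baselines, a tenfold speedup lands on either side of 9 orders of magnitude per century, which is consistent with the rough figure in the list above.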

Ought: why it matters and ways to help

Paul_Christiano 26 Jul 2019 1:56 UTC

52 points

5 comments 5 min read EA link

Paul_Christiano 29 Nov 2022 2:44 UTC
52 points
8 ∶ 5
on: Why Giving What We Can recommends using expert-led charitable funds
If someone is strongly considering donating to a charitable fund, I think they should usually instead participate in a donor lottery up to say 5-10% of the annual money moved by that fund. If they win, they can spend more time deciding how to give (whether that means giving to the fund that they were considering, giving to a different fund, changing cause areas, supporting a charity directly, participating in a larger lottery, saving in a donor-advised fund, or doing something altogether different).
I’m curious how you feel about that advice. Obviously some donors won’t be comfortable with the idea of a donor lottery and they can continue to give directly. I personally remain very excited about the idea of donor lotteries and think it would be healthy for the EA community to use them more extensively.
For example, I think it would be healthy if funds were accountable to a smaller number of randomly selected donors who had the time to investigate more deeply, rather than to many donors who each spend <10% as much time and are more likely to pick based on a quick skim of fund materials and advertising/social dynamics/etc. And it seems like there’s no way to escape from that regress by having GWWC evaluate evaluators, since then the donor must evaluate GWWC’s evaluations. From this perspective a donor lottery is really like a “free lunch” that’s hard to get in other ways.
Using a fund is similar to using an actively managed investment fund instead of trying to pick individual stocks to invest in: in both cases, you let experts decide what to do with your money. This analogy helps explain the structure of a charitable fund, but it likely understates its benefits.
There is also one major way in which it overstates the benefits: for financial investments it is very valuable to diversify across at least dozens of firms and a few asset classes. Evaluating so many investments would take a huge amount of time, and so even if evaluating individual investments was easier than evaluating funds you’d still probably want to invest in a fund. In contrast, a charitable donor needs to find just one charity that they want to support, and so the case for evaluators really rests on it being easier to evaluate an evaluator than to evaluate a charity.
That comparison is most favorable for organizations like GiveWell, whose main role is to produce reasoning that would clearly be valuable to an individual donor trying to evaluate a charity. But “evaluate funds” vs “evaluate charities” is more apples-to-apples when you are primarily relying on funder judgment, since you could just as well rely on the judgment of people who run the charities they support.
(However the point about charities preferring to engage with fewer big funders is still very relevant and suggests using either a fund or a lottery.)
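The "free lunch" intuition rests on the standard donor lottery mechanism: each participant wins with probability proportional to their contribution, and the winner directs the whole pot. A minimal sketch follows; the donor names and amounts are hypothetical, and the function names are illustrative rather than part of any real lottery's implementation.

```python
import random


def run_lottery(contributions, rng):
    """Pick one winner with probability proportional to contribution.

    Returns (winner, pot), where pot is the total the winner gets to direct.
    """
    names = list(contributions)
    weights = [contributions[name] for name in names]
    winner = rng.choices(names, weights=weights, k=1)[0]
    return winner, sum(weights)


def expected_directed(contributions, name):
    """Expected money `name` directs: win probability times pot size."""
    pot = sum(contributions.values())
    return contributions[name] / pot * pot


if __name__ == "__main__":
    # Hypothetical participants and amounts, for illustration only.
    donors = {"alice": 5_000, "bob": 20_000, "carol": 75_000}
    for name, amount in donors.items():
        print(f"{name}: gives {amount}, expects to direct "
              f"{expected_directed(donors, name):.0f}")
    print("sample draw:", run_lottery(donors, random.Random(0)))
```

Each donor's expected amount directed equals what they put in, so the lottery changes only who does the research: one winner can justify a deep investigation of where to give, instead of every donor doing a shallow one.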

Paul_Christiano 19 Feb 2023 6:51 UTC
35 points
13 ∶ 2
on: Should ChatGPT make us downweight our belief in the consciousness of non-human animals?
It seems reasonable to guess that modern language models aren’t conscious in any morally relevant sense. But it seems odd to use that as the basis for a reductio of arguments about consciousness, given that we know nothing about the consciousness of language models.
Put differently: if a line of reasoning would suggest that language models are conscious, then I feel like the main update should be about consciousness of language models rather than about the validity of the line of reasoning. If you think that e.g. fish are conscious based on analysis of their behavior rather than evolutionary analogies with humans, then I think you should apply the same reasoning to ML systems.
I don’t think that biological brains are plausibly necessary for consciousness. It seems extremely likely to me that a big neural network can in principle be conscious without adding any of these bells or whistles, and it seems clear that SGD could find conscious models.
I don’t think the fact that language models say untrue things shows they have no representation of the world. (In fact, for a pre-trained model that would be a clearly absurd inference: they are trained to predict what someone else would say and then sample from that distribution, which will of course lead to confidently saying false things when the predicted speaker can know things the model does not!)
That all said, I think it’s worth noting and emphasizing that existing language models’ statements about their own consciousness are not evidence that they are conscious, and that more generally the relationship between a language model’s inner life and its utterances is completely unlike the relationship between a human’s inner life and their utterances (because they are trained to produce these utterances by mimicking humans, and they would make similar utterances regardless of whether they are conscious). A careful analysis of how models generalize out of distribution, or of surprisingly high accuracy on some kinds of prediction tasks, could provide evidence of consciousness, but we don’t have that kind of evidence right now.