My own quick takeaway is that it takes an artificial neural network of 5-8 layers with about 1000 neurons in total to simulate a single biological neuron of a certain kind, and that before taking this into account, we’d likely underestimate the computational power of animal brains relative to artificial neural networks, possibly by up to about 1000x.
This does not seem right to me. I haven’t read the paper yet, so maybe I’m totally misunderstanding things, but...
The bio anchors framework does not envision us achieving AGI/TAI/etc. by simulating the brain, or even by simulating neurons. Instead, it tries to guesstimate how many artificial neurons or parameters we’d need to achieve similar capabilities to the brain, by looking at how many biological neurons or synapses are used in the brain, and then adding a few orders of magnitude of error bars. See the Carlsmith report, especially the conclusion summary diagram. Obviously if we actually wanted to simulate the brain we’d need to do something more sophisticated than just use 1 artificial neuron per biological neuron. For a related post, see this. Anyhow, the point is, this paper seems almost completely irrelevant to the bio anchors framework, because we knew already (and other papers had shown) that if we wanted to simulate a neuron it would take more than just one artificial neuron.
Assuming I’m wrong about point #1, I think the calculation would be more complex than just “1000 artificial neurons needed per biological neuron, so +3 OOMs to bio anchors framework.” Most of the computation in the bio anchors calculation comes from synapses, not neurons. Here’s an attempt at how the revised calculation might go:
Currently Carlsmith’s median estimate is 10^15 flop per second. Ajeya’s report guesses that artificial stuff is crappier than bio stuff and so uses 10^16 flop per second as the median instead, IIRC.
There are 10^11 neurons in the brain, and 10^14-10^15 synapses.
If we assume each neuron requires an 8-layer convolutional DNN with 1000 neurons… how many parameters is that? Let’s say it’s 100,000, correct me if I’m wrong.
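As a sanity check on the 100,000 guess, here is the parameter count for an 8-layer fully connected net with about 1000 neurons total. The equal-sized dense layers are my simplification; the paper’s network is a temporal convolutional net, so treat this as an order-of-magnitude check only:

```python
# Rough parameter count: 8 dense layers, ~1000 units total (125 per layer).
layers = 8
units = 1000 // layers              # 125 units per layer
weights = (layers - 1) * units**2   # dense connections between adjacent layers
biases = layers * units             # one bias per unit
print(weights + biases)             # -> 110375, i.e. ~1e5
```

So 100,000 parameters looks like the right order of magnitude under that simplification.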
So then that would be 100,000 flop per period of neuron-simulation.
I can’t access the paper itself, but one of the diagrams says something about one ms of input. So maybe the period length is 1 ms, which means 1000 periods a second, which means 100,000,000 flop per second of neuron-simulation.
This would be a lot more than the cost of simulating the synapses, so we don’t have to bother calculating that.
So our total cost is 10^8 flop per second per neuron times 10^11 neurons = 10^19 flop per second to simulate the brain.
So this means a loose upper bound for the bio anchors framework should be at 10^19, whereas currently Ajeya uses a median of 10^16 with a few OOMs of uncertainty on either side. It also means, insofar as you think my point #1 is wrong and this paper is the last word on the subject, that the median should maybe be 10^19 as well, though that’s less clear. (Plausibly we’ll be able to find more efficient ways to simulate neurons than the dumb 8-layer NN they tried in this paper, shaving an OOM or so off the cost and bringing us back down toward 10^18...)
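Multiplying the guesses through (every input here is a rough guess from this comment, not a figure from the paper):

```python
# Back-of-the-envelope flop estimate, using this comment's guessed inputs.
params_per_neuron_model = 1e5   # guessed size of the 8-layer per-neuron net
flop_per_period = params_per_neuron_model  # ~1 flop per parameter per forward pass
periods_per_second = 1_000                 # assuming a 1 ms simulation period
neurons = 1e11                             # neurons in the brain

flop_per_neuron_per_second = flop_per_period * periods_per_second
total_flop_per_second = flop_per_neuron_per_second * neurons
print(f"{flop_per_neuron_per_second:.0e} flop/s per neuron, "
      f"{total_flop_per_second:.0e} flop/s total")
```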
It’s unclear whether this would lengthen or shorten timelines; I’d like to see the calculations for that. My wild guess is that it would lower the probability of <15 year timelines and also lower the probability of >30 year timelines.
On point 1, my claim is that the paper is evidence that biological neurons are more computationally powerful than artificial ones, not that we’d achieve AGI/TAI by simulating biological brains. I agree that for those who already expected this, the paper wouldn’t be much of an update (well, maybe the actual numbers matter; 1000x seemed pretty high, but is also probably an overestimate).
I also didn’t claim that the timelines based on biological anchors that I linked to would actually be affected by this (I didn’t know either way whether they made any adjustment for it, since I only read summaries and may have skimmed a few parts of the actual report). But that’s a totally reasonable interpretation of what I said, and I should have been more careful to prevent that reading.
What does it mean to say a biological neuron is more computationally powerful than an artificial one? If all it means is that it takes more computation to fully simulate its behavior, then by that standard a leaf falling from a tree is more computationally powerful than my laptop. (This is a genuine question, not a rhetorical one. I do have some sense of what you are saying but it’s fuzzy in my head and I’m wondering if you have a more precise definition that isn’t just “computation required to simulate.” I suspect that the Carlsmith report I linked may have already answered this question and I forgot what it said.)
I would say a biological neuron can compute more complex functions or a wider variety of functions of its inputs than standard artificial neurons in deep learning (linear combination of inputs followed by a nonlinear real-valued function with one argument), and you could approximate functions of interest with fewer biological neurons than artificial ones. Maybe biological neurons have more (useable) degrees of freedom for the same number of input connections.
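For reference, the standard artificial neuron being contrasted here is just a weighted sum of inputs passed through a one-argument nonlinearity (a logistic, in this sketch):

```python
import math

def artificial_neuron(inputs, weights, bias):
    """Standard deep-learning neuron: linear combination of inputs,
    then a real-valued nonlinearity of one argument."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 / (1 + math.exp(-z))  # logistic activation

print(artificial_neuron([1.0, 0.0], [2.0, 2.0], -1.0))  # sigmoid(1) ~ 0.731
```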
I think I get it, thanks! (What follows is my understanding, please correct it if wrong!) The idea is something like: a falling leaf is not a computer; it can’t be repurposed to perform many different useful computations. But a neuron is: depending on the weights of its synapses it can be an AND gate, an OR gate, or various more complicated things. And the paper in the OP is evidence that the range of more complicated useful computations it can do is quite large, which is reason to think that maybe, in the relevant sense, a lot of the brain’s skills have to involve fancy calculations within neurons. (Just because they do doesn’t mean they have to, but if neurons are general-purpose computers capable of doing lots of computations, that seems like evidence that they do, compared to if neurons were more like falling leaves.)
I still haven’t read the paper—does the experiment distinguish between the “it’s a tiny computer” hypothesis vs. the “it’s like a falling leaf—hard to simulate, but not in an interesting way” hypothesis?
Ya, this is what I’m thinking, although “have to” is also a matter of scaling, e.g. a larger brain could accomplish the same with less powerful neurons. There’s also probably a lot of waste in the human brain, even just among the structures most important for reasoning (although the same could end up being true of an AGI/TAI we try to build; we might need a lot of waste before we can prune or make smaller student networks, etc.).
On falling leaves, the authors were just simulating the input and output behaviour of the neurons, not the physics/chemistry/biology (I’m not sure if that’s what you had in mind), but based on the discussion on this post, the 1000x could be very misleading and could mostly go away as you scale to try to simulate a larger biological network, or you could have a similar cost in trying to simulate an artificial neural network with a biological one. They didn’t check for these possibilities (so it could still be in some sense like simulating falling leaves).
Still, 1000x seems high to me for biological neurons not being any more powerful than artificial neurons, although this is pretty much just gut intuition, and I can’t really explain why. Based on the conversations here (with you and others), I think 10x is a reasonable guess.
What I meant by the falling leaf thing: If we wanted to accurately simulate where a leaf would land when dropped from a certain height and angle, it would require a ton of complex computation. But (one can imagine) it’s not necessary for us to do this; for any practical purpose we can just simplify it to a random distribution centered directly below the leaf with variance v.
Similarly (perhaps) if we want to accurately simulate the input-output behavior of a neuron, maybe we need 8 layers of artificial neurons. But maybe in practice if we just simplified it to “It sums up the strength of all the neurons that fired at it in the last period, and then fires with probability p, where p is an s-curve function of the strength sum...” maybe that would work fine for practical purposes—NOT for purpose of accurately reproducing the human brain’s behavior, but for purposes of building an approximately brain-sized artificial neural net that is able to learn and excel at the same tasks.
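A sketch of that simplified model (the logistic curve and its threshold of 1.0 are illustrative guesses, not anything from the paper):

```python
import math
import random

def simple_neuron(input_strengths, rng):
    """Toy point-neuron: sum the incoming strengths for the period, then
    fire with probability p, where p is an s-curve of the summed strength."""
    total = sum(input_strengths)
    p = 1 / (1 + math.exp(-(total - 1.0)))  # s-curve with arbitrary threshold 1.0
    return rng.random() < p                 # fires (True) with probability p

rng = random.Random(0)
print(simple_neuron([5.0, 5.0], rng))   # strongly positive input: almost always fires
print(simple_neuron([-5.0, -5.0], rng)) # strongly negative input: almost never fires
```

The point is just that this is a single cheap update per period, versus ~10^5 flop per period for the 8-layer simulation.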
My original point no. 1 was basically that I don’t see how the experiment conducted in this paper is much evidence against the “simplified model would work fine for practical purposes” hypothesis.
Ya, that’s fair. If this is the case, I might say that the biological neurons don’t have additional useful degrees of freedom for the same number of inputs. The paper didn’t explicitly test for this either way, although, imo, what they did test is weak Bayesian evidence for biological neurons having more useful degrees of freedom: if they could have been simulated with few artificial neurons, we could pretty much have ruled that hypothesis out. Maybe this evidence is too weak to update much on, though, especially if you had a prior that simulating biological neurons would be pretty hard even if they had no additional useful degrees of freedom.
Now I think we are on the same page. Nice! I agree that this is weak Bayesian evidence for the reason you mention; if the experiment had discovered that one artificial neuron could adequately simulate one biological neuron, that would basically put an upper bound on things for purposes of the bio anchors framework (cutting off approximately the top half of Ajeya’s distribution over the required size of an artificial neural net). Instead they found that you need thousands. But (I would say) this is only weak evidence, because prior to hearing about this experiment I would have predicted that it would be difficult to accurately simulate a neuron, just as it’s difficult to accurately simulate a falling leaf. Pretty much everything that happens in biology is complicated and hard to simulate.