Thanks, great post!
You say that “using digital error correction, it would be extremely unlikely that errors would be introduced even across millions or billions of years. (See section 4.2.)” But that’s not entirely obvious to me from section 4.2. I understand that error correction is qualitatively very efficient, as you say, in that the probability of an error being introduced per unit time can be made as low as you like at the cost of only making the string of bits a certain small-seeming multiple longer (and my understanding is that the multiple shrinks as the original string gets longer?). But for any given multiple, there’s some period of time long enough that the probability of faithfully maintaining a string of bits for that long is low. Is there any chance you could offer an estimate of, say, how much longer you’d have to make a petabyte in order to get the probability of an error over a billion years below 1%?
This is a great question. I think the answer depends on the type of storage you’re doing.
If you have a totally static lump of data that you want to encode on a hard drive and not touch for a billion years, I think the challenge is mostly in designing a type of storage unit that won’t age. Digital error correction won’t help if your whole magnetism-based hard drive loses its magnetism. I’m not sure how hard this is.
But I think more realistically, you want to use hardware that you regularly use and regularly service, and where you can copy the information to a new hard drive when one is about to fail. So I’ll answer the question in that context.
As an error rate, let’s use the failure rate of 3.7e-9 per byte per month ~= 1.5e-11 per bit per day from this Stack Overflow reply. (It’s for RAM, which I think is more volatile than e.g. SSD storage, and certainly not optimised for stability, so you could probably get that down a lot.)
Let’s use the following as an error correction method: Each bit is represented by N bits; for any computation the computer does, it will use the majority vote of the N bits; and once per day,[1] each bit is reset to the majority vote of its group of bits.
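Concretely, that scheme might look something like the following toy sketch (my own illustration, not anything from the post; a real implementation would spread the copies across hardware to avoid correlated failures):

```python
N = 5  # physical bits per logical bit

def majority(group):
    """Majority vote over one group of physical bits."""
    return 1 if sum(group) > len(group) // 2 else 0

def encode(logical_bits):
    """Store each logical bit as N identical physical bits."""
    return [[b] * N for b in logical_bits]

def read(groups):
    """Any computation uses the majority vote of each group."""
    return [majority(g) for g in groups]

def daily_reset(groups):
    """Once per day, rewrite every group to its majority value,
    wiping out any minority of flipped bits before they accumulate."""
    return [[majority(g)] * N for g in groups]
```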
If so...
for N=1, the probability that a bit is stable for 1e9 years is ~exp(-1.5e-11*365*1e9)=0.4%. Yikes!
for N=3, the probability that 2 bit flips happen in a single day is ~3*(1.5e-11)^2, and so the probability that a group of bits is stable for 1e9 years is ~exp(-3*(1.5e-11)^2*365*1e9) ≈ 1−2e-10. Much better, but there will probably still be on the order of a million errors in that petabyte of data.
for N=5, the probability that 3 bit flips happen in a single day is ~(5 choose 3)*(1.5e-11)^3, and so the probability that the whole petabyte of data is safe for 1e9 years is ~99.99%. So on this scheme, it seems that 5 petabytes of storage is enough to make 1 petabyte stable for a billion years. (A short script reproducing these numbers follows.)
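For concreteness, here’s a minimal script for those calculations, under the same assumptions as above (independent flips at the RAM rate, a petabyte counted as 8e15 logical bits, and a Poisson-style approximation exp(-rate*time) for survival probabilities):

```python
from math import comb, exp

p = 3.7e-9 / 8 / 30       # ~1.5e-11 flips per bit per day (the RAM figure above)
days = 365 * 1e9           # a billion years, in days
logical_bits = 8e15        # one petabyte

for n in (1, 3, 5):
    k = n // 2 + 1                         # flips in one day needed to defeat the majority vote
    p_fail_day = comb(n, k) * p**k         # ~probability one group is corrupted on a given day
    p_bit_ok = exp(-p_fail_day * days)     # one logical bit surviving a billion years
    p_all_ok = exp(-p_fail_day * days * logical_bits)  # the whole petabyte surviving
    print(f"N={n}: per-bit survival {p_bit_ok:.6g}, whole-petabyte survival {p_all_ok:.3g}")
```

On these assumptions, the N=3 case gives an expected count of roughly 2e6 corrupted logical bits over the billion years, which is where the “million errors” figure above comes from.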
Based on the discussion here, I think the errors in doing the majority-voting calculations are negligible compared to the errors from cosmic rays. At least if you do it cleverly, so that you don’t get too many correlations and ruin your redundancy (which there are ways to do, according to results on error-correcting computation — though I’m not sure whether they require some fixed amount of extra storage space, in which case you might need N somewhat greater than 5).
Now this scheme requires that you have a functioning civilization that can provide electricity for the computer, that can replace the hardware when it starts failing, and stuff — but that’s all things that we wanted to have anyway. And any essential component of that civilization can run on similarly error-corrected hardware.
And to account for larger-scale problems than cosmic rays (e.g. a local earthquake throws a hard drive to the ground and shatters it, or you accidentally erase a file when you were supposed to make a copy of it), you’d probably want backup copies of the petabyte in different places across the Earth, which you replace each time something happens to one of them. If there’s a 0.1% chance of that happening on any one day (corresponding to about once per 3 years, which seems like an overestimate if you’re careful), and you immediately notice it and replace the copy within a day, and you have 5 copies in total, the probability that at least one of them keeps working at all times is ~exp(-(0.001)^5*365*1e9) ~= 99.96%. So combined with the previous factor of 5, that’d be a multiple of 5*5=25.
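As a quick sanity check on that last number (a sketch assuming losses are independent across copies and a destroyed copy really is replaced within a day, so the data is only lost if all copies go on the same day):

```python
from math import exp

p_loss = 0.001        # chance a given copy is destroyed on a given day
copies = 5
days = 365 * 1e9      # a billion years

# Data is lost only if every copy is destroyed on the same day, before replacement.
p_all_lost_one_day = p_loss ** copies
p_survive = exp(-p_all_lost_one_day * days)
print(f"{p_survive:.4f}")   # ~0.9996
```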
This felt enlightening. I’ll add a link to this comment from the doc.
[1] Using a day here rather than an hour or a month isn’t super-motivated. If you reset things very frequently, you might interfere with normal use of the computer, and errors in the resetting operation might start to dominate the errors from cosmic rays. But I think a day should be above the threshold where that’s much of an issue.
Cool, thanks for thinking this through!
This is super speculative of course, but if the future involves competition between different civilizations / value systems, do you think having to devote say 96% (i.e. 24⁄25) of a civilization’s storage capacity to redundancy would significantly weaken its fitness? I guess it would depend on what fraction of total resources are spent on information storage...?
Also, by the same token, even if there is a “singleton” at some relatively early time, mightn’t it prefer to take on a non-negligible risk of value drift later in time if it means being able to, say, 10x its effective storage capacity in the meantime?
(I know your 24⁄25 was a conservative estimate in some ways; on the other hand it only addresses the first billion years, which is arguably only a small fraction of the possible future, so hopefully it’s not too biased a number to anchor on!)
Depends on how much of their data they’d have to back up like this. If every bit ever produced or operated on instead had to be 25 bits — that seems like a big fitness hit. But if they’re only this paranoid about a few crucial files (e.g. the minds of a few decision-makers), then that’s cheap.
And there’s another question about how much stability contributes to fitness. In humans, cancer tends to not be great for fitness. Analogously, it’s possible that most random errors in future civilizations would look less like slowly corrupting values and more like a coordinated whole splintering into squabbling factions that can easily be conquered by a unified enemy. If so, you might think that an institution that cared about stopping value-drift and an institution that didn’t would both have a similarly large interest in preventing random errors.
The counter-argument (to a singleton accepting some risk of value drift in exchange for, say, 10x the effective storage) is that it will be super rich regardless, so it seems like satiable value systems would be happy to spend a lot on preventing really bad events from happening with small probability. Whereas insatiable value systems would notice that most resources are in the cosmos, and so would also be obsessed with avoiding unwanted value drift. But yeah, if the values contain a pure time preference, and/or don’t care that much about the most probable types of value drift, then it’s possible that they wouldn’t deem the investment worth it.