To what extent could this be implemented as an addition to the Internet Archive?
It’s not clear to me that the marginal benefit over the Internet Archive is worth it. Even without published hashes it would still be super surprising for the IA to have backdated content.
This is good criticism, and I’m inclined to agree in part. I do not intend to argue that the marginal value is necessarily great, only that the expected marginal value is much greater than the cost. Here are a few plausible scenarios, each perhaps less than 50% likely, in which the timestamps could have a significant impact on society:
Both Western and Eastern governments have implemented a good portion of the parts of Orwell’s vision that have so far been feasible, in particular mass espionage and forced content deletion. Editing history in a convincing way has so far been less feasible, but AI might change that, and it isn’t clear why we should believe that no government with a valuable public archive in its jurisdiction would contemplate doing so.
A journalist is investigating a corruption scandal with far-reaching consequences. Archives are important tools, but it just so happens that this one archive until recently had an employee who is connected to a suspect...
To prove your innocence, or someone else’s guilt, you need to verify, and be able to prove, what was privately emailed from your organization. Emails are not in the Internet Archive, but luckily your organization uses cryptographic timestamps for all emails and documents.
Preventing these scenarios requires both convincing large organizations to implement cryptographic timestamps and implementing these timestamps on every significant private archive. Even the implementation cannot be done on anything close to a five-figure budget; engineering time is way more expensive.
It’s also very hard to convince middle managers who don’t understand cryptography to publicly post (hashes of) private data for no conceivable benefit.
Good criticism.
My rough budget guess is probably off, as you say. For some reason I just looked at hardware and took a wide margin. For a grant application this has to be ironed out a lot more seriously.
I admit that popularizing the practice for private archives would take a significant effort far beyond a five-figure budget. I envisioned doing this in collaboration with the Internet Archive as a first project to pick the lowest-hanging fruit, and then hopefully it’d be less difficult to convince other archives to follow suit.
It’s worth noting that implementations, commercial services, and public ledgers for timestamping already exist. I imagine scaling, operationalizing, and creating interfaces for consumption would be major parts of the project for the Internet Archive.
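On the scaling point, one common way to keep publishing costs low is to batch many documents into a Merkle tree and publish only the root hash. The following is a minimal sketch under stated assumptions, not a description of what any existing timestamping service or the Internet Archive actually does: the function names (`build_levels`, `inclusion_proof`, `verify`) are hypothetical, SHA-256 is an assumed hash choice, and duplicating the last node on odd levels is a simplification that a production design would likely replace.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(documents: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, from the leaf hashes up to the single root."""
    if not documents:
        raise ValueError("need at least one document")
    # Prefix leaves with 0x00 and inner nodes with 0x01 so that a leaf
    # can never be confused with an inner node.
    level = [h(b"\x00" + doc) for doc in documents]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2 == 1:
            level = level + [level[-1]]  # simplification: duplicate the last node
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def inclusion_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect (sibling hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(document: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from one document and its proof."""
    node = h(b"\x00" + document)
    for sibling, sibling_is_left in proof:
        node = h(b"\x01" + (sibling + node if sibling_is_left else node + sibling))
    return node == root


if __name__ == "__main__":
    snapshots = [b"page one", b"page two", b"page three"]  # illustrative data
    levels = build_levels(snapshots)
    root = levels[-1][0]  # the only value that needs to be published
    print(root.hex())
    print(verify(b"page two", inclusion_proof(levels, 1), root))
```

With this layout, an archive would publish one 32-byte root per batch, and anyone holding a document plus its short proof could later check that the document was part of that batch.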
Assuming that this is both useful and time- or funding-constrained, you could be selective in how you roll it out. Images of world leaders and high-profile public figures seem most likely to be manipulated and would have the highest negative impact if many people were fooled. You could start there.
Thank you. Perhaps the first priority should be to quickly operationalize timestamping of newly created content at news organizations, perhaps even before publication, if they can be convinced to do so.
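To make the pre-publication idea concrete, here is a rough sketch of how a newsroom could commit to an unpublished draft without revealing it. It assumes SHA-256 and a random 32-byte nonce; the nonce is there so that nobody can confirm a guess about a short or predictable draft from the public digest alone. The names `commit` and `check` are hypothetical and do not come from any existing tool.

```python
import hashlib
import secrets


def commit(content: bytes) -> tuple[str, bytes]:
    """Return (digest to publish or timestamp now, nonce to keep private)."""
    nonce = secrets.token_bytes(32)  # fixed length, so nonce + content is unambiguous
    digest = hashlib.sha256(nonce + content).hexdigest()
    return digest, nonce


def check(content: bytes, nonce: bytes, digest: str) -> bool:
    """After publication, anyone given the content and nonce can re-derive the digest."""
    return hashlib.sha256(nonce + content).hexdigest() == digest


if __name__ == "__main__":
    draft = b"Draft article text"  # illustrative content
    digest, nonce = commit(draft)
    print(digest)                       # safe to timestamp publicly before publication
    print(check(draft, nonce, digest))  # later: reveal the draft and nonce to prove it existed
```

The digest itself would still need to be anchored somewhere trustworthy, such as one of the existing timestamping services or public ledgers; this sketch only covers the part that keeps the content private until publication.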
That was my suspicion too. Similarly to how I can pretty easily find photos and video from the past that aren’t photoshopped, I suspect it won’t be all that difficult to collect text either.
Actually, the Internet Archive is achieving exactly this goal unintentionally, for economic reasons: they’re uploading data onto Filecoin because it is so deeply subsidized by the block reward, which has the indirect effect of checkpointing the data in a cryptographically robust way. https://archive.devcon.org/archive/watch/6/universal-access-to-all-knowledge-decentralization-experiments-at-the-internet-archive
That is fantastic. Hopefully this indicates an organization open to ideas, and if they’ve been doing this for a while, it might be worth “rescuing” those timestamps.
It might be advantageous to do so for content that is in the Internet Archive. For content that is not, especially non-public content, it might be more feasible to offer the solution as a public service plus open-source software.