Time-stamping: An urgent, neglected AI safety measure

TL;DR

I believe we should use a 5-digit annual budget to create and serve trust-less, cryptographic timestamps of all public content, in order to significantly counteract the growing threat that AI-generated fake content poses to truth-seeking and trust. We should also encourage and help all organizations to do likewise with private content.

THE PROBLEM & SOLUTION

As the rate and quality of AI-generated content keep increasing, it seems inevitable that creating fake content will become easier and verifying or refuting it harder. Remember the very recent past, when faking a photo was so hard that simply providing a photo was considered proof? If we do nothing about it, these AI advances might have a devastating impact on people's opportunities to trust both each other and historical material, and might end up having negative value for humanity on net.

I believe that trust-less time-stamping is an effective, urgent, tractable and cheap way to partly but significantly counteract this lamentable development. Here's why:

EFFECTIVE

It is likely that fake-creation technology will outpace fake-detection technology. If so, we will end up in an indefinite state of having to doubt pretty much all content. With trust-less time-stamping, however, the contest instead becomes one between the fake-creation technology available at the time of timestamping and the fake-detection technology available at the time of truth-seeking.

Time-stamping everything today will protect all past and current content against suspicion of interference by all future fake-creation technology. As both fake-creation and fake-detection technology progress, no matter at what relative pace, the value of timestamps will grow over time. Perhaps in the not-so-distant future, the timestamped record will become an indispensable historical resource.

URGENT

Need I say much about the pace of progress in AI technology, or the extent of existing content? The value of timestamping everything today rather than in one month is some function of the value of the truth of all historical records and other content, and of the technological development during that time. I suspect there's a multiplication somewhere in that function.

TRACTABLE

We already have the cryptographic technology and infrastructure to make trust-less timestamps. We also have large public archives of digital and/or digitized content, including but not limited to the web. Time-stamping all of it might not be trivial, but it's not particularly hard, and it can even be done without convincing very many people that it needs to be done. For non-public content, adding timestamping as a feature in backup software should be similarly tractable; here the main struggle will probably be convincing users of the value of timestamping.

Implementation: each piece of content is hashed, the hashes are put into a Merkle tree, and the root of that tree is published on several popular, secure, trust-less public ledgers. A proof of timestamp is produced as the list of sibling hashes along the Merkle branch from the content up to the root, together with the ledger transaction IDs. This technology, including implementations, services and public ledgers, already exists. For private content, you might want to be able to prove a timestamp for one piece of content without divulging the existence of any other. To do so, add a bottom level to the Merkle tree where each content hash is hashed together with a pseudo-random value rather than with another content hash. This pseudo-random value can be produced from the content hash itself and a salt that is constant within an organization.
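The scheme above can be sketched in a few dozen lines of Python. This is a minimal illustration, not any particular existing service: the odd-node convention (promoting an unpaired node unchanged) and the exact blinding construction for private content are my own illustrative choices.

```python
import hashlib

def h(data: bytes) -> bytes:
    # SHA-256: a hash function with h = 32 bytes of output.
    return hashlib.sha256(data).digest()

def build_tree(leaves: list[bytes]) -> list[list[bytes]]:
    # Returns all tree levels, leaves first, 1-element root level last.
    # An unpaired node at the end of a level is promoted unchanged (one
    # common convention; real services may pad or duplicate instead).
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur, nxt = levels[-1], []
        for i in range(0, len(cur), 2):
            nxt.append(h(cur[i] + cur[i + 1]) if i + 1 < len(cur) else cur[i])
        levels.append(nxt)
    return levels

def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bool, bytes]]:
    # The "list of hashes along the Merkle branch": each entry is a sibling
    # hash plus a flag saying whether our node is the right-hand child.
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((index % 2 == 1, level[sibling]))
        index //= 2
    return proof

def verify(leaf: bytes, proof: list[tuple[bool, bytes]], root: bytes) -> bool:
    # Recompute the branch from leaf to root; in practice you would also
    # check that `root` appears in the referenced ledger transactions.
    acc = leaf
    for is_right, sibling in proof:
        acc = h(sibling + acc) if is_right else h(acc + sibling)
    return acc == root

def private_leaf(content_hash: bytes, org_salt: bytes) -> bytes:
    # Extra bottom level for private content: hash each content hash with a
    # pseudo-random value derived from the hash and an organization-wide
    # salt, so a proof for one item reveals nothing about its siblings.
    return h(content_hash + h(org_salt + content_hash))

# Usage: hash five documents, publish the root, prove one of them.
docs = [b"doc-%d" % i for i in range(5)]
leaves = [h(d) for d in docs]
levels = build_tree(leaves)
root = levels[-1][0]              # this 32-byte value goes into the ledger
proof = make_proof(levels, 3)
assert verify(leaves[3], proof, root)
```

Note that a proof reveals only O(log n) sibling hashes, so anyone holding a document and its proof can check it against the published root without the archive's cooperation.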

CHEAP

Timestamping n pieces of content comprising a total of b bytes incurs a one-time processing cost of O(b+n), a continuous storage cost of O(n), and a one-time cost of O(1) for transactions on public, immutable ledgers. Perhaps most significant is the storage. Using a hash function with h bytes of output, it's actually possible to store it all in nh bytes, by keeping only the leaf hashes and recomputing the tree on demand. When in active use, you want to be able to produce proofs without excessive processing; in that mode it would be beneficial to have 2nh bytes available, enough to keep the full tree.

Taking archive.org as an example, reasonable values are h=32 and n=7.8*10^11 [1], requiring 2nh ≈ 5.0*10^13 bytes ≈ 50 TB of storage (not TiB, since we're talking pricing). At current HDD prices in the 7.50-14.50 USD/TB range [2,3], that is a few hundred bucks. Add storage redundancy, labor, continuously adding timestamps and serving proofs, etc., and we're still talking about a 5-digit amount yearly.
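As a sanity check of that arithmetic (h and n are the estimates above, and the price range is from [2,3]):

```python
h = 32            # bytes per hash (SHA-256 output)
n = 7.8e11        # estimated pieces of content on archive.org [1]

storage_bytes = 2 * n * h            # 2nh: full tree kept for fast proofs
storage_tb = storage_bytes / 1e12    # decimal TB, to match HDD pricing
print(f"{storage_tb:.1f} TB")        # ~49.9 TB

low, high = 7.50 * storage_tb, 14.50 * storage_tb
print(f"{low:.0f}-{high:.0f} USD")   # ~374-724 USD for one copy, no redundancy
```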

SUMMARY

I have here presented an idea that I believe has a pretty good chance of significantly counteracting some negative consequences of emerging AI technology, using only a 5-digit yearly budget. Beyond effective, tractable and cheap, I have also argued for why I believe it is urgent, in the sense that vast amounts of future public good are lost for every month this is not done. I am not in a position to do this myself, and I failed to raise awareness at HN [4], which arguably wasn't the right forum. It seems an opportunity too good to pass up. Here's hoping to find someone who cares enough to get it done, or to explain why I'm wrong.

Footnotes