I’m a researcher at Forethought; before that, I ran the non-engineering side of the EA Forum (this platform), ran the EA Newsletter, and worked on some other content-related tasks at CEA. [More about the Forum/CEA Online job.]
Selected posts
Background
I finished my undergraduate studies with a double major in mathematics and comparative literature in 2021. I was a research fellow at Rethink Priorities in the summer of 2021 and was then hired by the Events Team at CEA. I later switched to the Online Team. In the past, I’ve also done some (math) research and worked at Canada/USA Mathcamp.
I think this is a good question, and it’s something I sort of wanted to look into but didn’t get to! (If you’re interested, though, I might be able to connect you with some folks who might know more.)
Quick general takes on what private companies might be able to do to make their tools more useful on this front (please note that I’m pretty out of my depth here, so take this with a hefty grain of salt; also, this list isn’t meant to be prioritized or exhaustive):
Some of the vetting/authorization processes (e.g. FedRAMP) are burdensome, and companies sometimes just give up or don’t bother (see footnote 12), which narrows the options available to agencies; going through these processes anyway could be very valuable.
Generally, lowering costs for tech products can make a real difference to whether agencies adopt them. Open-source products may also be likelier to be used(?). (And there are probably other relevant factors: which chips are available, which systems staff are used to, what tradeoffs on speed vs. cost make sense...)
Security is useful/important: e.g. the tool/system can be run locally, can be fine-tuned with sensitive data, etc. (I also expect that things like “we can basically prove that the training/behavior of this model satisfies [certain conditions]” will increasingly matter, with conditions like “not trained on X data” or “could not have been corrupted by any small group of people”; but my understanding/thinking here is very vague!)
Relatedly, I think various properties of the systems’ scaffolding will matter. For example, how well the tool fits with existing and future systems (so modularity and interoperability[1], in general and with common systems, are very useful), and whether the tool can be set up roughly once while still allowing for different forms of access to data (e.g. handling differences in who can see which kinds of data, what info should be logged, etc.).
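To make the “set up once, but allow different forms of access” idea a bit more concrete, here’s a minimal sketch. All names, roles, and data kinds here are hypothetical illustrations (a real deployment would hook into the agency’s actual identity and audit systems): a single gateway filters which kinds of records each role may see, and logs every access.

```python
# Illustrative sketch only: one shared deployment serving users with
# different clearances, rather than a separate setup per access level.
from dataclasses import dataclass, field

# Hypothetical mapping from role to the data kinds that role may read.
ROLE_PERMISSIONS = {
    "analyst": {"public", "internal"},
    "admin": {"public", "internal", "sensitive"},
}

@dataclass
class AccessGateway:
    audit_log: list = field(default_factory=list)

    def fetch(self, role: str, records: list[dict]) -> list[dict]:
        """Return only the records this role may see; log the access."""
        allowed = ROLE_PERMISSIONS.get(role, set())
        visible = [r for r in records if r["kind"] in allowed]
        # Log who asked and how much they got back (not the contents).
        self.audit_log.append({"role": role, "returned": len(visible)})
        return visible

records = [
    {"kind": "public", "text": "press release"},
    {"kind": "sensitive", "text": "case file"},
]
gw = AccessGateway()
analyst_view = gw.fetch("analyst", records)  # only the "public" record
admin_view = gw.fetch("admin", records)      # both records
```

The design point is just that access policy and logging live in one place, so the same installed tool can serve differently-cleared users without per-user redeployment.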
(Note also there’s a pretty huge set of consultancies that focus on helping companies sell to the government, but the frame is quite different.)
And in terms of ~market gaps, I’m again very unsure, but I expect that (unsurprisingly) lower-budget agencies will be especially undersupplied; the DOD, in particular, has a lot more funding and capacity for this kind of thing, so building things for e.g. NIST could make sense. (Although it might be hard to figure out what would be particularly useful for agencies like NIST without actually being at NIST. I haven’t really thought about this!)
I haven’t looked into this at all, but given the prevalence of Microsoft systems (Azure etc.) in the US federal government (which, as far as I know, is greater than in the UK), I wonder whether Microsoft’s relationship with OpenAI explains why we have ChatGPT Gov in the US while Anthropic is collaborating with the UK government: https://www.anthropic.com/news/mou-uk-government