I imagine that scientists will soon have the ability to be unusually transparent and provide incredibly low rates of fraud/bias, using AI. (This assumes strong AI progress in the next 5-20 years.)
AI auditors could track everything (starting with some key things) done for an experiment, then flag significant evidence of deception, stats gaming, etc. For example, a scientist might have an AI record their screen whenever it’s on, while preserving necessary privacy and discarding irrelevant data.
AI auditors could review experimental setups, software, and statistics, and flag any errors they detect (a rough sketch of one such mechanical check is at the end of this comment).
Over time, AI systems will be able to do large parts of the scientific work. We can likely make guarantees about AI-done science that we can’t with humans.
Such systems could hypothetically provide significantly stronger assurances than those argued for by some of the scientific reform communities today (the Open Science movement, for example).
I’ve been interested in this for some of QURI’s work, and would love to see AI-overseen experimentation done in the AI safety world.
Perhaps most important, this could all be good experimentation on our way to systems that will oversee key AI progress. I ultimately want AI auditors for all risky AI development, but some of that will be a harder sell.
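To make the error-flagging point a bit more concrete, here’s a minimal sketch in Python of the kind of purely mechanical check an auditor could run before any fuzzier, judgment-based review: recomputing reported p-values from the reported test statistics, in the spirit of tools like statcheck. The ReportedTest structure, the example numbers, and the tolerance are illustrative assumptions, not a proposed implementation.

```python
# Minimal sketch of one mechanical auditor check: recompute reported p-values
# from reported t statistics and flag inconsistencies (statcheck-style).
# The ReportedTest fields and the tolerance below are illustrative assumptions.
from dataclasses import dataclass

from scipy import stats


@dataclass
class ReportedTest:
    label: str          # where the result appears, e.g. "Table 2, row 3"
    t_stat: float       # reported t statistic
    df: int             # reported degrees of freedom
    p_reported: float   # reported two-tailed p-value


def check_p_values(tests: list[ReportedTest], tol: float = 0.005) -> list[str]:
    """Return a flag for every test whose reported p-value disagrees with
    the p-value implied by its reported t statistic and degrees of freedom."""
    flags = []
    for t in tests:
        p_recomputed = 2 * stats.t.sf(abs(t.t_stat), t.df)  # two-tailed p
        if abs(p_recomputed - t.p_reported) > tol:
            flags.append(
                f"{t.label}: reported p={t.p_reported:.3f}, but "
                f"t({t.df})={t.t_stat:.2f} implies p={p_recomputed:.3f}"
            )
    return flags


if __name__ == "__main__":
    # t(28)=2.10 implies p≈0.045, so a reported p of 0.01 gets flagged.
    print(check_p_values([ReportedTest("Table 2, row 3", 2.10, 28, 0.01)]))
```

The point is just that a meaningful slice of auditing is cheap, deterministic checking; the harder judgment calls would sit on top of checks like this.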
Agreed with this. I’m very optimistic about AI solving a lot of incentive problems in science. I don’t know whether the end case you mention (full audits) will happen, but I am very confident we will move in a better direction than where we are now.
I’m working on some software now that will help a bit in this direction!
As a working scientist, I strongly doubt that any of this will happen.
First, existing AIs are nowhere near being able to do any of these things with an accuracy that makes them particularly useful. AIs are equipped to do things similar to their training set, but all science is on the frontier: it is a much harder task to figure out the correct experimental setup for something that has never been done before in the history of humanity.
Right now I’m finishing up an article about how my field actually uses AI, and it’s nothing like what you propose here: LLMs are used for BS grant applications and low-level coding, almost exclusively. I don’t find them very useful for anything else.
The bigger issue here is with the “auditors” themselves: who’s in charge of them? If a working scientist disagrees with what the “auditor” says, what happens? What happens if someone like Elon is in charge and decides to use the auditors for a political crusade against “woke science”, as is literally happening right now?
Catching errors in science is not something that can be boiled down to a formula: a massive part of the process is socio-cultural. If you push out AI auditors, people are just going to game them, like they have with p-values. This is not a problem with a technological solution.
> The bigger issue here is with the “auditors” themselves: who’s in charge of them? If a working scientist disagrees with what the “auditor” says, what happens? What happens if someone like Elon is in charge and decides to use the auditors for a political crusade against “woke science”, as is literally happening right now?

I think this is a very sensible question.
My obvious answer is that the auditors should be held to higher standards than the things they are auditing. This means they should be particularly open, and should themselves be open to auditing. For example, the auditing code could be open-source, highly tested, and evaluated by both humans and AI systems.
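To gesture at what “open-source, highly tested” could mean in practice, here’s a toy sketch in Python: one audit rule whose logic anyone can read, shipped with tests that pin down exactly what it does and doesn’t flag. The Analysis fields and the specific rule are assumptions invented for this sketch, not a claim about what real auditors should check.

```python
# Toy illustration of an open, testable audit rule (not a real tool).
# The Analysis fields and the rule itself are assumptions made for this sketch.
from dataclasses import dataclass


@dataclass
class Analysis:
    preregistered_tests: int   # hypothesis tests listed in the preregistration
    reported_tests: int        # hypothesis tests reported in the final paper
    correction_applied: bool   # whether a multiple-comparison correction is reported


def flag_undisclosed_multiplicity(a: Analysis) -> bool:
    """Flag analyses that report more tests than were preregistered
    without reporting any multiple-comparison correction."""
    return a.reported_tests > a.preregistered_tests and not a.correction_applied


# Tests ship alongside the rule, so humans and AI systems can audit the auditor.
def test_flags_extra_uncorrected_tests():
    assert flag_undisclosed_multiplicity(Analysis(3, 10, correction_applied=False))


def test_ignores_corrected_or_preregistered_analyses():
    assert not flag_undisclosed_multiplicity(Analysis(3, 10, correction_applied=True))
    assert not flag_undisclosed_multiplicity(Analysis(10, 10, correction_applied=False))
```

Because both the rule and its tests are public, a scientist who disagrees with a flag can point to the exact line of logic they think is wrong, and anyone can propose a change that then gets reviewed by humans and other tools.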
I agree that there are ways one could do a poor job with auditing. I think this is generally true for most powerful tools we can bring in: we need to be sure to use them well, or they could do harm.
On your other points: it sounds like you have dramatically lower expectations for AI than I or much of the AI safety community do. I agree that if you don’t expect AI to become very capable, then AI-assisted auditing probably won’t go that far.
From my post:
> this could all be good experimentation on our way to systems that will oversee key AI progress. I ultimately want AI auditors for all risky AI development, but some of that will be a harder sell.
If it’s the case that AI auditors won’t work, then I assume we wouldn’t particularly need to oversee key AI progress anyway, as there’s not much to oversee.
> My obvious answer is that the auditors should be held to higher standards than the things they are auditing. This means they should be particularly open, and should themselves be open to auditing. For example, the auditing code could be open-source, highly tested, and evaluated by both humans and AI systems.

Yeah, I just don’t buy that we could ever establish such code in a way that would make it viable. Science chases novel projects and experiments; what is “meant” to happen will be different for each minuscule subfield of each field. If you release open-source code that has been proven to work for subfields A, B, C, D, E, and F, someone in subfield G will immediately object that it’s not transferable, and they may very well be right. And the only people who can tell whether it works on subfield G are people in subfield G.
You cannot avoid the social and political aspects of this: imagine the AI-auditor code starts declaring that a controversial and widely used technique in, say, evolutionary psychology, is bad science. Does the evo-psych community accept this and abandon the technique, or do they say that the auditor code is flawed due to the biases of its creators, and fork/reject the code? Essentially you are allowing whoever controls the auditor code to suppress fields they don’t agree with. It’s a centralization of science that is at odds with what allows science to actually work.
This comment strikes me as so different from my view that I imagine you might be envisioning a very specific implementation of AI auditors that I’m not advocating for.
I tried having a discussion with an LLM about this to get some more insight; you can see it here if you like (though I suspect you won’t find this useful, as you don’t seem to trust LLMs much at all). It wound up suggesting implementations that could still provide benefits while minimizing potential costs.
https://claude.ai/share/4943d5aa-ed91-4b3a-af39-bc4cde9b65ef
I don’t mind you using LLMs to elucidate the discussion, although I don’t think asking them to rate arguments is very valuable.
The additional detail of having subfield-specific, opt-in auditors does lessen my objections significantly. Of course, the issue of what counts as a subfield is kinda thorny. It would make most sense, as Claude suggests, for journals to have an “auditor verified” badge, but then maybe you’re giving too much power over content to the journals, which usually stick to accept/reject decisions (and even that can get quite political).
Coming back to your original statement, ultimately I just don’t buy that any of this can lead to “incredibly low rates of fraud/bias”. If someone wants to commit fraud or introduce bias, they will just game the tools, or submit to journals with weak or nonexistent auditors. Perhaps the black-box nature of AI might even make it easier to hide this kind of thing.
Next: there are large areas of science where a tool telling you the best techniques to use will never be particularly useful. On the one hand there is research like mine, which is so far out on the frontier that the “best practices” to put into such an auditor don’t exist yet. On the other, there is statistics so well established that software tools implementing the best practices already exist: you just have to load up a well-documented R package. What does an AI auditor add to this?
If I were tasked with reducing bias and fraud, I would mainly push for data transparency requirements in journal publications and for beefing up the incentives for careful peer review, which is currently unpaid and unrewarding labour. Perhaps AI tools could be useful in parts of that process, but I don’t see them as anywhere near as important as those other two things.
This context is useful, thanks.
Looking back, I think this part of my first comment was poorly worded:
> I imagine that scientists will soon have the ability to be unusually transparent and provide incredibly low rates of fraud/bias, using AI.
I meant
> I imagine that scientists will [soon have the ability to] [be unusually transparent and provide incredibly low rates of fraud/bias], using AI.
So it’s not that this will lead to low rates of fraud/bias, but that AI could help enable it for scientists willing to go along with it. Whether scientists are willing to go along with it is a separate question.
But I think even that probably is not fair. A better description of my beliefs is something like:
I think that LLM auditing tools could be useful for some kinds of scientific research, in communities open to them.
I think in the short term, sufficiently motivated groups could develop these tools and use them to help decrease the rate of statistical and algorithmic mistakes. Correspondingly, I’d expect this to help with fraud.
In the long run, whenever AI approaches human-level intelligence (which I think will likely happen in the next 20 years, though I realize others disagree), I expect that more and more of the scientific process will be automated. I think there are ways this could go very well using things like AI auditing, with results that are much more reliable than those currently produced by humans. There are of course also worlds in which humans do dumb things with the AIs and the opposite happens.
I think that, at the least, AI safety researchers should consider using these kinds of methods, and that the AI safety landscape should investigate efforts to make decent auditing tools.
My core hope with the original message is to draw attention to AI science auditing tools as things that might be interesting/useful, not to claim that they’re definitely a major game changer.