Thanks for your thorough comment, Owen.
And do the amounts ($1M and $0.5M) seem reasonable to you?
As a point of reference, Epoch AI is hiring a “Project Lead, Mathematics Reasoning Benchmark”. This person will receive ~$100k for a 6-month contract.
There are different reference classes we might use for “reasonable” here. I believe that paying just the salaries of the researchers doing the key work would usually cost a good amount less (but maybe not if you’re having to compete with AI lab salaries?). But I think that option isn’t really available on the open market (i.e. to funders who aren’t putting in the management time), unless someone good happens to want to research this anyway. In the reference class of academic grants, these amounts look relatively normal.
It’s a bit hard to second-guess the funders’ decisions from the outside, since I don’t know what information they had available. The decisions would look better the more they had a good prototype or some other reason to feel confident that the grantees would produce a strong benchmark. It might have been optimal to investigate getting less thorough work done for less money, but that’s not obvious to me.
I guess this is all a roundabout way of saying “naively it seems on the high side to me, but I can totally imagine learning information such that it would seem very reasonable”.