My only sadness here is that I get the impression you think this work is kind of a dead-end?
I don’t think it is a dead end.
As I say in the post:
ARC-AGI probably isn’t a good benchmark for evaluating progress towards TAI: substantial “elicitation” effort could massively improve performance on ARC-AGI in a way that might not transfer to more important and realistic tasks.
But I still think that work like ARC-AGI can be good on the margin for getting a better understanding of current AI capabilities.