James Fodor comments on Benchmark Performance is a Poor Measure of Generalisable AI Reasoning Capabilities