I guess it depends on your priors or something. It's agentic relative to a rock, but, relative to an AI which can pass the LSAT, it's well below my expectations. It seems like ARC-Evals had to coax and prod GPT-4 to get it to do things it "should" have been doing with rudimentary levels of agency.