Thanks, and very good question+comment!
1. I’m not sure how closely this resembles, at the technical level, what the companies actually do. We did base this on the standard Inspect framework so that it would be widely usable, and looked at other Inspect evals and benchmarks/datasets (e.g. HH-RLHF) for inspiration. When we discussed it at a high level with some people from the companies, it seemed to resemble something they could use, but again, I’m not sure about the more technical details.
2. Thanks for the recommendation, makes sense. We did think about comms somewhat: e.g. to convey the intuition that “higher is better” to someone skimming, in the paper (https://arxiv.org/pdf/2503.04804) we first present results with different species (Figure 2). We could probably use colours and other design elements to improve the presentation.