My system prompt is very short: about 3 lines to counteract sycophancy bias and hedging bias.
Claude also knows I’m in Berkeley, which is another potential source of bias.
That said, I never bothered to figure out how to access it via the API, but a friend who did got approximately the same results in the past as my incognito tests, on other questions of a similar flavor. The results with the Chinese models (which were on LM Arena, without context) also seem more consistent with the models having more EA-favored opinions on charities in general, at least when prompted approximately neutrally in English.