Hello,
As you’ll notice in the introduction and section 5 of the paper, we do not claim that language agents offer any guarantees of safety. As our title suggests, our claim is rather that they reduce the risk of existential catastrophe.
Language agents are based on large language models. Do you think the problems you identify related to the “highly specific nature” of natural language processing systems apply to large language models? For example, do you think GPT-4 would be unable to understand different phrasings of a command to open a user’s email? If so, do you have any evidence that this is so? My own experience with GPT-4 strongly suggests otherwise.
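The paraphrase question is easy to check empirically. Here is a minimal sketch (not from the paper) of how one might probe it, assuming the OpenAI Python client and an `OPENAI_API_KEY` in the environment; the model name, action labels, and phrasings are illustrative:

```python
# Probe whether GPT-4 maps different phrasings of the same command to
# the same action. Assumes `pip install openai` (v1.x client) and an
# OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Several phrasings of the same underlying command.
paraphrases = [
    "Open my email.",
    "Could you pull up my inbox?",
    "I'd like to check my messages, please.",
    "Bring up my mail client.",
]

for phrasing in paraphrases:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {
                "role": "system",
                "content": (
                    "You map user requests to one of these actions: "
                    "OPEN_EMAIL, OPEN_CALENDAR, NONE. "
                    "Reply with the action name only."
                ),
            },
            {"role": "user", "content": phrasing},
        ],
    )
    print(f"{phrasing!r} -> {response.choices[0].message.content}")
```

If the model returns `OPEN_EMAIL` for all four phrasings, that is at least suggestive evidence against the brittleness worry for this class of commands.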