Yarrow Bouchard 🔸 comments on How Well Does RL Scale?

Yarrow Bouchard 🔸 15 Nov 2025 3:06 UTC
Hello, Matt. Let me just say I really appreciate your friendly, supportive, and positive approach to this conversation. It’s very nice. Discussions on the EA Forum can get pretty sour sometimes, and I’m probably not entirely blameless in that myself.

You don’t have to reply if you don’t want to, but I just wanted to follow up in case you still did.

Can you explain what you mean about the data efficiency of the new RL techniques in the papers you mentioned? You say it’s more complex, but that doesn’t help me understand.

By the way, did you use an LLM like Claude or ChatGPT to help write your comment? To me, it has some of the hallmarks of LLM writing. I’m just saying this to help you: you may not realize how much LLMs’ writing style sticks out like a sore thumb (depending on how you use them), and it will likely discourage people from engaging with you if they detect it. I keep encouraging people to trust themselves as writers and trust their own voice, and reassuring them that the imperfections of their writing don’t make us, the readers, like it less; they make us like it more.