I can speak only from the data angle, but I would add that focusing on the actual individuals performing RLHF and providing datasets (rather than calling this a pure “research problem”) is vital to getting this right.