1. Making LLMs public is dangerous: by publishing the weights, you allow others to easily remove the safeguards.
2. Once the safeguards are removed, current LLMs are already helpful in getting at the key information necessary to cause a pandemic.
I like this way of splitting it up. I think the paper made a good case for point 1, but I think point 2 is greatly overstated. With current tech you would still need an expert to sift through hallucinations and to guide the LLM, and the same expert could do the same thing without the LLM. On this issue current LLMs are timesavers, not gamechangers.
For this reason I doubt you can convince people to withhold their weights now, but possibly you can convince them to do so later, when the tech is improved enough to be dangerous.
possibly you can convince them to do so later, when the tech is improved enough to be dangerous
Sort of: because once you publish the weights for a model there's no going back, I'm hoping even the next round of models will not be published, or at least not published without a thorough set of evals. The problem is asymmetric: if you miss that a private model is able to meaningfully lower the bar to causing harm (ex: telling people how to make pandemics), you can still restrict access or modify it once you learn this; if you learn that a public model can do that, you're out of luck.