Nested inside the above issue, another problem is that the author seems to use proof-like rhetoric in her arguments when, because the proof isn't actually there, she needs to provide broader illustrations that could generalize and build intuition.
Some statements also don't resemble how mathematical argumentation is typically used in disciplines like machine learning or economics.
To explain: the author begins with an excellent point that it is bizarre, and basically statistically impossible, for a feedforward network to learn to do certain things from limited training, even though actually executing those computations would be simple for the model.
One example is that it cannot learn the mechanics of addition for numbers larger than those it has seen computed in training.
Basically, even the largest, best-trained feedforward DNN trained with backprop will never add 99+1 correctly if it was only trained on adding smaller numbers like 12+17, where no training example ever totals 100 or more. Under backprop, the network literally needs to see such outputs in order to build the machinery for the hundreds digit. This is despite the fact that it would be simple for a vast DNN to mechanically have the capability to perform true logical addition.
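Here is a minimal sketch of the kind of failure I mean (my own toy setup, not anything from the post), using scikit-learn: an MLP classifier trained only on sums below 100 literally has no output class for 100, so no amount of training on the smaller sums lets it answer 99+1.

```python
# Toy illustration (my own setup, not from the post): a feedforward classifier
# trained only on sums below 100 cannot emit the unseen answer 100.
import numpy as np
from sklearn.neural_network import MLPClassifier

# Training data: all pairs (a, b) with a, b in 0..99 whose sum stays below 100.
pairs = np.array([(a, b) for a in range(100) for b in range(100) if a + b < 100],
                 dtype=float)
labels = pairs.sum(axis=1).astype(int)          # labels are the sums 0..99

clf = MLPClassifier(hidden_layer_sizes=(128, 128), max_iter=1000, random_state=0)
clf.fit(pairs / 100.0, labels)                  # scale inputs for training stability

print(clf.predict([[0.12, 0.17]]))              # in-distribution: usually 29
print(clf.predict([[0.99, 0.01]]))              # 99 + 1: some label below 100
print(100 in clf.classes_)                      # False -- 100 was never a possible output
```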
Building directly on the above point, I think the author wants to suggest that, just as it is impossible to get this particular functionality, there are broader constraints on what feedforward networks can do (and that these ideas should apply to deep learning, or 2020 technology, for the purposes of biological anchors).
However, everything sort of changes here. The author says:
I’s not clear what is being claimed or what is being built on above.
What computations are foreclosed, or what can't be achieved, in feedforward nets?
While the author shows that addition with n+1 digits can't be achieved by training on addition with n-digit numbers, and certainly many other training-to-outcome paths are ruled out in the same way, why would this rule out capability in general, and why would it stop other (perhaps very sophisticated) training strategies or simulations from producing models that could be dangerous?
The author says the “upshot is that the class of solutions searched over by feedforward networks in practice seems to be (approximately) the space of linear models with all possible features” and “this is a big step up from earlier ML algorithms where one has to hand-engineer the features”.
But that seems to allow general transformations of the features. If so, that is incredibly powerful, and it doesn't seem to constrain the functionality of these feedforward networks, does it?
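For what it's worth, here is a rough toy version of the "linear model over features" picture as I understand it (my own illustration, not the author's or the linked papers' code): fix a wide random nonlinear feature map and train only a linear readout on top of it, which is roughly what the lazy-training/NTK story says wide feedforward nets end up doing.

```python
# Rough sketch (my own illustration): a linear readout over a fixed, wide,
# random nonlinear feature map, standing in for "linear models with all
# possible features".
import numpy as np

rng = np.random.default_rng(0)

def random_features(X, W, b):
    """Fixed (never trained) random ReLU features that the linear readout combines."""
    return np.maximum(X @ W + b, 0.0)

# Toy 1-D regression target.
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.normal(size=200)

# Wide random feature map: only the readout weights below are "learned".
n_features = 2000
W = rng.normal(size=(1, n_features))
b = rng.normal(size=n_features)
Phi = random_features(X, W, b)

# Ridge regression for the readout (the only trained parameters).
lam = 1e-3
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(n_features), Phi.T @ y)

X_test = np.linspace(-3, 3, 5).reshape(-1, 1)
print(random_features(X_test, W, b) @ w)   # roughly approximates sin(2x) on the training range
```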
And why would logic that relies on a technical proof (which I am guessing uses a topological-style argument requiring the smooth structure of feedforward neural nets) apply even to RNNs, LSTMs, or transformers?
Regarding the questions about feedforward networks, a really short answer is that regression is a very limited form of inference-time computation that e.g. rules out using memory. (Of course, as you point out, this doesn’t apply to other 2020 algorithms beyond MLPs.) Sorry about the lack of clarity—I didn’t want to take up too much space in this piece going into the details of the linked papers, but hopefully I’ll be able to do a better job explaining it in a review of those papers that I’ll post on LW/AF next week.
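To gesture at the "memory" point with a toy example (my own illustration, not anything from the linked papers): a recurrent computation can carry a single carry bit across digit positions, so one fixed rule adds numbers of any length, whereas a feedforward regression maps a fixed-size input to an output in one bounded pass with no running state.

```python
# Toy illustration (my own framing): one carry bit of "memory" carried across
# steps lets the same fixed rule add numbers of arbitrary length, which a
# single bounded feedforward pass over a fixed-size input cannot do.
def add_digitwise(a_digits, b_digits):
    """Add two numbers given as digit lists (least significant digit first)."""
    carry, out = 0, []
    for a, b in zip(a_digits, b_digits):
        total = a + b + carry          # same local rule at every position
        out.append(total % 10)
        carry = total // 10            # the recurrent state
    if carry:
        out.append(carry)
    return out

print(add_digitwise([9, 9], [1, 0]))            # 99 + 1 -> [0, 0, 1], i.e. 100
print(add_digitwise([9] * 50, [1] + [0] * 49))  # 50-digit case, same rule, no retraining
```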
(I also want to reply to your top-level comments about the evolutionary anchor, but am a bit short on time to do it right now (since for those questions I don’t have cached technical answers and will have to remind myself about the context). But I’ll definitely get to it next week.)
Thanks for the responses; they give a lot more useful context.
(I also want to reply to your top-level comments about the evolutionary anchor, but am a bit short on time to do it right now (since for those questions I don’t have cached technical answers and will have to remind myself about the context). But I’ll definitely get to it next week.)
If it frees up your time, I don't think you need to write the above unless you specifically want to. It seems reasonable to treat that point about "evolutionary anchors" as a larger disagreement about the premise, one that is not fully in scope of the post. That disagreement, and the way I phrased it, is also more disagreeable/overbearing to answer, so it's less deserving of a response.
Thanks for writing your ideas.