I’m entirely unconvinced that this is a relevant concern—if training data is the equivalent of complex environments, we kind-of get it for free, and even where we don’t, we can simulate natural environments and other agents much more cheaply than nature.
if training data is the equivalent of complex environments, we kind-of get it for free
Don’t disagree
we can simulate natural environments and other agents much more cheaply than nature
Also don’t disagree, but this is a matter of degree, no? For example, I’m thinking that having an environment with many agents acting on each other and on the environment would make the training process less parallelizable.
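To make the degree-of-parallelism point concrete, here is a toy sketch (all names and dynamics are made up for illustration): N independent single-agent environments can be stepped on separate workers with no coordination, whereas a shared environment with interacting agents forces a single synchronised step over the joint action.

```python
from concurrent.futures import ThreadPoolExecutor

def step_independent(state, action):
    # each copy's transition depends only on its own state and action
    return state + action

def step_shared(states, actions):
    # each agent's next state depends on everyone's actions, so the whole
    # joint step has to be computed together
    total = sum(actions)
    return [s + (total - a) for s, a in zip(states, actions)]

states = [0] * 8
actions = [1, 2, 1, 3, 1, 2, 1, 3]

# independent environments: embarrassingly parallel across workers
with ThreadPoolExecutor() as pool:
    stepped = list(pool.map(step_independent, states, actions))

# shared multi-agent environment: one synchronised joint step
stepped_shared = step_shared(states, actions)
print(stepped, stepped_shared)
```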
Personally I found it pretty hard to give a number to “the least complex environment which could give rise to intelligent life”; if you have thoughts on how to bound this I’d be keen to hear them.
That makes sense, and I think we’re mostly agreeing—it just seemed like you were skipping this entirely in your explanation.
It might be very costly, perhaps impractically costly, to collect training data that can make up for the responsiveness of a simulated environment to the choices an agent makes. An agent can actively test and explore their environment in a way that collected training data can’t support flexibly without possibly impractical amounts of it. You’d need to anticipate how the environment would respond to whatever the agent might try, and you’d basically be filling in the entries of a giant lookup table of anticipated responses.
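As a rough illustration of that lookup-table worry (toy numbers of my own choosing, not anything from the report): a simulator computes responses on demand for whatever the agent actually tries, while a pre-collected dataset would have to anticipate every action sequence the agent might explore.

```python
def simulator_step(state, action):
    # toy dynamics, computed on demand for whatever the agent actually tries
    return (state * 31 + action) % 1_000

# the simulator answers arbitrary probes as the agent makes them
state = 0
for action in (3, 1, 4):
    state = simulator_step(state, action)

branching = 10       # assumed number of actions available per step
horizon = 20         # assumed episode length

# a pre-collected "lookup table" would need a response entry for every
# action sequence the agent could choose to explore...
entries_needed = branching ** horizon
print(f"~{entries_needed:.1e} anticipated responses")   # ~1.0e+20

# ...while a simulator only pays for the trajectories actually taken
episodes = 1_000_000
print(f"{episodes * horizon:.1e} simulated steps")       # 2.0e+07
```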
AlphaGo was originally pretrained to mimic experts, but the extra performance came from simulated self-play, and the next version (AlphaGo Zero) skipped the expert-mimicking pretraining entirely.
It’s plausible, though, that the environments don’t need to be very complex or detailed, to the point that most of the operations are still in the AI rather than in the environment.
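To put a (made-up) number on that: even a fairly small policy network dwarfs the cost of a simple simulated environment step, so the environment can be quite crude and the compute budget is still dominated by the model. Both figures below are assumptions chosen only for illustration.

```python
env_ops_per_step = 10                      # assumed cost of a simple gridworld-style step

# assumed small policy network: input -> two hidden layers -> action logits
layer_sizes = [1_000, 4_096, 4_096, 10]
policy_ops_per_step = sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(f"environment: {env_ops_per_step} ops/step")
print(f"policy network: {policy_ops_per_step:,} multiply-adds/step")
print(f"ratio: ~{policy_ops_per_step // env_ops_per_step:,}x")
```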
You don’t necessarily need to collect training data; that’s why RL works. And simulating an environment is potentially cheap, as you noted. So again, I’m unconvinced that this is actually a problem with bio anchors, at least above and beyond what Cotra says in the report itself.
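For concreteness, here is a minimal self-play sketch in the spirit of the AlphaGo point above (my own toy example, not AlphaGo’s actual algorithm; the game, the tabular value function, and the update rule are all assumptions chosen for brevity). The same value table plays both sides of a one-pile Nim variant and improves purely from the outcomes of its own games, with no collected expert data at all.

```python
import random
from collections import defaultdict

ACTIONS = (1, 2, 3)    # a player removes 1-3 stones per turn
START = 12             # assumed starting pile size

def play_game(value, eps=0.1):
    """Both sides share the same value table; returns the move history and the winner."""
    pile, player, history = START, 0, []
    while pile > 0:
        moves = [a for a in ACTIONS if a <= pile]
        if random.random() < eps:
            action = random.choice(moves)
        else:
            # pick the move that leaves the opponent in the worst-looking position
            action = min(moves, key=lambda a: value[pile - a])
        history.append((pile, player))
        pile -= action
        player ^= 1
    return history, player ^ 1   # whoever took the last stone wins

def train(games=50_000, lr=0.05):
    value = defaultdict(float)   # value[pile] = estimated value for the player to move
    value[0] = -1.0              # an empty pile means the player to move has already lost
    for _ in range(games):
        history, winner = play_game(value)
        for pile, player in history:
            target = 1.0 if player == winner else -1.0
            value[pile] += lr * (target - value[pile])
    return value

if __name__ == "__main__":
    v = train()
    # under optimal play, piles that are multiples of 4 are losses for the player
    # to move, so their learned values should come out negative
    print({p: round(v[p], 2) for p in range(1, START + 1)})
```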