->3. I also think theories in IIT’s reference class won’t be correct, but I suspect I define the reference class much differently. :) Based on my categorization, I would object to lumping my theory into IIT’s reference class (we could talk more about this if you’d like).
I’m curious about this, since you mentioned fixing IIT’s flaws. I came to the comments to make the same complaint you were responding to Jessica about.
I had the same response. The document claims that pleasure or positive valence corresponds to symmetry.
What people generally refer to when they speak of ‘happiness’ or ‘suffering’ - the morally significant hedonic status of a system - is the product of valence × intensity × consciousness, or the location within this combined state-space.
This does not look like a metric that is tightly connected to sensory, cognitive, or behavioral features. In particular, it is not specifically connected to liking, wanting, aversion, and so forth. So, like IIT in the cases discussed by Scott Aaronson, it would seem likely to assign huge values (of valence rather than consciousness, in this case) to systems that lack the corresponding functions, and very low values to systems that possess them.
The document is explicit about qualia not being strictly linked to the computational and behavioral functions that lead us to, e.g., talk about qualia or withdraw from painful stimuli:
In short, our brain has evolved to be able to fairly accurately report its internal computational states (since it was adaptive to be able to coordinate such states with others), and these computational states are highly correlated with the microphysical states of the substrate the brain’s computations run on (the actual source of qualia). However, these computational states and microphysical states are not identical. Thus, we would need to be open to the possibility that certain interventions could cause a change in a system’s physical substrate (which generates its qualia) without causing a change in its computational level (which generates its qualia reports). We’ve evolved toward having our qualia, and our reports about our qualia, being synchronized - but in contexts where there hasn’t been an adaptive pressure to accurately report our qualia, we shouldn’t expect these to be synchronized ‘for free’.
The falsifiable predictions are mostly claims that the computational functions will be (imperfectly) correlated with symmetry, but the treatment of boredom appears to allow that these will be quite imperfect:
Why do we find pure order & symmetry boring, and not particularly beautiful? I posit boredom is a very sophisticated “anti-wireheading” technology which prevents the symmetry/pleasure attractor basin from being too ‘sticky’, and may be activated by an especially low rate of Reward Prediction Errors (RPEs). Musical features which add mathematical variations or imperfections to the structure of music—e.g., syncopated rhythms (Witek et al. 2014), vocal burrs, etc.—seem to make music more addictive and allow us to find long-term pleasure in listening to it, by hacking the mechanic(s) by which the brain implements boredom.
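To make the quoted RPE mechanism concrete, here is a minimal toy sketch (a hypothetical construction of my own, not anything specified in PQ) of a boredom signal that accumulates when reward prediction errors stay small and then attenuates the effective value of an otherwise-predictable reward:

```python
# Toy illustration (hypothetical, not from PQ): boredom modeled as a signal that
# grows when reward prediction errors (RPEs) stay small -- i.e., when rewards have
# become too predictable -- and then dampens the effective value of the reward.

class BoredomModulator:
    def __init__(self, rpe_threshold=0.1, growth=0.05, decay=0.2):
        self.rpe_threshold = rpe_threshold  # |RPE| below this counts as "unsurprising"
        self.growth = growth                # how quickly boredom accumulates
        self.decay = decay                  # how quickly surprise dissipates boredom
        self.boredom = 0.0
        self.expected_reward = 0.0

    def step(self, reward, learning_rate=0.1):
        rpe = reward - self.expected_reward
        self.expected_reward += learning_rate * rpe
        if abs(rpe) < self.rpe_threshold:
            self.boredom = min(1.0, self.boredom + self.growth)
        else:
            self.boredom = max(0.0, self.boredom - self.decay)
        # Boredom attenuates the effective (hedonic) value of a predictable reward.
        return reward * (1.0 - self.boredom)

# A perfectly regular reward stream ("pure order") gradually loses effective value;
# small "imperfections" every few steps generate RPEs that push boredom back down.
mod = BoredomModulator()
for t in range(50):
    raw = 1.3 if t % 8 == 0 else 1.0
    effective = mod.step(raw)
    if t % 10 == 0:
        print(f"t={t:2d}  raw={raw:.1f}  effective={effective:.2f}  boredom={mod.boredom:.2f}")
```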
Overall, this seems systematically analogous to IIT in its flaws. If one wanted to pursue an analogy to Aaronson’s discussion of trivial expander graphs producing extreme super-consciousness, one could create an RL agent (perhaps in an artificial environment where it has the power to smile, seek out rewards, avoid injuries (which trigger negative reward), favor injured limbs, and consume painkillers (which stop injuries from generating negative reward)) whose symmetry could be measured in whatever way the author would like to specify.
I think we can say now that we could program the agent in such a way that it sought out things that resulted in either more or less symmetric states, or was neutral to such things. Likewise, switching the signs of rewards would not reliably switch the associated symmetry. And its symmetry could be directly and greatly altered without systematic matching behavioral changes.
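For concreteness, here is a minimal sketch of the kind of construction being described (entirely hypothetical: the environment, the acceptance rule, and the stand-in ‘symmetry’ measure over the agent’s internal state are my own choices for illustration, since PQ does not specify a metric). The sign of the reward and the symmetry-seeking rule are independent parameters, so nothing in the construction couples them:

```python
import numpy as np

# Hypothetical toy agent whose internal state is a small matrix. "Symmetry" here is
# a crude stand-in measure (closeness of the matrix to its transpose), chosen only
# for illustration -- PQ does not commit to any particular metric.

def symmetry_score(state):
    # 1.0 for a perfectly symmetric matrix, decreasing as asymmetry grows.
    return 1.0 / (1.0 + np.linalg.norm(state - state.T))

def run_agent(reward_sign=+1, seek_symmetry=True, steps=200, seed=0):
    rng = np.random.default_rng(seed)
    state = rng.normal(size=(4, 4))
    total_reward = 0.0
    for _ in range(steps):
        # Propose a random perturbation and accept it if it moves the state toward
        # (or away from) symmetry, depending on how the agent was programmed.
        candidate = state + 0.1 * rng.normal(size=(4, 4))
        improves = symmetry_score(candidate) > symmetry_score(state)
        if improves == seek_symmetry:
            state = candidate
            # The environment designer attaches reward to the accepted move; its sign
            # can be flipped without touching the symmetry-seeking rule above.
            total_reward += reward_sign * 1.0
    return symmetry_score(state), total_reward

# Reward sign and symmetry-seeking vary independently across four agents.
for sign in (+1, -1):
    for seek in (True, False):
        final_symmetry, reward = run_agent(reward_sign=sign, seek_symmetry=seek)
        print(f"reward_sign={sign:+d}  seek_symmetry={seek!s:5}  "
              f"final_symmetry={final_symmetry:.3f}  total_reward={reward:+.0f}")
```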
I would like to know whether the theory in PQ is supposed to predict that such agents couldn’t be built without extraordinary efforts, or that they would have systematic mismatch of their functional beliefs and behavior regarding qualia with actual qualia.
Hi Carl, thanks for your thoughts & time. I appreciate the comments.
First, to be clear, the hypothesis is that the symmetry of the mathematical object isomorphic to a conscious experience corresponds to valence. This is distinct from (although related to) the symmetry of a stimulus, or even symmetry within brain networks.
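For concreteness, one toy way (a hypothetical operationalization of my own, not one the paper commits to) of quantifying the ‘symmetry of a mathematical object’ is the size of its automorphism group, computed here by brute force for two small graphs:

```python
from itertools import permutations

# Hypothetical illustration: one classical measure of the symmetry of a mathematical
# object (here, a small undirected graph) is the size of its automorphism group --
# the number of vertex relabelings that leave its structure unchanged.

def automorphism_count(n, edges):
    edge_set = {frozenset(e) for e in edges}
    count = 0
    for perm in permutations(range(n)):
        mapped = {frozenset((perm[a], perm[b])) for a, b in edge_set}
        if mapped == edge_set:
            count += 1
    return count

# A 4-cycle (highly symmetric) vs. a simple path on 4 vertices (less symmetric).
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]  # |Aut| = 8 (the dihedral group of the square)
path4 = [(0, 1), (1, 2), (2, 3)]           # |Aut| = 2 (identity and reversal)
print(automorphism_count(4, cycle4), automorphism_count(4, path4))
```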
This does not look like a metric that is tightly connected to sensory, cognitive, or behavioral features. In particular, it is not specifically connected to liking, wanting, aversion, and so forth. So, like IIT in the cases discussed by Scott Aaronson, it would seem likely to assign huge values (of valence rather than consciousness, in this case) to systems that lack the corresponding functions, and very low values to systems that possess them.
I strongly disagree with this in the case of humans, fairly strongly disagree in the more general case of evolved systems, and mildly disagree in the fully general case of arbitrary systems.
First, it seems extremely likely to me that evolved organisms would use symmetry as an organizational principle / attractor (Section XII);
Second, in cases where we do have some relevant data or plausible models (i.e., as noted in Sections IX and XII), the symmetry hypothesis seems plausible. I think the hypothesis does really well when one actually looks at the object-level, particularly, e.g., Safron’s model of orgasm & Seth and Friston’s model of interoception;
Third, with respect to extending Aaronson’s critique, I question whether “this seems to give weird results when put in novel contexts” is a good path to take. As Eric Schwitzgebel notes, “Common sense is incoherent in matters of metaphysics. There’s no way to develop an ambitious, broad-ranging, self-consistent metaphysical system without doing serious violence to common sense somewhere. It’s just impossible. Since common sense is an inconsistent system, you can’t respect it all. Every metaphysician will have to violate it somewhere.” This seems particularly true in the realm of consciousness, and particularly true in contexts where there was no evolutionary benefit in having correct intuitions.
As such it seems important not to enshrine common sense, with all its inconsistencies, as the gold standard with regard to valence research. In general, I’d say a good sign of a terrible model of consciousness would be that it validates all of our common-sense intuitions about the topic.
The falsifiable predictions are mostly claims that the computational functions will be (imperfectly) correlated with symmetry, but the treatment of boredom appears to allow that these will be quite imperfect:
Section XI is intended as the core set of falsifiable predictions—you may be thinking of the ‘implications for neuroscience’ discussion in Section XII, some of which could be extended to become falsifiable predictions.
Overall, this seems systematically analogous to IIT in its flaws. If one wanted to pursue an analogy to Aaronson’s discussion of trivial expander graphs producing extreme super-consciousness, one could create an RL agent (perhaps in an artificial environment where it has the power to smile, seek out rewards, avoid injuries (which trigger negative reward), favor injured limbs, and consume painkillers (which stop injuries from generating negative reward)) whose symmetry could be measured in whatever way the author would like to specify.
I think we can say now that we could program the agent in such a way that it sought out things that resulted in either more or less symmetric states, or was neutral to such things. Likewise, switching the signs of rewards would not reliably switch the associated symmetry. And its symmetry could be directly and greatly altered without systematic matching behavioral changes.
I would like to know whether the theory in PQ is supposed to predict that such agents couldn’t be built without extraordinary efforts, or that they would have systematic mismatch of their functional beliefs and behavior regarding qualia with actual qualia.
I’d assert, very strongly, that one could not evolve such a suffering-seeking agent without extraordinary effort, and that if one were to attempt to build one from scratch, it would be orders of magnitude more difficult to do so than making a “normal” agent. (This follows from my reasoning in Section XII.) But let’s keep in mind that whether the agent you’re speaking of is a computational program or a physical system matters a lot—under my model, an RL agent running on a standard von Neumann physical architecture probably has small & merely fragmentary qualia.
An analogy here would be the orthogonality thesis; perhaps we can call this “valence orthogonality”: the behavior of a system and its valence are usually tightly linked via evolutionary processes and optimization factors, but they are not directly causally coupled, just as intelligence & goals are not causally coupled.
This hypothesis does also have implications for the qualia of whole-brain emulations, which perhaps is closer to your thought-experiment.
As I understand their position, MIRI tends to not like IIT because it’s insufficiently functionalist—and too physicalist. On the other hand, I don’t think IIT could be correct because it’s too functionalist—and insufficiently physicalist, partially for the reasons I explain in my response to Jessica.
The core approach I’ve taken is to enumerate the sorts of problems one would need to solve if one were to formalize consciousness. (Whether consciousness is a thing-that-can-be-formalized is another question, of course.) My analysis is that IIT satisfactorily addresses 4 or 5 of the 8 problems. Moving to a more physical basis would address more of these problems, though not all (a big topic in PQ is how to interpret IIT-like output, which is a task independent of how to generate it).
Other research along these same lines would include, e.g.:
->Adam Barrett’s FIIH: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3912322/
->Max Tegmark’s Perceptronium: https://arxiv.org/abs/1401.1219