I follow Crocker’s rules.
niplav
I find this comment much more convincing than the top-level post.
I would very much prefer it if one didn’t appeal to the consequences of the belief about animal moral patienthood, and instead argued about whether animals are in fact moral patients, or whether the question is well-posed.
For this reason, I have strong-downvoted your comment.
I consider myself more culturally rationalist than EA, hence my (short) answer above. The real answer is 10k words and probably not worth the effort per insight/importance.
The rationality community has a far lower focus on morality, and has members who are amoral or completely selfish. I’ll go out on a limb and claim that it also has a broader set of interests, since there is less of a restriction on what attention can be focused on (EA wants to do good, the rationality community is interested in truth, and truth can be found about basically anything).
I don’t remember getting this from you, but maybe you mentioned it on 𝕏. I actually had to look up the difference, and hope I got it right.
Two Reasons For Restarting the Testing of Nuclear Weapons
We have three questions about the plan of using AI systems to align more capable AI systems.
- What form should your research output take if things go right? Specifically, what type of output would you want your automated alignment researcher to produce in the search for a solution to the alignment problem? Is the plan to generate formal proofs, sets of heuristics, algorithms with explanations in natural language, or something else?
- How would you verify that your automated alignment researcher is sufficiently aligned? What counts as evidence and what doesn’t? Related to the question above, how can one evaluate the output of this automated alignment researcher? This could range from a proof with formal guarantees to a natural language description of a technique together with a convincing explanation. As an example, the Underhanded C Contest is a setup in which malicious outputs can be produced and not detected, or, if they are detected, there is high plausible deniability of their being an honest mistake.
- Are you planning to find a way of aligning arbitrarily powerful superintelligences, or are you planning to align AI systems that are slightly more powerful than the automated alignment researcher? In the second case, what degree of alignment do you think is sufficient? Would you expect alignment that is not very close to 100% to become a problem when iterating this approach, similar to instability in numerical analysis?
Yes, writing good stuff is hard.
It takes a lot of time, and is inadequately rewarded. Some people who write long-form stuff are exceedingly smart, so it’s easier for them. That’s why easier-to-write & at-best-shallowly-researched stuff is the norm.
This sequence is spam and should be deleted.
This is a post that rings in my heart, thank you so much. I think people very often conflate these concepts, and I also think we’re in a very complex space here (especially if you look at Bayesianism from below and see that it grinds against reality/boundedness pretty hard and produces some awful screeching sounds when applied in the world).
I agree that these concepts you’ve presented are at least antidotes to common confusions about forecasts, and I have some more things to say.
I feel confused about credal resilience:
The example you state appears correct, but I don’t know what that would look like as a mathematical object. Some people have talked about probability distributions on probability distributions; in the case of a binary forecast that would be a function $f: [0,1] \to \mathbb{R}_{\ge 0}$ (a density over the first-order probability), which is…weird. Do I need to tack the resilience onto the distribution? Do I compute it out of the probability distribution on probability distributions? Perhaps the people talking about imprecise probabilities/infrabayesianism are onto something when they talk about convex sets of probability distributions as the correct objects, instead of probability distributions per se.
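To make that slightly more concrete, here is a minimal toy sketch (my own framing, nothing canonical): a Beta distribution as the second-order object for a binary forecast, with resilience read off as how little the point forecast moves after a single new observation.

```python
# Toy model: a binary forecast as a distribution *over* probabilities,
# here a Beta(a, b), with resilience proxied by how little the point
# forecast shifts after one more observation (Beta–Bernoulli update).

def point_forecast(a, b):
    """Mean of the second-order Beta(a, b) distribution."""
    return a / (a + b)

def resilience(a, b):
    """Crude proxy: 1 minus the forecast shift after one positive observation."""
    return 1 - abs(point_forecast(a + 1, b) - point_forecast(a, b))

# Two forecasters who both report 50%, with very different resilience:
print(point_forecast(1, 1), resilience(1, 1))          # 0.5, shifts to ~0.67 after one observation
print(point_forecast(100, 100), resilience(100, 100))  # 0.5, barely moves
```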
One can note that AIXR is definitely falsifiable, the hard part is falsifying it and staying alive.
There will be a state of the world confirming or denying the outcome; there’s just a correlation between our ability to observe those outcomes and the outcomes themselves.
Knightian uncertainty makes more sense in some restricted scenarios, especially ones related to self-confirming/self-denying predictions. If one can read the brain state of a human and construct their predictions of the environment out of that, then one can construct an environment in which the human has Knightian uncertainty, by always realizing the outcomes the human assigned the smallest probability to (sketched in code below). (Even a uniform belief gets fooled: we’ll pick one option and make that happen many times in a row, but as soon as our poor subject starts predicting that outcome, we shift to the ones less likely in their belief.)
It need not be such a fanciful scenario: It could be that my buddy James made a very strong prediction that he will finish cleaning his car by noon, so he is too confident and procrastinates until the bell tolls for him. (Or the other way around, where his high confidence makes him more likely to finish the cleaning early, in that case we’d call it Knightian certainty).
This is a very different case from the one people normally describe when talking about Knightian uncertainty, but (imho) a much more defensible one. I agree that the common reasons given for Knightian uncertainty are bad.
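Here is a throwaway sketch of the adversarial setup above (my own toy code, with a made-up predictor interface): the environment reads the agent’s probability assignment over outcomes and always realizes whichever outcome the agent currently considers least likely.

```python
import random

def adversarial_environment(predict, steps=10):
    """predict(history) returns a dict mapping each possible outcome to a probability."""
    history = []
    for _ in range(steps):
        beliefs = predict(history)
        # Realize whichever outcome the agent assigned the lowest probability;
        # ties are broken randomly, which is what fools even a uniform belief.
        lowest = min(beliefs.values())
        history.append(random.choice(
            [o for o, p in beliefs.items() if p == lowest]))
    return history

def frequency_predictor(history, outcomes=("A", "B")):
    """A predictor that projects observed frequencies (with Laplace smoothing)."""
    counts = {o: 1 + history.count(o) for o in outcomes}
    total = sum(counts.values())
    return {o: c / total for o, c in counts.items()}

# The predictor is wrong-footed by construction: whatever it learns
# from the history, the environment realizes the least expected outcome.
print(adversarial_environment(frequency_predictor))
```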
Another common complaint I’ve heard is about forecasts with very wide distributions, a case which evoked especially strong reactions was the Cotra bio-anchors report with (iirc) non-negligible probabilities on 12 orders of magnitude. Some people apparently consider such models worse than useless, harkening back to forecast legibility. Apparently both very wide and very narrow distributions are socially punished, even though having a bad model allows for updating & refinement.
Another point touched on very briefly in the post is forecast precision. We usually don’t report forecasts with six or seven digits of precision, because at that level our forecasts are basically noise. But I believe that some of the common objections are about (perceived) undue precision; someone who reports 7 digits of precision is scammy, so reporting about 2 digits of precision is… fishy. Perhaps. I know there’s a Tetlock paper on the value of precision in geopolitical forecasting, but it rounds probabilities rather than odds or log-odds. (Approaches based on noising probabilities and then tracking score development do not work; I’m not sure why, though.) It would be cool to know more about precision in forecasting and how it relates to other dimensions.
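As a small toy illustration of why the rounding method matters (my own example, not from the Tetlock paper): rounding in probability space collapses distinct extreme forecasts into 1.0, while rounding in log-odds space keeps them apart.

```python
import math

def round_probability(p, ndigits=2):
    """Round the forecast directly in probability space."""
    return round(p, ndigits)

def round_log_odds(p, ndigits=1):
    """Round the forecast in log-odds space, then map back to a probability."""
    log_odds = math.log(p / (1 - p))
    return 1 / (1 + math.exp(-round(log_odds, ndigits)))

for p in (0.6, 0.97, 0.997, 0.9997):
    print(p, round_probability(p), round(round_log_odds(p), 5))
# 0.997 and 0.9997 both become 1.0 under probability rounding,
# but stay distinct (~0.99699 and ~0.9997) under log-odds rounding.
```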
I also think that probabilities reported by humans are weird because we do not have the entire space of hypotheses in our minds at once, and can instead shift our probabilities during reflection (without receiving evidence). This applies to different people as well: if I believe that X has a very good reasoning process, based on observations of X’s past reasoning, I might not want to/have to follow X’s entire train of thought before raising my probability of their conclusion.
Sorry about the long comment without any links, I’m currently writing this offline and don’t have my text notes file with me. I can supply more links if that sounds interesting/relevant.
Note: The “Malaria killed ~50 billion people ever” factoid is likely incorrect.
This one looks cool :-)
Looks cool, thank you.
Thank you! This one looks exactly what I’m looking for.
🔭 Looking for good book on Octopus Behavior
Criteria: Scientific (which rules out The Soul of an Octopus), up to date (which mostly rules out Octopus: Physiology and Behaviour of an Advanced Invertebrate).
Why: I’ve heard claims that octopuses are quite intelligent, with claims going so far as to attribute the transmission of knowledge between individuals. I’d like to know more about how octopus behavior is similar to and different from human behavior (perhaps shedding light on the space of possible minds/fragility of value).
🔭 Looking for good book/review on Universal Basic Income
Criteria: Book should be ~completely a literature review and summary of current evidence on universal basic income/unconditional cash transfers. I’m not super interested in any moral arguments. The more it talks about actual studies the better. Can be quite demanding statistically.
Why: People have differing opinions on the feasibility/goodness of universal basic income, and there have been a whole bunch of experiments, but I haven’t been able to find a good review of that evidence.
🔭 Looking for a good textbook on Cryobiology
Criteria: The more of these properties the textbook has the better. Fundamentals of Cryobiology looks okay but has no exercises.
Why: I have signed up for cryonics, and would like to understand the debate between cryobiologists and cryonicists better.
The last person to have a case of smallpox, Ali Maow Maalin, dedicated years of his life to eradicating polio in the region.
On July 22nd, 2013, he died of malaria while traveling again after polio had been reintroduced.
Thank you! I’ll review the pull request later today, but it looks quite useful :-)
Not sure how much free time I can spend on the forecasting library, but I’ll add the sources to the post.
Thanks! Perhaps I’ll find the time to incorporate the Metaforecast data into it sometime.
On mental health:
Since AI systems will likely have a very different cognitive structure than biological humans, it seems quite unlikely that they will develop mental health issues the way humans do. There are some interesting things that happen to the characters that large language models “role-play” as: they switch from helpful to mischievous when the right situation arises.
I could see a future in which AI systems are emulating the behavior of specific humans, in which case they might exhibit behaviors that are similar to the ones of mentally ill humans.
On addiction problems:
If one takes the concept of addiction seriously, wireheading is a failure mode remarkably similar to it.
Disagreed, animal moral patienthood competes with all the other possible interventions effective altruists could be doing, and does so symmetrically (the opportunity cost cuts in both directions!).