Postdoc in statistics. Three kids, two cats, one wife. I write about statistics, EA, psychometrics, and other things at my blog
Jonas Moss
Updating on the passage of time and conditional prediction curves
Estimating value from pairwise comparisons
A peek at pairwise preference estimation in economics, marketing, and statistics
I’ve never used Squiggle, but I imagine its main benefits are ease of use and transparency. Consider the line
> transfer_efficiency = 0.75 to 0.9
in the Squiggle doc. In NumPy, you’d most likely have to select the number of samples, initiate an rng object (at least if you do as NumPy recommends), transform the (0.05, 0.95)-quantiles 0.75 and 0.9 into mean and sigma, call the log-normal random generator and store the results in an array, then call the appropriate plot function. Most of these steps are minor nuisances, except for the transformation of quantiles, which might be beyond the analyst’s skill level to do efficiently.
Here’s my replication in Python, which was kind of a chore to make… All of this can be done in one line in Squiggle.
import numpy as np
import scipy.stats as st
import matplotlib.pyplot as plt

rng = np.random.default_rng(313)
n = 10000

# Translate the (0.05, 0.95)-quantiles 0.75 and 0.9 into mean and sigma.
a = np.log(0.75)
b = np.log(0.9)
k1 = st.norm.ppf(0.05)
k2 = st.norm.ppf(0.95)
sigma = (b - a) / (k2 - k1)
mean = b - sigma * k2

transfer_efficiency = rng.lognormal(mean=mean, sigma=sigma, size=n)

x = np.linspace(0.7, 1, 100)
# Scipy’s parameterization of the log-normal is stupid; it wants the shape
# sigma and the scale exp(mean). Cost me another 5 minutes to figure out
# how to do this one.
plt.plot(x, st.lognorm.pdf(x, sigma, scale=np.exp(mean)))

# It’s prudent to check if I’ve done the calculations correctly too.
np.quantile(transfer_efficiency, [0.05, 0.95])  # close to [0.75, 0.9]
FYI: I wrote a post about the statistics used in pairwise comparison experiments of the sort used in this post.
A model about the effect of total existential risk on career choice
Great! Looking forward to reading this.
For those of us using ebook readers, there’s an .epub here https://www.smashwords.com/books/view/1134610 (Magnus, maybe add the link after the pdf?)
Thomas Hurka’s St Petersburg Paradox: Suppose you are offered a deal—you can press a button that has a 51% chance of creating a new world and doubling the total amount of utility, but a 49% chance of destroying the world and all utility in existence. If you want to maximise total expected utility, you ought to press the button—pressing the button has positive expected value. But the problem comes when you are asked whether you want to press the button again and again and again—at each point, the person trying to maximise expected utility ought to agree to press the button, but of course, eventually they will destroy everything.[2]
I have two gripes with this thought experiment. First, time is not modelled. Second, it’s left implicit why we should feel uneasy about the thought experiment. And that doesn’t work, due to highly variable philosophical intuitions. I honestly don’t feel uneasy about the thought experiment at all (only slightly annoyed). But maybe I would, had it been completely specified.
I can see two ways to add a time dimension to the problem. First, you could let all the presses be predetermined and happen in one go, which gets us into Satan’s apple territory. Second, you could have a 30-second pause between presses. But in that case, we would accumulate massive amounts of utility in a very short time; just the seconds in between presses would be invaluable! And who cares if the world ends in five minutes with probability close to 1 when every second it survives is so sweet? :p
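The tension is easy to make concrete with a quick calculation (my own sketch, using only the 51%/49% numbers from the quote): expected utility explodes with repeated presses, while the survival probability vanishes.

```python
# Expected utility grows with each press of the button, while the
# probability that the world survives shrinks towards zero.
# Numbers are from the thought experiment: 51% double, 49% destroy.
for n in [1, 10, 100]:
    expected_utility = (0.51 * 2) ** n  # relative to the starting utility
    survival_probability = 0.51 ** n
    print(n, expected_utility, survival_probability)
```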
Do I understand you correctly here?
Each agent has a computable partial preference ordering that decides if it prefers a to b.
We’d like this partial relation to be complete (i.e., defined for all pairs a, b) and transitive (i.e., a ≼ b and b ≼ c implies a ≼ c).
Now, if the relation is sufficiently non-trivial, it will be expensive to compute for some pairs a, b. So it’s better left undefined...?
If so, I can surely relate to that, as I often struggle to compute my preferences, even if they are theoretically complete. But it seems to me the relation is still defined; it might just not be practical to compute.
It’s also possible to think of it in this way: You start out with a partial preference ordering and need to calculate one of its transitive, complete extensions. But that is computationally difficult, and not unique either.
I’m unsure what these observations add to the discussion, though.
Thanks for your suggestions! Big fan of yours for many years, by the way. Mating Intelligence was the article collection that made me want to become an evolutionary psychologist (I ended up a statistician though, mostly due to its much safer career path).
Only now did I notice that I didn’t write in the post that these four points are just a summary; the meat of the post is in the link. I think I have explained these terms in the linked post, at least graded pairwise comparisons and discrete choice models. But yeah… I will modify the summary to use less technical jargon and provide an introduction.
I think it’s important to build more connections between EA approaches to value (e.g. in AI alignment) and existing behavioral sciences methods for studying values.
Yes, and also to academia in general. I honestly didn’t think about AI alignment when writing this post, but that could be one of the applications.
Sure, if your goal is to be a good writer! But, I’m not worried about that. I just want people to understand me.
I don’t understand the relevance of the Kelly criterion. The wikipedia page for the Kelly criterion states that “[t]he Kelly bet size is found by maximizing the expected value of the logarithm of wealth,” but that’s not relevant here, is it?
Edit: I don’t endorse the arguments of this post anymore!
Your example with the sky turning green is illuminating, as it shows there is nothing super special about the event “the observer exists” in anthropic problems (at least some of them). But I don’t think the rest of your analysis is likely to be correct, as you’re looking at the wrong likelihood from the start.
In the anthropic shadow problem, as in most selection problems, we are dealing with two likelihoods.
The first is the likelihood from the bird’s-eye view. This is the ideal likelihood, with no selection at all, and the starting point of any analysis. In our case, the bird’s-eye view likelihood at time t is

p(g, n | θ),

where g equals 1 if the sky has turned green within time t, n is the number of catastrophic events up to time t, and θ is some parameter (corresponding to the catastrophe probability in your post). From the bird’s-eye view, we observe every n regardless of the outcome of g, and your Bayesian analysis is correct. But we do not have the bird’s-eye view, as we only observe the n’s associated with g = 0!

The second likelihood is from the worm’s-eye view. To make your green-and-blue sky analogy truly analogous to the anthropic shadow, you will have to take into account that you will never be in a world with a green sky. In our case, we could suppose that worms cannot live in a world with a green sky, making g = 0 a certainty. That entails conditioning on the event g = 0 in the likelihood above, yielding the conditional likelihood p(n | g = 0, θ).

The likelihood from the bird’s-eye view and the likelihood from the worm’s-eye view are not the same; they do not even have the same signature. We find that the worm’s-eye view likelihood is

p(n | g = 0, θ) = (1 − q)^n p(n | θ) / Σ_m (1 − q)^m p(m | θ),

where q is the (independent) probability of the sky turning green whenever a catastrophic event occurs.

The posterior from the bird’s-eye view is

p(θ | g, n) ∝ p(g, n | θ) π(θ) ∝ p(n | θ) π(θ),

and is independent of q, as you said. However, the posterior from the worm’s-eye view is

p(θ | n, g = 0) ∝ (1 − q)^n p(n | θ) π(θ) / Σ_m (1 − q)^m p(m | θ).

As you can see, the integrating factor Σ_m (1 − q)^m p(m | θ) depends on θ and can’t be canceled out.

By the way, the likelihood proportional to (1 − q)^n p(n | θ) is not always hard to work with. If we assume that n is binomial with N trials and success probability θ, one can use the binomial theorem to show that the integrating constant is (1 − qθ)^N, yielding the normalized pmf C(N, n) (θ(1 − q))^n (1 − θ)^(N − n) / (1 − qθ)^N.
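As a sanity check on the binomial claim, here is a quick numerical verification (the values of N, θ, and q are arbitrary):

```python
import numpy as np
from scipy.stats import binom

# Check that sum_n C(N, n) theta^n (1 - theta)^(N - n) (1 - q)^n
# equals (1 - q * theta)^N, so the worm's-eye likelihood normalizes.
N, theta, q = 20, 0.3, 0.4  # arbitrary values
n = np.arange(N + 1)
unnormalized = binom.pmf(n, N, theta) * (1 - q) ** n
print(np.isclose(unnormalized.sum(), (1 - q * theta) ** N))  # True
```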
Satan cuts an apple into a countable infinity of slices and offers it to Eve, one piece at a time. Each slice has positive utility for Eve. If Eve eats only finitely many pieces, there is no difficulty; she simply enjoys her snack. If she eats infinitely many pieces, however, she is banished from Paradise. To keep things simple, we may assume that the pieces are numbered: in each time interval, the choice is Take piece n or Don’t take piece n. Furthermore, Eve can reject piece n, but take later pieces. Taking any countably infinite set leads to the bad outcome (banishment). Finally, regardless of whether or not she is banished, Eve gets to keep (and eat) her pieces of apple. Call this the original version of Satan’s apple.
We shall sometimes discuss a simplified version of Satan’s apple, different from the original version in two respects. First, Eve is banished only if she takes all the pieces. Second, once Eve refuses a piece, she cannot take any more pieces. These restrictions make Satan’s apple a close analogue to the two earlier puzzles.
Problem: When should Eve stop taking pieces?
Thank you for telling me about this! In economics, the discrete choice model is used to estimate a scale-free utility function in a similar way. It is used in health research for estimating QALYs, among other things; see e.g. this review paper.
But discrete choice / the Schulze method should probably not be used by themselves, as they cannot give us information about scale, only ordering. A possibility, which I find promising, is to combine the methods. Say that I have ten items I want you to rate. Then I can ask “Do you prefer a to b?” for some pairs and “How many times better is a than b?” for other pairs, hopefully in an optimal way. Then we would lessen the cognitive load of the study participants and make it easier to scale this kind of thing up.
(The cognitive load of using distributions is the main reason why I’m skeptical about having participants use them in place of point estimates when doing pairwise comparisons.)
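As an illustration of the ratio-question idea, here is a toy sketch (all items, pairs, and answers are made up) that recovers item values from noiseless “how many times better?” answers by least squares on the logs:

```python
import numpy as np

# Toy example: recover item values from ratio answers r_ij ~ v_i / v_j.
true_v = np.array([1.0, 2.0, 4.0])
pairs = [(0, 1), (1, 2), (0, 2)]
answers = [true_v[i] / true_v[j] for i, j in pairs]  # noiseless answers

# Solve log v_i - log v_j = log r_ij by least squares,
# fixing log v_0 = 0 since the scale is arbitrary.
A = np.zeros((len(pairs), len(true_v)))
for row, (i, j) in enumerate(pairs):
    A[row, i], A[row, j] = 1.0, -1.0
coef = np.linalg.lstsq(A[:, 1:], np.log(answers), rcond=None)[0]
v_hat = np.exp(np.concatenate([[0.0], coef]))
print(v_hat)  # recovers [1., 2., 4.]
```

With noisy answers, the same least-squares step gives an averaged estimate instead of an exact recovery.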
That’s sufficient information to calculate the conditional prediction curves I’m proposing. What you need is P(X ≤ t | X > s), the probability that X happens by time t given that it hasn’t happened by time s. If you have P(X ≤ t) and P(X ≤ s), which you can find by integrating the density for “when will X happen”, you can calculate P(X ≤ t | X > s) = (P(X ≤ t) − P(X ≤ s)) / (1 − P(X ≤ s)).
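A minimal sketch of such a conditional curve, assuming (purely for illustration) that “when will X happen” follows an exponential distribution with mean 10:

```python
import numpy as np
from scipy.stats import expon

dist = expon(scale=10)  # hypothetical density for "when will X happen"

def conditional_cdf(t, s):
    """P(X <= t | X > s) = (F(t) - F(s)) / (1 - F(s)) for t >= s."""
    return (dist.cdf(t) - dist.cdf(s)) / (1 - dist.cdf(s))

# The exponential is memoryless, so conditioning on surviving to time s
# just shifts the curve: P(X <= s + u | X > s) = F(u).
print(np.isclose(conditional_cdf(15, 5), dist.cdf(10)))  # True
```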
Here i and j are indices for the causes. I wrote it that way because you don’t have to assume that causes i and j are independent for the math to work. But everything else will have to be independent.
Maybe the uncertainties shouldn’t be independent, but often they will be. Our uncertainty about the probability of AI doom is probably not related to our uncertainty about the probability of pandemic doom, for instance.
If the probability of extinction by cause i is p_i and the probability reduction for that cause is r_i, the probability of extinction becomes 1 − (1 − p_i + r_i) ∏_{j ≠ i} (1 − p_j) if you choose to focus on cause i.
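Here is a small sketch of that calculation, assuming independent causes; the per-cause extinction probabilities and the reductions achievable by focusing are made up for illustration.

```python
import numpy as np

# Hypothetical extinction probabilities per cause and the reduction
# achievable by focusing on each cause.
p = np.array([0.10, 0.05, 0.02])
r = np.array([0.04, 0.03, 0.01])

def extinction_probability(p, focus=None):
    """Total extinction probability under independence; focusing on a
    cause lowers its extinction probability by r[focus]."""
    p = p.copy()
    if focus is not None:
        p[focus] -= r[focus]
    return 1 - np.prod(1 - p)

baseline = extinction_probability(p)
best = min(range(3), key=lambda i: extinction_probability(p, focus=i))
print(best)  # the cause whose focus lowers extinction probability most
```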
I agree the argument doesn’t work, but there are at least two arguments for investing in charities with sub-optimal expected values that critically depend on time.
- Going bust. Suppose you have two charity investments A and B with expected values E_A > E_B, but there’s a potential for E_B > E_A in the future, for instance because you receive better information about the charities. If you invest once, investing everything in A is the correct answer, since E_A > E_B. Now suppose that each time you don’t invest in B, it has a chance of going bust. Then, if you invest more than once, it would be best to invest something in B if the probability of going bust is high enough and E_B > E_A with a sufficiently high probability.
- Signaling effects. Not investing in the charity may signal to charity entrepreneurs that there is nothing to gain by starting a new charity similar to B, thus limiting your future pool of potential investments. I can imagine this being especially important if your calculation of the expected value is contentious or has high epistemic uncertainty.
Edit: I think the “going bust” example is similar in spirit to the Kelly criterion, so I suppose you might say the argument does work.
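A crude simulation of the going-bust argument (all numbers are hypothetical): charity A has the higher expected value now, B might turn out better later, and B goes bust if it receives nothing. Hedging a little into B can then beat going all-in on A.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-period model. Period 1: split 1 unit between A
# (per-unit value 1.0) and B (per-unit value 0.8). If B receives
# nothing, it goes bust with probability 0.9. Period 2: with
# probability 0.5 you learn B is actually worth 2.0 per unit, and
# you invest 1 unit in the best surviving option.
a, b, b_high = 1.0, 0.8, 2.0
p_bust, p_flip = 0.9, 0.5

def expected_total(share_b, n=100_000):
    value1 = share_b * b + (1 - share_b) * a
    bust = (share_b == 0) & (rng.random(n) < p_bust)
    flip = rng.random(n) < p_flip
    value2 = np.where(flip & ~bust, b_high, a)
    return value1 + value2.mean()

print(expected_total(0.1) > expected_total(0.0))  # True: hedging wins here
```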
I agree that academic language should be avoided in both forums and research papers.
It might be a good idea for forum writers to use a tool like ChatGPT to make their posts more readable before posting them. For example, they can ask ChatGPT to “improve the readability” of their text. This way, writers don’t have to change their writing style too much and can avoid feeling uncomfortable while writing. Plus, it saves time by not having to go back and edit clunky sentences. Additionally, by asking ChatGPT to include more slang or colloquial language, the tool can better match the writer’s preferred style. (Written with the aid of ChatGPT in exactly the way I proposed. :p)