Postdoc in statistics. Three kids, two cats, one wife. I write about statistics, EA, psychometrics, and other things at my blog.
Jonas Moss
I don’t understand your notion of context here. I’m understanding pairwise comparisons as standard decision theory: you are comparing the expected values of two lotteries, nothing more. Is the context about psychology somehow? If so, that might be interesting, but it adds a layer of complexity this sort of methodology cannot be expected to handle.
Players may have different utility functions, but that might be reasonable to ignore when modelling all of this. In any case, every intervention $x$ will have its own, unique, expected utility $u_i(x)$ from each player $i$, hence a well-defined quantity to estimate. (This is ignoring noise in the estimates, but that is pretty easy to handle.)
Estimation is actually pretty easy (using linear regression) and has essentially been a solved problem since 1952: Scheffé, H. (1952). An Analysis of Variance for Paired Comparisons. Journal of the American Statistical Association, 47(259), 381–400. https://doi.org/10.1080/01621459.1952.10501179
I wrote about the methodology (before finding Scheffé's paper) here.
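For concreteness, here is a minimal sketch of the regression approach. The items, graded scores, and judges are entirely made up for illustration; it is not Scheffé's exact procedure, just the core idea of regressing pairwise scores on item effects.

```python
# A minimal sketch: graded pairwise comparison scores modelled as
# s ~ alpha_i - alpha_j + noise, with alpha estimated by least squares.
# All data below are invented for illustration.
import numpy as np

n_items = 4
# (i, j, score): a judge's graded preference for item i over item j.
comparisons = [
    (0, 1, 1.2), (0, 2, 2.1), (0, 3, 2.9),
    (1, 2, 0.8), (1, 3, 1.7), (2, 3, 1.1),
    (1, 0, -0.9), (2, 0, -2.3), (3, 1, -1.5),
]

# Design matrix: +1 for item i, -1 for item j, so s = X @ alpha + noise.
X = np.zeros((len(comparisons), n_items))
y = np.zeros(len(comparisons))
for row, (i, j, s) in enumerate(comparisons):
    X[row, i], X[row, j], y[row] = 1.0, -1.0, s

# alpha is only identified up to an additive constant; lstsq returns the
# minimum-norm solution, which here amounts to a sum-to-zero constraint.
alpha, *_ = np.linalg.lstsq(X, y, rcond=None)
print(alpha - alpha.mean())  # centered worth estimates
```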
Do I understand you correctly here?
Each agent has a computable partial preference ordering $\preceq$ that decides if it prefers $x$ to $y$.
We’d like this partial relation to be complete (i.e., defined for all pairs $x, y$) and transitive (i.e., $x \preceq y$ and $y \preceq z$ implies $x \preceq z$).
Now, if the relation is sufficiently non-trivial, it will be expensive to compute for some pairs $x, y$. So it’s better left undefined...?
If so, I can surely relate to that, as I often struggle to compute my preferences, even when they are theoretically complete. But it seems to me the relation is still defined; it just might not be practical to compute.
It’s also possible to think of it this way: you start out with a partial preference ordering and need to calculate one of its complete, transitive extensions. But that is computationally difficult, and the extension is not unique either.
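As a toy illustration of the non-uniqueness (my own example, with invented preferences over four items):

```python
# Enumerate the linear extensions of a small partial preference order.
# The known preferences are made up for illustration.
from itertools import permutations

items = "abcd"
# Known strict preferences: x is preferred to y.
known = {("a", "b"), ("b", "d"), ("c", "d")}

def respects(order):
    # order is a tuple from most to least preferred.
    pos = {x: k for k, x in enumerate(order)}
    return all(pos[x] < pos[y] for x, y in known)

for order in permutations(items):
    if respects(order):
        print(" > ".join(order))
# Prints several complete orderings, e.g. a > b > c > d and
# c > a > b > d, so the completion is indeed not unique.
```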
I’m unsure what these observations add to the discussion, though.
Some comments:
Have you considered hiring a designer for this document? It doesn’t look good at all and is cluttered with boldface all over the place.
Why is it so long? I don’t see why it’s important for vegans to know that cows are supplemented with vitamin B12.
It could have benefited a lot from lists of key takeaways. For instance: do you need to take vitamin D3 supplements, and if so, how much? Much of the document feels like an info dump to me.
Sure, if your goal is to be a good writer! But I’m not worried about that. I just want people to understand me.
As far as I can recall, my paragraphs are usually about half as long when I ask ChatGPT to simplify.
That said, I tend to write in an academic style.
I agree that academic language should be avoided in both forums and research papers.
It might be a good idea for forum writers to use a tool like ChatGPT to make their posts more readable before posting them. For example, they can ask ChatGPT to “improve the readability” of their text. This way, writers don’t have to change their writing style too much and can avoid feeling uncomfortable while writing. Plus, it saves time by not having to go back and edit clunky sentences. Additionally, writers can ask ChatGPT to include more slang or colloquial language so that the output better matches their preferred style. (Written with the aid of ChatGPT in exactly the way I proposed. :p)
If I understand you correctly, what you’re proposing is essentially a subset of classical decision theory with bounded utility functions. Recall that, under classical decision theory, we choose our action according to $a^\star = \operatorname{argmax}_{a \in A} E[u(a, \Theta)]$, where $\Theta$ is a random state of nature and $A$ an action space.
Suppose there are $n$ moral theories $T_1, \dots, T_n$ (infinitely many works too), each with probability $p_i$ and associated utility function $u_i$. Then we can define $u(a, \theta) = \sum_{i=1}^{n} p_i u_i(a, \theta)$. This step gives us (moral) uncertainty in our utility function.
Then, as far as I understand you, you want to define the component utility functions as indicators of acceptable outcomes, $u_i(a, \theta) = \mathbf{1}\{\text{the outcome of } a \text{ under } \theta \text{ is acceptable according to } T_i\}$. As $u_i$ only takes the values $0$ and $1$, $E[u_i(a, \Theta)]$ is then the probability of an acceptable outcome under $T_i$. And since we’re taking the expected value of these bounded component utilities to construct $u$, we’re in classical bounded utility function land.
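A minimal numerical sketch of this construction, with made-up credences, a made-up outcome model, and invented acceptability thresholds:

```python
# Expected utility under moral uncertainty with indicator component
# utilities. Everything numeric here is invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
states = rng.normal(size=10_000)   # draws of the random state of nature
p = np.array([0.5, 0.3, 0.2])      # credences in moral theories T_1..T_3

def expected_utility(action):
    # u_i is the indicator that the outcome is acceptable under T_i, so
    # its sample mean estimates P(acceptable under T_i); the thresholds
    # defining "acceptable" are made up.
    outcome = action + states
    acceptable = np.stack([outcome > 0.0, outcome > -1.0, outcome > 1.0])
    return p @ acceptable.mean(axis=1)

actions = [-0.5, 0.0, 0.5]
for a in actions:
    print(a, round(float(expected_utility(a)), 3))
print("best:", max(actions, key=expected_utility))
```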
That said, I believe that
This post would benefit from a rewrite of the paragraph starting with “Success maximization is a mechanism by which to generalize maxipok”. It states “Let $a$ be an action from the set of actions $A$.” Is $a$ an action, $s$ an action, or both? I also don’t understand what $s$ is. Are there states of nature in this framework? You say that $s$ is a moral theory, so it cannot be a state of nature?
You should add concrete examples. If you add one or two, it might become easier to understand what you’re doing even where the formal definitions are not 100% clear.
Thanks for writing this.
I wrote about “decay of predictions” here. I would classify the problem as hard.
Do you have a feeling for how suitable these are as academic projects, such as bachelor’s or master’s theses? It would be great to show a list of projects to students!
Could you elaborate?
Sorry, but I don’t understand what you mean.
Here’s the context I’m thinking about. Say you have two options $a$ and $b$. They have different true expected values $\mu_a$ and $\mu_b$. The market estimates their expectations as $\hat{\mu}_a$ and $\hat{\mu}_b$. And you (or the decider) choose the option with the highest estimated expectation. (I was unclear about estimation vs. true values in my previous comment.)
Does this have something to do with your remarks here?
Also, there’s always a way to implement “the market decides”. Instead of asking P(Emissions|treaty), ask P(Emissions|market advises treaty), and make the market advice = the closing prices. This obviously won’t be very helpful if no-one is likely to listen to the market, but again the point is to think about markets that people are likely to listen to.
Potential outcomes are very clearly and rigorously defined as collections of separate random variables; there is no “I know it when I see it” involved. In this case you choose between two options, and there is no conditional probability involved unless you actually need it for estimation purposes.
Let’s put it a different way. You have the option of flipping one of two coins, either a blue coin or a red coin. You estimate their probabilities of heads as $\hat{p}_{\text{blue}}$ and $\hat{p}_{\text{red}}$, and you base your choice of which coin to toss on which estimate is larger. There is actually no need to use scary-sounding terms like counterfactuals or potential outcomes at all; you’re just choosing between random outcomes.
We could create a separate market on how the decision market resolves, and it will resolve unambiguously.
That sounds like an unnecessarily convoluted solution to a question we do not need to solve!
However we deal with that, I expect the story ends up sounding quite similar to my original comment—the critical step is that the choice does not depend on anything but the closing price.
Yes, I agree. And that’s why I believe we shouldn’t use conditional probabilities at all, as doing so invites confusion.
In this case it would be best to use the language of counterfactuals (aka potential outcomes) instead of conditional expectations. In practice, the market would estimate $E[Y(a)]$ and $E[Y(b)]$ for the two potential outcomes $Y(a)$ and $Y(b)$, and you would choose the option with the highest estimated expected value. There is no need to put conditional probability into the mix at all, and it’s probably best not to, as there is no obvious probability to assign to the “events” “$a$ is chosen” and “$b$ is chosen”.
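As a minimal sketch, with made-up numbers, the whole decision procedure is just:

```python
# The decision rule described above: the market produces estimates of
# E[Y(a)] and E[Y(b)], and we pick the option with the highest estimate.
# The closing prices below are invented for illustration.
estimated = {"a": 0.62, "b": 0.55}  # closing prices read as estimates
choice = max(estimated, key=estimated.get)
print(choice)  # -> a
```

The choice depends on nothing but the closing prices, which is the point.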
Satan cuts an apple into a countable infinity of slices and offers it to Eve, one piece at a time. Each slice has positive utility for Eve. If Eve eats only finitely many pieces, there is no difficulty; she simply enjoys her snack. If she eats infinitely many pieces, however, she is banished from Paradise. To keep things simple, we may assume that the pieces are numbered: in each time interval, the choice is Take piece n or Don’t take piece n. Furthermore, Eve can reject piece n, but take later pieces. Taking any countably infinite set leads to the bad outcome (banishment). Finally, regardless of whether or not she is banished, Eve gets to keep (and eat) her pieces of apple. Call this the original version of Satan’s apple.
We shall sometimes discuss a simplified version of Satan’s apple, different from the original version in two respects. First, Eve is banished only if she takes all the pieces. Second, once Eve refuses a piece, she cannot take any more pieces. These restrictions make Satan’s apple a close analogue to the two earlier puzzles.
Problem: When should Eve stop taking pieces?
I think the Stack Exchange sites have automatic reminders, or maybe even checks, for this sort of thing. My last post on Cross Validated (the Stack Exchange site for statistics) had hints about reproducible examples, I think.
Gwern has a writing checklist. Similar checklists could be forced on the author prior to submission.
Thanks for your suggestions! Big fan of yours for many years, by the way. Mating Intelligence was the article collection that made me want to become an evolutionary psychologist (I ended up a statistician though, mostly due to the much safer career path).
I notice now that I didn’t say in the post that these four points are just a summary; the meat of the post is in what’s being linked to. I think I have explained these terms in the linked post, at least graded pairwise comparisons and discrete choice models. But yeah… I will modify the summary to use less technical jargon and provide an introduction.
I think it’s important to build more connections between EA approaches to value (e.g. in AI alignment) and existing behavioral sciences methods for studying values.
Yes, and also to academia in general. I honestly didn’t think about AI alignment when writing this post, but that could be one of the applications.
A peek at pairwise preference estimation in economics, marketing, and statistics
Thomas Hurka’s St Petersburg Paradox: Suppose you are offered a deal—you can press a button that has a 51% chance of creating a new world and doubling the total amount of utility, but a 49% chance of destroying the world and all utility in existence. If you want to maximise total expected utility, you ought to press the button—pressing the button has positive expected value. But the problem comes when you are asked whether you want to press the button again and again and again—at each point, the person trying to maximise expected utility ought to agree to press the button, but of course, eventually they will destroy everything.[2]
I have two gripes with this thought experiment. First, time is not modelled. Second, it’s left implicit why we should feel uneasy about the thought experiment, and that doesn’t work when philosophical intuitions are so highly variable. I honestly don’t feel uneasy about the thought experiment at all (only slightly annoyed). But maybe I would, had it been completely specified.
I can see two ways to add a time dimension to the problem. First, you could let all the presses be predetermined and happen in one go, which gets us into Satan’s apple territory. Second, you could have a 30-second pause between presses. But in that case, we would accumulate massive amounts of utility in a very short time; just the seconds in between presses would be invaluable! And who cares if the world ends in five minutes with probability close to one when every second it survives is so sweet? :p
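To make the arithmetic explicit, suppose one press every 30 seconds, so ten presses in the five minutes (the press count is just my illustrative reading of the setup). With initial utility $u_0$:

$P(\text{the world survives all } 10 \text{ presses}) = 0.51^{10} \approx 0.0012$, while the expected total utility after $n$ presses is $(0.51 \cdot 2)^n\, u_0 = 1.02^n\, u_0$, so only about $1.22\, u_0$ after ten presses.

On this timing the world is gone within five minutes with probability roughly $0.999$, even though every single press has positive expected value.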
I don’t understand the relevance of the Kelly criterion. The wikipedia page for the Kelly criterion states that “[t]he Kelly bet size is found by maximizing the expected value of the logarithm of wealth,” but that’s not relevant here, is it?
I’m not sure what you mean. I’m thinking about pairwise comparisons in the following way.
(a) Every pair of items $i, j$ has a true ratio of expectations $E(X_i)/E(X_j) = \mu_{ij}$. I hope this is uncontroversial. (b) We observe the variables $R_{ij}$ according to $\log R_{ij} = \log \mu_{ij} + \epsilon_{ij}$ for some normally distributed $\epsilon_{ij}$. Error terms might be dependent, but that complicates the analysis (and is most likely not worth it). This step could be more controversial, as there are other possible models to use.
Note that with this approach you will get a distribution over every $E(X_i)$ too, but in the Bayesian sense, i.e., $p(E(X_i) \mid \text{comparisons})$, provided we have a prior over the $E(X_i)$.
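Here is a minimal simulation sketch of (a) and (b), recovering the log-expectations by least squares; the true values and noise scale are invented, and a Bayesian treatment would replace the last step with a prior and a posterior.

```python
# Simulate noisy log-ratios log R_ij = log E(X_i) - log E(X_j) + eps_ij
# and recover the log-expectations by least squares. True values and
# the noise scale are made up for illustration.
import numpy as np

rng = np.random.default_rng(1)
true_log_means = np.array([0.0, 0.7, 1.5, 2.2])  # log E(X_i), invented
n = len(true_log_means)

pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
X = np.zeros((len(pairs), n))
y = np.zeros(len(pairs))
for row, (i, j) in enumerate(pairs):
    X[row, i], X[row, j] = 1.0, -1.0
    y[row] = true_log_means[i] - true_log_means[j] + rng.normal(scale=0.3)

# Ratios are scale-free, so the log-means are identified only up to a
# common constant; report the (minimum-norm) solution, centered.
est, *_ = np.linalg.lstsq(X, y, rcond=None)
print(est - est.mean())                       # estimates
print(true_log_means - true_log_means.mean()) # truth, for comparison
```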