Founded Northwestern EA club. Studied Math and Econ.
Starting a trading job in a few months and self-studying Python. Talk to me about cost-benefit analysis!
I don’t remember specifics, but he was looking at whether you could make certain claims about how models act on data outside the training data, based on the shape and characteristics of the training data. I know that’s vague, sorry; I’ll try to ask him and get a better summary.
It seems plausible that there are ≥100,000 researchers working on ML/AI in total. That’s a ratio of ~300:1, capabilities researchers:AGI safety researchers.
Barely anyone is going for the throat of solving the core difficulties of scalable alignment. Many of the people who are working on alignment are doing blue-sky theory, pretty disconnected from actual ML models.
One question I’m always left with is: what is the boundary between being an AGI safety researcher and a capabilities researcher?
For instance, my friend is getting his PhD in machine learning; he barely knows about EA or LW and definitely wouldn’t call himself a safety researcher. However, when I talk to him, it seems like the vast majority of his work deals with figuring out how ML systems act when put in situations foreign to their training data.
I can’t claim to really understand what he is doing, but it sounds to me a lot like safety research. And it’s not clear to me that this is “blue-sky theory”: a lot of the work he does is high-level maths proofs, but he also does lots of interfacing with ML systems and testing things on them. Is it fair to call my friend a capabilities researcher?
So I can choose then?
Yes, but I think to be very specific, we should call the problems A and B (for instance, the quiz is problem A and the exam is problem B), and a choice to work on problem A equates to spending your resource[1] on problem A in a certain time frame. We can represent this as $a_{i,j}$, where $i$ is the period in which we chose A and $j$ is the number of times we have picked A before. $j$ is sort of irrelevant for problem A, since we can use at most one resource to study for it, but it is relevant for problem B, where it represents the diminishing returns on repeated study.
What do we mean by ‘last’? Do you mean that the choice in period 1, $a_1$, yields benefits (or costs) in periods 1 and 2, while the choice in period 2, $a_2$, only affects outcomes in period 2?
Neither, if I’m understanding you correctly. I mean that the Scale of problem A in period 2, $U(A_2)$, is 0. This also implies that the marginal utility of working on problem A in period 2 is 0. For instance, if I study for my quiz after it happens, that studying is worthless. This is different from the diminishing returns that are at play when repeatedly studying for the same exam.
This is the extreme end of the spectrum, though. We can generalize this by acknowledging that the marginal utility of a certain problem is a function of time. For instance, it’s better to knock on doors for an election the day before than three years before, but probably not infinitely better.
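Written out roughly (just a sketch of the two shapes I have in mind): let $MU_P(t)$ be the marginal utility of spending a resource on problem $P$ at time $t$. Then

$$MU_{\text{quiz}}(t) = \begin{cases} m > 0, & t \le t_{\text{quiz}} \\ 0, & t > t_{\text{quiz}} \end{cases} \qquad\text{vs.}\qquad 0 < MU_{\text{doors}}(3\text{ years out}) < MU_{\text{doors}}(1\text{ day out}).$$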
Can you define this a bit? Which ‘choices’ have different scale, and what does that mean?
I think I may actually have used scale both to mean MU/resource and to mean: if we solve the entire problem, how much is that worth? Basically importance, as described in the ITN framework, except maybe I didn’t mean it as a function of the percent of work done but rather of the total. Generally, though, I think people consider this to be a constant (which I’m not sure they should), and in that case we are basically talking about the same thing, just divided by a factor of 100, which again doesn’t matter for this discussion.
I think what Eliot meant is importance, so that’s what I’m going to define it as, but I think you picked up on this confusion which is my bad.
By choices, I meant the problems, like the quiz or the exam. I think I used the wrong wording here, though, since ‘choices’ also denotes a specific decision to spend a resource on a problem. My fault for the confusion.
Maybe you want to define the sum of benefits
E.g.,
$U(a_1, a_2) = a$,
$U(a_1, b_2) = a + b$,
$U(b_1, a_2) = b$,
$U(b_1, b_2) = b + kb$,
Yes, basically, but I think that $a_{i,j}$ and $b_{i,j}$ and $U(A_i)$ are better notations, although it doesn’t really matter; I got what you were saying.
where a and b are positive numbers, and $k$ is a diminishing returns parameter?
Essentially yes, but with my notation.
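To spell that sum-of-benefits idea out in the subscripted notation (just a sketch, with $k$ standing in for whatever diminishing-returns factor we pick):

$$\begin{aligned} U(a_{1,0},\, a_{2,1}) &= a \\ U(a_{1,0},\, b_{2,0}) &= a + b \\ U(b_{1,0},\, a_{2,0}) &= b \\ U(b_{1,0},\, b_{2,1}) &= b + kb \end{aligned}$$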
For ‘different scale’ do you just mean something like $b > a$?
No. Taking b to mean $b_{1,0}$: b is the marginal utility of spending a resource in period 1 on problem B, not the total utility to be gained by solving problem B. Using the test example, the scale of B is either $\sum_{j=0}^{\infty} b k^j = \frac{b}{1-k}$, since this is the maximum grade I can achieve based on the convergent geometric sum described, or 20%, since this is the maximum grade total, although maybe it’s literally impossible for me to reach that. I’m not actually sure which to use, but I guess let’s go with 20%, and take the convergent sum to satisfy $\frac{b}{1-k} \le 20\%$.
What I meant was $U(B_1) > U(A_1)$, or 20% > 10% in the test example.
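For what it’s worth, the geometric-sum cap comes out cleanly if (purely as an illustration) the first session of test studying is worth $b = 10\%$ and each further session is worth half the last ($k = 1/2$):

$$\sum_{j=0}^{\infty} b\,k^{\,j} = \frac{b}{1-k} = \frac{10\%}{1 - 1/2} = 20\%.$$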
So this is like $U(a_1, b_2)$ above, if $a + b > b + kb$?
I think this was the point I was trying to make with the examples I gave you: the decision at t = 1 in a sequence of decisions that maximizes utility over multiple periods is not the same as the decision that maximizes utility at t = 1, which is what I believe you are pointing out here. In effect, we can have $b > a$ and yet $a + b > b + kb$.
But actually, I think the claim I originally made in response to him was a lot simpler than that, more along the lines of “a problem being urgent does not mean that its current scale is higher than if it were not urgent”. Taking $U(A_i)$ to be the Scale of problem A in the $i$th period, and taking problem A to be urgent to mean $U(A_2) = 0$, which I’m getting from the OP saying
Some areas can be waited for a longer time for humans to work on, name it, animal welfare, transhumanism.
my original claim in response to Elliot is something like: $U(A_2) = 0$ (problem A is urgent) and $U(A'_2) = U(A'_1)$ (problem $A'$ is not) does not imply $U(A_1) > U(A'_1)$, where $A'$ is the same problem as A with the urgency removed (the quiz moved from Monday to Wednesday).
The fact that I get no value out of studying for a Monday quiz on Tuesday doesn’t mean the quiz is now worth more than 10% of my grade. On the flip side, if the quiz was moved to Wednesday, it would still be worth 10% of my grade.
I think it was maybe not what Eliot meant. That being said, taking his words literally I do think this is what he implied. I’m not really sure honestly haha.
But that’s not just ‘because a has no value in period 2’ but also because of the diminishing returns on b (otherwise I might just choose b in both periods).
Correct. I think there are further specifications that might make my point less niche, but I’m not sure.
As an aside, I’m not sure I’m correct about any of this but I do wish the forum was a little more logic and math-heavy so that we could communicate better.
we could model a situation where you have multiple resources in every period, but here I choose to model it as if you have a single resource to spend in each period
What do you mean by
the rate at which they will grow or shrink over time.
Specifically, what mathematical quantity is “they”?
I don’t fully comprehend why we can’t include it. It seems like the ITN framework does not describe the future of the marginal utility per resource spent on the problem, but rather the MU/resource right now. If we want to generalize the ITN framework across time, which theoretically we need to do to choose a sequence of decisions, we need to incorporate the fact that tractability and scale are functions of time (and, even further, of the previous decisions we make).
All this is going to do is change the resulting answer from MU/$ to (MU/$)(t), where t is time. Everything still cancels out the same as before. In practice, I don’t know if this is actually useful.
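Roughly, the usual 80,000 Hours factorization just picks up a time argument everywhere, and the intermediate units cancel exactly as before:

$$\frac{\mathrm{MU}}{\$}(t) = \underbrace{\frac{\Delta U}{\Delta\%\ \text{solved}}(t)}_{\text{importance}} \times \underbrace{\frac{\Delta\%\ \text{solved}}{\Delta\%\ \text{resources}}(t)}_{\text{tractability}} \times \underbrace{\frac{\Delta\%\ \text{resources}}{\Delta\$}(t)}_{\text{neglectedness}}.$$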
Do you know if anyone else has written more about this?
The more I think about this the more confused I get… Going to formalize and answer your questions but it might not be done till tomorrow.
Ok so I’m trying to come up with an example where
You have two choices
You can only pick one of the choices per unit of time
One of the choices will last two units of time, the other will last one
The choices have different scale, but this difference has nothing to do with lock-in, and you should pick the choice with less scale because it is only available for one unit of time
I think an example that perfectly reflects this is hard to come up with, but there are many things that are close.
I have a quiz on Monday worth 10% of my grade and a test on Friday worth 20%. The intersection of the materials on the two exams is the null set. I have enough time between Monday and Friday to study for the test and hit sufficiently diminishing returns that the extra day of studying on Monday would increase my test grade by less than 1/2 of how much studying for the quiz would increase my quiz grade (a toy version of the numbers is sketched in code after these examples).
I’m a congressman, and I have two bills that I’m writing: gun control and immigration. The gun control bill needs to be finished by Monday, and the immigration bill by Friday. The rest of this example is the same as above.
I go to school with Isaac Newton, and convincing him to go into AI safety will provide 2 utility. I also realize there isn’t an EA club at my school, and starting the club will provide 1 utility. I know that Isaac isn’t applying to jobs for a few months, and I only need 1 hour of his time to hit diminishing returns on increasing his chance of going into AI safety. The deadline for starting a club next year is tomorrow.
Of course, these examples are still underspecified and set in a vacuum. Opportunity cost is rarely just about trading off two options; in the real world we have many. I think you are thinking more about cause areas and I’m thinking more about specific interventions. However, I think this could extend more broadly; it would just be more confusing to work out.
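If it helps, here is the quiz/test example as a toy calculation. The specific numbers (10% for the quiz, 10% for the first test session, a 1/2 decay per extra session, two study periods) are assumptions for illustration only:

```python
# Toy version of the quiz/test example. All numbers are illustrative assumptions:
# the quiz (problem A) is worth 10% of my grade and only study-able in period 1;
# the test (problem B) can be studied in either period, with each extra session
# worth half the previous one.

QUIZ_VALUE = 0.10          # grade gained by studying for the quiz in period 1
TEST_FIRST_SESSION = 0.10  # grade gained by the first session of test study
DECAY = 0.5                # each further test session is worth DECAY times the last


def test_gain(sessions_already_spent):
    """Marginal grade gain from one more session of test study."""
    return TEST_FIRST_SESSION * DECAY ** sessions_already_spent


def total_utility(plan):
    """Total grade gained from a two-period study plan, e.g. ('quiz', 'test')."""
    total, test_sessions = 0.0, 0
    for period, choice in enumerate(plan, start=1):
        if choice == "quiz":
            total += QUIZ_VALUE if period == 1 else 0.0  # quiz study is worthless after the quiz
        else:
            total += test_gain(test_sessions)
            test_sessions += 1
    return total


for plan in [("quiz", "quiz"), ("quiz", "test"), ("test", "quiz"), ("test", "test")]:
    print(plan, round(total_utility(plan), 3))

# ('quiz', 'test') wins with 0.20: the quiz goes first because it expires,
# even though the test has the larger scale overall.
```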
Was this meant to be a response to my comment? I can’t tell. If so, I’ll try to come up with some examples.
I agree with your assessment that Vasco’s comment is not really on topic.
I also feel like there is a lack of substantive discussion and just overall engagement on the forum (this post and comment section being an exception).
I’m not exactly sure why this is (maybe there just aren’t enough EAs), but it seems related to users worrying that their comments might not add value, combined with the lack of anonymity and in-group dynamics. In general, I find Hacker News and subreddits like r/neoliberal to be significantly more thought-provoking and engaging, even though the commenters there are often engaging more hedonistically and less to add value. On the margin, the EA Forum should be more serious and have stricter norms than those communities, but I’m worried that forum users optimizing individual posts and comments for usefulness is lowering the overall usefulness of the forum.
You can only press one button per year due to time/resource/etc. constraints. Moreover, you can only press each button once.
No I wasn’t
FYI I edited the comment slightly, but it doesn’t change anything. Can you explain how the urgency of the button presses relates to the scale?
Let’s say I can press button a, which will create 1 utility, or button b, which will create 2 utility.
Button a is only pressable for the next year, while button b is pressable for the next two years.
In this example, I believe the scale has nothing to do with the urgency.
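Spelling out the arithmetic for the two feasible orderings:

$$U(a \text{ in year 1},\ b \text{ in year 2}) = 1 + 2 = 3, \qquad U(b \text{ in year 1},\ a \text{ expired in year 2}) = 2 + 0 = 2,$$

so the right move is to press a first; the urgency changes the ordering of the presses, not the scale of either button.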
I would say we are basically on the exact same page in terms of the overall vision. I’m also trying to get at these logical chains of information that we can travel backwards through to easily sanity check and also do data analysis.
Where I think we diverge is that if there is no underlying structure to these logical chains beyond a bunch of arrows pointing between links, our ability to automate and extract insights is reduced.
A few examples
You link to an EA Forum post with multiple claims. In order to build logical chains, we now need a database that stores each claim in each post, which means convincing everyone to use a certain format for claims or trying to use an LLM to parse them.
You link multiple sources, which themselves link multiple sources. Since linking is just drawing arrows in an abstract sense, I have no ability to discern how much each source went into the guess. I assume we would just use a uniform distribution to model how much each source went into the final guess? But this is clearly terribly off in many cases, so we lose a lot of information (see the sketch below).
If we link to models, we hold a lot more information down the chain.
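As a toy illustration of the information we lose with bare links (the graph and the names below are made up), splitting credit uniformly at every arrow gives:

```python
# Hypothetical sketch of the problem with bare links: if a guess only links its
# sources, the best we can do is split credit uniformly at every arrow, and that
# assumption compounds down the chain.

links = {
    "final_guess": ["post_1", "model_x", "post_2"],  # the guess links three sources
    "post_1": ["dataset_a", "dataset_b"],            # one source links two more
}


def uniform_influence(node, weight=1.0):
    """Attribute `weight` of a node's guess to its ultimate sources,
    splitting uniformly at each link because a bare arrow tells us nothing more."""
    sources = links.get(node)
    if not sources:
        return {node: weight}
    influence = {}
    for src in sources:
        for leaf, w in uniform_influence(src, weight / len(sources)).items():
            influence[leaf] = influence.get(leaf, 0.0) + w
    return influence


print({k: round(v, 3) for k, v in uniform_influence("final_guess").items()})
# {'dataset_a': 0.167, 'dataset_b': 0.167, 'model_x': 0.333, 'post_2': 0.333}
# If post_1 actually drove 90% of the guess, that information is gone -- which is
# exactly what linking to an explicit model would have preserved.
```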
Overall, I wouldn’t say my proposition is a full substitute for your idea, but I think there is overlapping functionality.
A few things:
I only skimmed your post (let me know if I’m misunderstanding), but I have an issue with this idea. Many forecasts require complicated mathematical models to describe. You can’t simply link to sources; you also need to link to a model. Blog posts and txt files, which are essentially what the forum is, are extremely hard to scrape and parse unless everyone starts adopting conventions. So your functionality maxes out at linking, which isn’t very automated.
If you are recommending connecting a full mathematical model from the forum, let me suggest that rather than connecting Metaculus to the forum, you connect it to https://www.getguesstimate.com/models, as this is much more scalable and clear.
Thank you for thinking about these things; it inspired me to make my own post.
A theoretical idea that could be implemented in Metaculus
tl;dr: add an option to submit models of how to forecast a question, along with voting on the models.
To be more concrete: when someone submits a question, in addition to forecasting the question, you can submit a Squiggle (or just plain mathematical) model of your best current guess at how to approach the problem. You define each subcomponent that is important to the final forecast, and also how these subcomponents combine into the final forecast. Each subcomponent automatically becomes another forecasting question on the site that people can do the same to (if it is not already one).
Then in addition to a normal forecast, as we do right now, people can also forecast the subcomponents of the models, as well as vote on the models. If a model already includes previously forecasted questions, they automatically populate in the model.
The voting system on models could either just draw attention to the best models and encourage forecasting of the subcomponents, or even weight the models’ estimates into the overall forecast of the question. No idea if this would improve forecasting, but it might make it more transparent and scalable.
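As a rough sketch of what a submitted model could look like (the question, the subcomponents, and every number below are placeholders I made up), in Python rather than Squiggle:

```python
# A sketch of a forecast model built from forecastable subcomponents. Each named
# input is itself a question others could forecast, and the community's
# distribution for it could populate the model automatically.

import math
import random


def lognormal_from_90ci(low, high):
    """Sample from a lognormal roughly matching a 90% credible interval."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * 1.645)
    return random.lognormvariate(mu, sigma)


def model_sample():
    groups_attempting = lognormal_from_90ci(3, 30)   # sub-question 1 (placeholder CI)
    success_prob_each = random.uniform(0.01, 0.10)   # sub-question 2 (placeholder range)
    # How the subcomponents combine into the headline forecast:
    return 1 - (1 - success_prob_each) ** groups_attempting


samples = sorted(model_sample() for _ in range(10_000))
print("median headline forecast:", round(samples[5_000], 3))
```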
I wrote a bit more in this google doc if interested.
edit: I think this might just be guesstimate with memoization
I love this!
Sort of an aside, but it would be really lovely if we could build a database with every prediction on Metaculus and also tons of estimates from other sources (academia, etc.) so that people could do Squiggle-type estimations that source all the constants.
I definitely have very little idea what I’m talking about, but I guess part of my confusion is that inner alignment seems like a capability of AI? Apologies if I’m just confused.