Doesn’t look like it
MathiasKB🔸
This is what my frontpage looks like
Naively, is there a case for using the average of the two?
[Linkpost] A Narrow Path—How to Secure our Future
Two ideas off the top of my head
Distribution of charitable goods (such as insecticide nets or cash-transfers) and its effect on the economic growth of the region/country.
Proportion of people identifying as flexitarian/vegetarian/vegan and the effects on sales of plant-based products.
EDIT: Someone on LessWrong linked a great report by Epoch which tries to answer exactly this.
With the release of OpenAI o1, I want to ask a question I’ve been wondering about for a few months.
Like the Chinchilla paper, which estimated the optimal ratio of data to compute, are there any similar estimates for the optimal ratio of compute to spend on inference vs training? In the release they show this chart:
The chart somewhat gets at what I want to know, but doesn’t answer it completely. How much additional inference compute would a 1e25 o1-like model need to perform as well as a one-shotted 1e26 model?
Additionally, for some x number of queries, what is the optimal ratio of compute to spend on training versus inference? How does that change for different values of x?
Are there any public attempts at estimating this stuff? If so, where can I read about it?
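To make the question concrete, here is a minimal sketch of the accounting I have in mind. It assumes total compute is simply training compute plus a fixed inference cost per query times x queries, and it assumes the two models reach the same performance, which is exactly the part I don't know. All function names and numbers below are made up for illustration and are not taken from the o1 release or any paper.

```python
import math


def total_compute(train_flop, inference_flop_per_query, num_queries):
    """Total FLOP spent: one-off training plus per-query inference."""
    return train_flop + inference_flop_per_query * num_queries


def break_even_queries(small_train, small_inf, big_train, big_inf):
    """Number of queries x at which both options cost the same total compute.

    Returns None if the smaller model is not more expensive per query,
    in which case it is cheaper for every x.
    """
    extra_train = big_train - small_train   # extra training cost of the big model
    extra_inf = small_inf - big_inf         # extra per-query cost of the small model
    if extra_inf <= 0:
        return None
    return extra_train / extra_inf


# Hypothetical numbers: a 1e25 FLOP model that "thinks" for 1e18 FLOP per query
# versus a 1e26 FLOP model that answers in one shot for 1e17 FLOP per query.
x_star = break_even_queries(1e25, 1e18, 1e26, 1e17)
print(f"break-even at roughly {x_star:.3g} queries")
```

Below the break-even x, spending on inference is cheaper under these assumptions; above it, the bigger training run wins. The real question is what the per-query inference cost has to be for the performance to actually match.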
Thanks for all of the hard work you put into developing and maintaining it!
Thanks, this was quite informative. Would love to read the full report, but $1000 is a bit steep!
Without having read the letter yet, why do you find it questionable?
Good question. I’m actually not sure how it gets into my email, and I can’t find it on the website either.
edit: I think it’s through the forecasting newsletter
I can highly recommend following Sentinel’s weekly minutes, a weekly update from superforecasters on the likelihood of any events which plausibly could cause worldwide catastrophe.
It’s perhaps the weekly newsletter I look forward to the most at this point. Read previous issues here:
Jeff, your notes on NAO are fascinating to read! I have nothing to add other than that I hope you keep posting them
Hi Ian,
Thanks for the question! I’ve been meaning to write down my thoughts on this for a while, so here is a longer perspective:
In 2015 USAID teamed up with GiveWell to cash-benchmark one of its programmes. The evidence came back showing that cash-transfers outperformed the programme on every metric. What gets brought up less often is that the programme got its funding renewed shortly after anyway! The cash-benchmark alone was not sufficient; you also need a policy requiring that programmes which perform worse than cash be wound down.
This is a sentiment I’m fully behind. But what exactly that policy should look like is where it gets tricky.
How should the ministry cash-benchmark a music festival in Mali?[1] What is the cash-benchmark for a programme that monitors the Senegalese election to ensure it is fair? If the cash-benchmark should only be for certain types of programming amenable to cash comparisons, such as global health, how will that shift funding?
I worry that instituting a selective high bar will move funding away from broadly cost-effective areas which can be benchmarked against cash, to broadly ineffective areas which can’t be easily benchmarked against cash.
But even within areas amenable to cash-benchmarking, it’s unclear what the policy should look like. How should the ministry cash-benchmark its funding to a large multilateral which will go to fund a thousand programmes across the world?
The answer many arrive at is: “Clearly we need to move from demanding literal cash-arms to just making estimates of how impactful programmes and organizations are compared to cash-transfers. That way we still get the nice hurdle-rate that programmes must be compared against, which is what we were really after anyway.”
But that development ministries should systematically estimate and compare the impact of projects is what development economists have been shouting for decades!
To an extent, the ministry’s lack of systematic measurement and comparison is a feature, not a bug. Almost any instantiation of cash-benchmarking removes wriggle room to fund projects which are valuable for reasons you didn’t want to state out loud. From a minister’s perspective, cash-benchmarking doesn’t solve any problems, it creates one!
[1] This is not a facetious example, but a real project funded by the Norwegian government.
As a side note, one thing I find amusing is just how much it sucks to announce your org’s shutdown after Maternal Health Initiative set the bar so ridiculously high.
Even at shutting down they have us beat!
Center for Effective Aid Policy has shut down
Any ideas for what we can do to improve it?
The whole Manifund debacle has left me quite demotivated. It really sucks that people are more interested in debating contentious community drama than seemingly anything else this forum has to offer.
Why are seitan products so expensive?
I strongly believe in the price, taste, convenience hypothesis. If/when non-animal foods are cheaper and tastier, I expect the West to undergo a moral cascade where factory farming will go from being commonplace to illegal in a very short timespan. I know that in the animal welfare space this viewpoint is often considered naive, but I remain convinced it’s true.
My mother buys the expensive vegan mayonnaise because it’s much tastier than the regular mayonnaise. I still eat dairy and eggs because the vegan alternatives suck.
What I don’t understand is why vegan alternatives have proven so difficult to make cheap and tasty. Are there any good write-ups on this?
Like when I go to a supermarket in Copenhagen, every seitan product will charge a significant markup over the raw cost of the ingredients (Amazon will sell you kilos of seitan flour at very little cost).
Do consumers have sufficiently inelastic preferences that a small-market, high-markup strategy is the most profitable? Is the final market too small for producers to reach economies of scale for seitan, or is it just difficult to bootstrap?
I would love to better understand what the demand curves look like for various categories of vegan products, as I really can’t wrap my mind around how the current equilibrium came about.
I’m keeping an eye out for Sentinel’s analyses: https://forecasting.substack.com/p/alert-minutes-for-week-172024
I’m worried too!
I strongly upvoted this post because I’m extremely interested in seeing it get more attention and, hopefully, a potential rebuttal. I think this is extremely important to get to the bottom of!
At first glance your critiques seem pretty damning, but I would have to put a bunch of time into understanding ACE’s evaluations before I would be able to conclude whether I agree with your critiques (I can spend a weekend day doing this and writing up my own thoughts in a new post if there is interest).
My expectation is that if I were to do this I would come out feeling less confident than you seem to be. I’m a bit concerned that you haven’t made an attempt at explaining why ACE might have constructed their analyses this way.
But like, I’m pretty confused too. It’s hard to think of much justification for the choice of numbers in the ‘Impact Potential Score’, and deciding the impact of a book based on the average of all books doesn’t seem like the best way to approach things?