Javier Prieto🔸

Karma: 333

Javier Prieto🔸 12 Sep 2024 5:48 UTC
3 points
0 ∶ 0
on: Giv Effektivt (DK) need ~110 more members to be able to offer tax deductions of around $66.000)
I tried to sign up but the payment step keeps giving an error. This happens both when I enter my card details and with Google Pay.

Javier Prieto🔸 15 Nov 2023 22:58 UTC
2 points
0 ∶ 0
on: Kids or No Kids
Thanks for writing this!
Since your decision seems to come down to the expected positive effect on your happiness, I’m curious whether you considered even cheaper happiness-boosting interventions. For example, hundreds (thousands?) of hours of meditation might give you the “love, belonging, connection” and “personal growth” benefits with fewer downsides, though this might work less reliably than having kids.

Javier Prieto🔸 2 Aug 2023 10:38 UTC
3 points
0 ∶ 0
in reply to: Peter Mühlbacher’s comment on: Takeaways from the Metaculus AI Progress Tournament
Thanks, Peter!
To your questions:
1. I’m fairly confident (let’s say 80%) that Metaculus has underestimated progress on benchmarks so far. This doesn’t mean it will keep doing so in the future because (i) forecasters may have learned from this experience to be more bullish and/or (ii) AI progress might slow down. I wouldn’t bet on (ii), but I expect (i) has already happened to some extent—it has certainly happened to me!
2. The other categories have fewer questions and some have special circumstances that make the evidence of bias much weaker in my view. Specifically, the biggest misses in “compute” came from GPU price spikes that can probably be explained by post-COVID supply chain disruptions and increased demand from crypto miners. Both of these factors were transient.
3. I like your example with the two independent dice. My takeaway is that, if you have access to a prior that’s more informative than a uniform distribution (in this case, “both dice are unbiased so their sum must be a triangular distribution”), then you should compare your performance against that. My assumption when writing this was that a (log-)uniform prior over the relevant range was the best we could do for these questions. This is in line with the fact that Metaculus’s log score on continuous questions is normalized using a (log-)uniform distribution.
4. That’s a good point re: different time horizons. I didn’t bother to check the average time between close and resolution for all questions on the platform, but, assuming it’s <<1 year as you suggest, I agree it’s an important caveat. If you know that number off the top of your head, I’ll add it to the post.
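To make point 3 concrete, here is a minimal sketch of the two-dice comparison. The function name `expected_log_score_bits` and the dictionaries are hypothetical and purely illustrative. It treats the sum of two fair dice as the true distribution and computes the expected log score of one forecast relative to a baseline.

```python
import math

# Exact distribution of the sum of two fair six-sided dice (sums 2..12).
triangular = {s: (6 - abs(s - 7)) / 36 for s in range(2, 13)}
# Uniform baseline over the same 11 possible sums.
uniform = {s: 1 / 11 for s in range(2, 13)}

def expected_log_score_bits(forecast, baseline, truth):
    """Expected log2 score of `forecast` relative to `baseline` when outcomes follow `truth`."""
    return sum(truth[s] * math.log2(forecast[s] / baseline[s]) for s in truth)

# A forecaster who knows the dice are fair, scored against the uniform baseline.
edge_over_uniform = expected_log_score_bits(triangular, uniform, triangular)
# A uniform forecaster, scored against the more informative triangular baseline.
uniform_vs_triangular = expected_log_score_bits(uniform, triangular, triangular)

print(f"triangular vs uniform baseline: {edge_over_uniform:+.3f} bits")
print(f"uniform vs triangular baseline: {uniform_vs_triangular:+.3f} bits")
```

Under these assumptions, a forecast that scores exactly zero against the uniform baseline has a negative expected score once the baseline is the triangular prior, which is why the choice of baseline matters.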

Javier Prieto🔸 31 Jul 2023 9:42 UTC
3 points
0 ∶ 0
in reply to: Lukas_Gloor’s comment on: Takeaways from the Metaculus AI Progress Tournament
That’s right. When defined using a base 2 logarithm, the score can be interpreted as “bits of information over the maximally uncertain (uniform) distribution”. Forecasts assigning less probability mass to the true outcome than the uniform distribution result in a negative score.
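As a rough illustration, here is a sketch for a discrete question with equally spaced outcomes. The helper `log_score_bits` is a hypothetical name, and the actual Metaculus scoring of continuous questions works with densities rather than the discrete probabilities assumed here.

```python
import math

def log_score_bits(p_true: float, n_outcomes: int) -> float:
    """Bits gained over a uniform forecast that puts 1/n_outcomes on every outcome."""
    return math.log2(p_true * n_outcomes)

# Four possible outcomes, so the uniform forecast assigns 0.25 to each.
print(log_score_bits(0.5, 4))    # more mass on the truth than uniform: positive
print(log_score_bits(0.25, 4))   # same as uniform: zero
print(log_score_bits(0.125, 4))  # less mass on the truth than uniform: negative
```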

Takeaways from the Metaculus AI Progress Tournament

Javier Prieto🔸 27 Jul 2023 14:37 UTC

85 points

6 comments, 4 min read

Javier Prieto🔸 19 Jul 2023 22:18 UTC
3 points
0 ∶ 0
in reply to: LMF’s comment on: Some thoughts on quadratic funding
Have you considered contacting the authors of the original QF paper? Glenn and Vitalik seem quite approachable. You could also post the paper on the RxC discord or (if you’re willing to go for a high-effort alternative) submit it to their next conference.

Javier Prieto🔸 12 Jul 2023 19:31 UTC
6 points
0 ∶ 0
on: Some thoughts on quadratic funding
Thanks for writing this up!
I think your (largely negative) results on QF under incomplete information should be more widely known. I consider myself to be relatively “plugged” into the online communities that have discussed QF the most (RxC, crypto, etc.) and I only learned about your paper a couple of months ago.
Here are a few more scattered thoughts prompted by the post:
1. I’m really intrigued by the dynamic setting and its potential to alleviate the information problem to some extent. I agree there should be more work on this, theoretical or empirical.
2. Showing endogenous CQF is (in)efficient under complete information sounds relatively easy, right? I would love it if someone did this or explained why my intuition about hardness is wrong! (Though I expect an eventual efficiency proof wouldn’t go through under incomplete information for the same reasons as in regular QF, so I’m not sure how useful this is in practice.)
3. Agree with all your points on matching and coordination – the mechanism doesn’t seem to be a good fit there.
4. In the section on grantmaking, you seem to assume that experts wouldn’t be paying out of their own pockets, but this could be implemented with the following setup: the donor gives them a regranting pot that they can keep for themselves or spend on other projects that will be matched quadratically.
5. I didn’t know quadratic voting is efficient under incomplete information. Add that to its other advantages (simplicity, budget, etc.) and it comes out as a much stronger option than QF. I have no take on whether it’s better or worse than the other mechanisms you mention, though my sense is that approval voting is the darling of many electoral reform wonks.

Javier Prieto🔸 27 Jun 2023 17:22 UTC
1 point
0 ∶ 0
in reply to: Julia_Wise🔸’s comment on: Decision-making and decentralisation in EA
I’ve been thinking about regranting on and off for about a year, specifically about whether it makes sense to use bespoke mechanisms like quadratic funding or some of its close cousins. I still don’t know where I land on many design choices, so I won’t say more about that now.
I’m not aware of any retrospective on FTXFF’s program, but it might be a good idea to do one when we have enough information to evaluate performance (so in 6-12 months?). Another thing in this vein that I think would be valuable and could happen right away is looking into SFF’s S-process.

Javier Prieto🔸 5 May 2023 11:58 UTC
1 point
0 ∶ 0
on: AI risk/reward: A simple model
Cool app!
Are you pulling data from Manifold at all or is the backend “just” a squiggle model? If the latter, did you embed the markets by hand or are you automating it by searching the node text on Manifold and pulling the first market that pops up or something like that?

Javier Prieto🔸 11 Apr 2023 10:16 UTC
1 point
0 ∶ 0
in reply to: Diego Oliveira 🔸’s comment on: How accurate are Open Phil’s predictions?
Thanks! That’s a reasonable strategy if you can choose question wording. I agree there’s no difference mathematically, but I’m not so sure that’s true cognitively. Sometimes I’ve seen asymmetric calibration curves that look fine >50% but tend to overpredict <50%. That suggests it’s easier to stay calibrated in the subset of questions you think are more likely to happen than not. This is good news for your strategy! However, note that this is based on a few anecdotal observations, so I’d caution against updating too strongly on it.

Javier Prieto🔸 6 Apr 2023 0:58 UTC
10 points
4 ∶ 0
in reply to: NunoSempere’s comment on: Wisdom of the Crowd vs. “the Best of the Best of the Best”
Glad you brought up real money markets because the real choice here isn’t “5 unpaid superforecasters” vs “200 unpaid average forecasters” but “5 really good people who charge $200/h” vs “200 internet anons that’ll do it for peanuts”. Once you notice the difference in unit labor costs, the question becomes: for a fixed budget, what’s the optimal trade-off between crowd size and skill? I’m really uncertain about that myself and have never seen good data on it.

Javier Prieto🔸 1 Mar 2023 19:28 UTC
11 points
0 ∶ 0
on: Scoring forecasts from the 2016 “Expert Survey on Progress in AI”
Great analysis!
I wonder what would happen if you were to do the same exercise with the fixed-year predictions under a ‘constant risk’ model, i.e. P(t) = 1 - exp(-l*t) with l = -log(1 - P(year)) / year, to get around the problem that we’re still 3 years away from 2026. Given that timelines are systematically longer with a fixed-year framing, I would expect the Brier score of those predictions to be worse. OTOH, the constant risk model doesn’t seem very reasonable here, so the results wouldn’t have a straightforward interpretation.
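A minimal sketch of that conversion, assuming a constant hazard rate: solving P(T) = 1 - exp(-l*T) for l gives l = -log(1 - P(T)) / T. The function names and the example numbers are hypothetical and are not taken from the survey.

```python
import math

def hazard_rate(p_by_horizon: float, horizon_years: float) -> float:
    """Constant hazard rate l implied by probability p_by_horizon of the event within horizon_years."""
    return -math.log(1.0 - p_by_horizon) / horizon_years

def prob_by(t_years: float, rate: float) -> float:
    """Cumulative probability P(t) = 1 - exp(-rate * t) under the constant-risk model."""
    return 1.0 - math.exp(-rate * t_years)

# Illustrative only: a 50% fixed-year forecast at a 10-year horizon,
# rescaled to the 7 years that have already elapsed.
rate = hazard_rate(0.5, 10.0)
print(f"implied rate: {rate:.4f} per year")
print(f"implied P(7 years): {prob_by(7.0, rate):.3f}")
```

The rescaled probability could then be scored against what has actually happened so far, with the caveat above that a constant hazard is a strong assumption.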

Javier Prieto🔸 30 Jan 2023 21:33 UTC
11 points
2 ∶ 0
on: FIRE & EA: Seeking feedback on “Fi-lanthropy” Calculator
This is really cool! As someone who’s been doing these calculations in a somewhat haphazard way using a mix of pen and paper, spreadsheets, and Python scripts for years, it’s nice to see someone put in the work to create a polished product that others can use.
Something that I’ve been meaning to incorporate into my estimates and that would be a killer feature for an app like this is a reasonable projection of future earnings, under the assumption that you’ll get promoted / switch career paths at the average rate for someone in your current position. Sprinkle a bit of uncertainty on top, and you can get out a nice probability distribution over “time at FI” and “total money donated”.

Javier Prieto🔸 14 Dec 2022 20:33 UTC
0 points
0 ∶ 0
in reply to: NicoleJaneway 🔸’s comment on: Personal Finance for EAs
Thanks for the tip!

Javier Prieto🔸 14 Dec 2022 18:54 UTC
4 points
1 ∶ 0
on: Personal Finance for EAs
A product I would personally like to see because it’d be tremendously useful to me is “personal finance for nomadic EAs” i.e. if location is at most a minor constraint for you, where should you move to maximize the resources available to the effective charities of your choice? I expect that, for most people without such constraints, packing up and leaving is probably much more effective than fine-tuning the strategy to the place where they currently reside.

Javier Prieto🔸 16 Nov 2022 13:14 UTC
9 points
0 ∶ 0
on: Some research ideas in forecasting
Your likelihood_pool method is returning Brier scores >1. How is that possible? Also, unless you extremize, it should yield the same aggregates (and scores) as regular geometric mean of odds, no?
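For reference, here is a sketch of what I have in mind, assuming binary questions and the single-probability form of the Brier score. The functions `geo_mean_odds` and `brier` are hypothetical stand-ins and are not the code from the post.

```python
import math

def geo_mean_odds(probs):
    """Aggregate binary forecasts via the geometric mean of odds (the mean of log-odds)."""
    log_odds = [math.log(p / (1.0 - p)) for p in probs]  # requires 0 < p < 1
    mean_log_odds = sum(log_odds) / len(log_odds)
    return 1.0 / (1.0 + math.exp(-mean_log_odds))

def brier(p: float, outcome: int) -> float:
    """Brier score for one binary forecast: squared error between p and the 0/1 outcome."""
    return (p - outcome) ** 2

forecasts = [0.6, 0.7, 0.9]
pooled = geo_mean_odds(forecasts)
print(f"pooled forecast: {pooled:.3f}")
print(f"Brier if it resolves yes: {brier(pooled, 1):.3f}")
print(f"Brier if it resolves no:  {brier(pooled, 0):.3f}")
```

In this form the score always lies between 0 and 1. The variant that sums squared errors over both outcomes ranges from 0 to 2, which would be one way to end up with values above 1.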

Javier Prieto🔸 9 Sep 2022 14:53 UTC
1 point
0 ∶ 0
on: Cause area: Short-sleeper genes
Thanks for posting this! I think this topic is extremely neglected and the lack of side effects among natural short-sleepers strongly suggests that there could be interventions with no obvious downsides.
My main concern with your drug-centered approach is: what if the causal path from short-sleeper genes to a short-sleeper phenotype flows through neurodevelopmental pathways, such that once neural structures are locked in during adulthood it’s not possible to induce the desired phenotype by mimicking the direct effects of the genes? If this is true, then reaping the benefits of short-sleeper genes would seem to require genetic engineering (I doubt embryo selection would scale given the low frequency of the target alleles). This would obviously be politically problematic and I’m not sure it’d be technically feasible right away (last time I checked, CRISPR people were worried about off-target mutations, but I’m not up to date with that literature so this may not be an issue anymore).

Javier Prieto🔸 24 Aug 2022 18:03 UTC
6 points
0 ∶ 0
on: Open Phil is seeking bilingual people to help translate EA/EA-adjacent web content into non-English languages
Have you considered holding out some languages at random to assess the impact of the program? You could e.g. delay funding for some languages by 1-2 years and try to estimate the difference in some relevant outcome during that period. I understand this may be hard or undesirable for several reasons (finding and measuring the right outcomes, opportunity costs, managing grantee expectations).

A Critique of The Precipice: Chapter 6 - The Risk Landscape [Red Team Challenge]

Sarah Weiler 26 Jun 2022 10:59 UTC

57 points

2 comments, 21 min read

Javier Prieto🔸 21 Jun 2022 16:45 UTC
2 points
0 ∶ 0
in reply to: Dan_Keys’s comment on: How accurate are Open Phil’s predictions?
We do track whether predictions have a positive (“good thing will happen”) or negative (“bad thing will happen”) framing, so testing for optimism/pessimism bias is definitely possible. However, only 2% of predictions have a negative framing, so our sample size is too low to say anything conclusive about this yet.
Enriching our database with base rates and categories would be fantastic, but my hunch is that given the nature and phrasing of our questions this would be impossible to do at scale. I’m much more bullish on per-predictor analyses and that’s more or less what we’re doing with the individual dashboards.
