nikos

Karma: 647

[LW xpost] Unit economics of LLM APIs

dschwarz Aug 27, 2024, 4:55 PM

19 points

2 comments, 1 min read, EA link

(www.lesswrong.com)

A breakdown of OpenAI’s revenue

dschwarz Jul 10, 2024, 6:07 PM

58 points

8 comments, 1 min read, EA link

Mirror, Mirror on the Wall: How Do Forecasters Fare by Their Own Call?

nikos Nov 7, 2023, 5:37 PM

20 points

0 comments, 14 min read, EA link

nikos Oct 20, 2023, 1:42 PM
1 point
0 ∶ 0
in reply to: Dan_Keys’s comment on: Comparing Two Forecasters in an Ideal World
Good comment, thank you!

nikos Oct 11, 2023, 12:11 PM
2 points
0 ∶ 0
in reply to: JoshuaBlake’s comment on: Comparing Two Forecasters in an Ideal World
Can’t think of anything better than a t-test, but open to suggestions.
If a forecaster is consistently off by like 10 percentage points—I think that is a difference that matters. But even in that extreme scenario where the (simulated) difference between two forecasters is in fact quite large, we have a hard time picking that up using standard significance tests.
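To make the point above concrete, here is a minimal sketch of such a comparison. It is not the simulation from the post: the uniform distribution of true probabilities, the use of the Brier score, the constant 10 percentage point bias, and the function names `simulate_brier_scores` and `paired_t_statistic` are all assumptions chosen for illustration. Forecaster A reports the true probability, forecaster B reports it shifted by the bias, and a paired t-statistic is computed on the per-question score differences.

```python
import math
import random
import statistics


def simulate_brier_scores(n_questions, bias=0.10, seed=0):
    """Simulate per-question Brier scores for two hypothetical forecasters.

    Assumption: true probabilities are uniform on [0.1, 0.9]. Forecaster A
    reports the true probability; forecaster B is off by `bias`.
    """
    rng = random.Random(seed)
    scores_a, scores_b = [], []
    for _ in range(n_questions):
        p = rng.uniform(0.1, 0.9)
        outcome = 1.0 if rng.random() < p else 0.0
        forecast_b = min(p + bias, 1.0)  # keep the biased forecast a valid probability
        scores_a.append((p - outcome) ** 2)
        scores_b.append((forecast_b - outcome) ** 2)
    return scores_a, scores_b


def paired_t_statistic(scores_a, scores_b):
    """Paired t-statistic for the mean difference in scores (B minus A)."""
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.fmean(diffs) / standard_error


if __name__ == "__main__":
    for n in (50, 100, 500, 5000):
        t = paired_t_statistic(*simulate_brier_scores(n, seed=1))
        print(f"{n} questions: t = {t:.2f}")
```

Under these assumptions the expected Brier score difference is only 0.01 per question, while the per-question differences are much noisier than that, so the t-statistic tends to stay below the usual threshold of about 2 unless the number of questions is large.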

Comparing Two Forecasters in an Ideal World

nikos Oct 9, 2023, 8:06 PM

16 points

6 comments, 6 min read, EA link

Analysing Individual Contributions to the Metaculus Community Prediction

nikos May 8, 2023, 10:58 PM

28 points

1 comment, 12 min read, EA link

nikos Apr 7, 2023, 8:51 PM
3 points
1 ∶ 0
in reply to: Vasco Grilo🔸’s comment on: Wisdom of the Crowd vs. “the Best of the Best of the Best”
Interesting, thanks for sharing the paper. Yeah agree that using the Brier score / log score might change results and it would definitely be good to check that as well.

nikos Apr 7, 2023, 8:49 PM
1 point
0 ∶ 0
in reply to: isabel’s comment on: Wisdom of the Crowd vs. “the Best of the Best of the Best”
In principle yes. In practice also usually yes, but the specifics depend on whether the average user who predicted on a question gets a positive amount of points. So if you predicted very late and your points are close to zero, but the mean number of points forecasters on that question received is positive, then you will end up with a negative update to your reputation score.
Completely agree that a lot hinges on that reputation score. It seems to work decently for the Metaculus Prediction, but it would be good to see what results look like for a different metric of past performance.

nikos Apr 4, 2023, 8:33 PM
3 points
0 ∶ 0
in reply to: Charles Dillon 🔸’s comment on: Wisdom of the Crowd vs. “the Best of the Best of the Best”
Not sure how to quantify that (open to ideas). But intuitively I agree with you and would suspect it’s at least a sizable part.

nikos Apr 4, 2023, 6:59 PM
2 points
0 ∶ 0
in reply to: NunoSempere’s comment on: Wisdom of the Crowd vs. “the Best of the Best of the Best”
Yeah, definitely. The title was a bit tongue-in-cheek (it’s a movie quote)

Wisdom of the Crowd vs. “the Best of the Best of the Best”

nikos Apr 4, 2023, 3:32 PM

101 points

11 comments, 12 min read, EA link

nikos Mar 5, 2023, 2:17 PM
2 points
0 ∶ 0
in reply to: David Glidden’s comment on: Predictive Performance on Metaculus vs. Manifold Markets
And is the code to the MetaculusBot public somewhere? :)

nikos Mar 5, 2023, 1:10 PM
3 points
1 ∶ 0
in reply to: David Glidden’s comment on: Predictive Performance on Metaculus vs. Manifold Markets
It should be possible to fully automate the bot and just run a CRON job that regularly checks the Metaculus API for new questions, right?

nikos Mar 5, 2023, 1:08 PM
3 points
0 ∶ 0
in reply to: Scott Alexander’s comment on: Predictive Performance on Metaculus vs. Manifold Markets
I slightly tend towards yes, but that’s mere intuition. As someone on Twitter put it, “Metaculus has a more hardcore user base, because it’s less fun”. I find it plausible that the Metaculus and Manifold user bases differ. But I think higher trading volume would have helped.
For this particular analysis I’m not sure correcting for the number of forecasters would really be possible in a sound way. It would be great to get the MetaculusBot more active again to collect more data.

nikos Mar 4, 2023, 4:02 PM
5 points
0 ∶ 0
on: Predictive Performance on Metaculus vs. Manifold Markets
Is it possible to get rid of the question mode for this post?

[Question] Predictive Performance on Metaculus vs. Manifold Markets

nikos Mar 3, 2023, 7:39 PM

111 points

8 comments, 5 min read, EA link

nikos Feb 7, 2023, 11:16 AM
1 point
0 ∶ 0
in reply to: Michael_Wiebe’s comment on: More Is Probably More—Forecasting Accuracy and Number of Forecasters on Metaculus
For Metaculus there are lots of ways to drive engagement: prioritise making the platform easier to use, increase cash prizes, community building and outreach etc.
But as mentioned in the article the problem in practice is that the bootstrap answer is probably misleading, as increasing the number of forecasters likely changes forecaster composition.
However, one specific example where the analysis might be actually applicable is when you’re thinking about how many Pro Forecasters you hire for a job.

nikos Feb 1, 2023, 8:00 AM
1 point
0 ∶ 0
in reply to: Otis Reid’s comment on: More Is Probably More—Forecasting Accuracy and Number of Forecasters on Metaculus
In principle yes, but you’ll still have the problem that people predict at different points in time. If the best and the 2nd best predict weeks or months apart, then that changes results.

nikos Feb 1, 2023, 7:48 AM
3 points
0 ∶ 0
in reply to: Michael_Wiebe’s comment on: More Is Probably More—Forecasting Accuracy and Number of Forecasters on Metaculus
Ah snap! I forgot to remove that paragraph… I did subsampling initially, then switched to bootstrapping. Results remained virtually unchanged. Thanks for pointing that out, will update the text.
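For readers unfamiliar with the distinction, the sketch below contrasts the two resampling schemes: subsampling draws forecasters without replacement, bootstrapping draws them with replacement. This is an illustration only, not the code used for the post. The median aggregate, the Brier score, and the names `resample_forecasts` and `mean_resampled_brier` are assumptions.

```python
import random
import statistics


def resample_forecasts(forecasts, k, with_replacement, rng):
    """Draw k forecasts from one question's list of forecasts.

    with_replacement=True  -> bootstrap draw (k may exceed len(forecasts))
    with_replacement=False -> subsample (requires k <= len(forecasts))
    """
    if with_replacement:
        return rng.choices(forecasts, k=k)
    return rng.sample(forecasts, k)


def mean_resampled_brier(forecasts, outcome, k, with_replacement,
                         n_draws=1000, seed=0):
    """Average Brier score of the median forecast over repeated resamples."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_draws):
        draw = resample_forecasts(forecasts, k, with_replacement, rng)
        aggregate = statistics.median(draw)  # assumed aggregation rule
        scores.append((aggregate - outcome) ** 2)
    return statistics.fmean(scores)


if __name__ == "__main__":
    forecasts = [0.60, 0.70, 0.80, 0.65, 0.90]  # made-up forecasts for one question
    for k in (1, 3, 5):
        sub = mean_resampled_brier(forecasts, 1.0, k, with_replacement=False)
        boot = mean_resampled_brier(forecasts, 1.0, k, with_replacement=True)
        print(f"k={k}: subsample={sub:.4f}, bootstrap={boot:.4f}")
```

One practical difference: a subsample of size k equal to the number of available forecasters always reproduces the full set, whereas a bootstrap draw of the same size still varies from draw to draw.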
