On 1, with your permission, I'd ask if I could share a screenshot of me asking you in DMs, directly, for viewer minutes. You gave me views, so I multiplied views by the average TikTok video length and by a factor for the percentage watched.
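For concreteness, here's a minimal sketch of that calculation; every input below is a placeholder for illustration, not a figure from the actual analysis:

```python
# Hypothetical reconstruction of the TikTok watch-time estimate.
# All inputs are placeholders, not the real numbers.
views = 500_000        # total views, as reported in DMs (placeholder)
avg_length_min = 1.0   # assumed average TikTok video length, in minutes
pct_watched = 0.40     # assumed fraction of each video actually watched

viewer_minutes = views * avg_length_min * pct_watched
print(f"~{viewer_minutes:,.0f} viewer-minutes")  # ~200,000
```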
On A, yes, the FLI Podcast was perhaps the data point I did the most estimating for, for a variety of reasons I explained before.
On B, I think you can, in fact, find which are and aren't estimates, though I do understand how it's not clear. We considered ways of doing this without being messy. I'll try to make it clearer.
On C, how much you pay for a view is not a constant, though. It depends a lot on organic views. And I think boosting videos is a sensible strategy, since you put money into both production costs (time, equipment, etc.) and advertising. Figuring out how to spend that money efficiently is important.
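To make that concrete, here is a toy sketch (all numbers invented) of why the effective cost per view is not a constant: organic views dilute what you paid for production and boosting:

```python
# Toy illustration (numbers invented): cost per view depends on how
# many organic views arrive on top of the paid/boosted ones.
production_cost = 2_000   # time, equipment, editing (placeholder $)
boost_spend = 500         # paid promotion (placeholder $)
organic_views = 40_000    # placeholder
paid_views = 10_000       # placeholder

cost_per_view = (production_cost + boost_spend) / (organic_views + paid_views)
print(f"${cost_per_view:.3f} per view")  # $0.050
```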
On 3, many other people were mentioned. In fact, I found a couple of creators this way. But yes, it was extremely striking, and it suggested that this was a very important factor in the analysis. I want to stress that I do, in fact, think this matters a lot. When Austin and I were speaking and relying on comparisons, we thought his quality numbers should be much higher; in fact, we toned them down, though maybe we shouldn't have.
To give clarity, I didn't seek out people who work in AI safety. Here's what I did, to the best of my recollection.
Over the course of 3 days, I asked anyone I saw at Mox who seemed friendly enough, as well as people at Taco Tuesday, and I sent a few DMs to acquaintances. The DMs went to people who work in AI safety, but there were only 4 of them. So ~46 responses came from people hanging out around Mox and Taco Tuesday.
I will grant that this lends itself to an SF/AI safety bias. Now, Rob Miles' audience comes heavily from Computerphile and similar channels, whose audiences are largely young people interested in STEM who like to grapple with interesting academic-y problems in their spare time (outside of school). In other words, this is an audience that we care a lot about reaching. It's hard to overstate the possible variance in audience "quality". For example, Jane Street pays millions to advertise itself to potential traders on channels like Stand-Up Maths or the Dwarkesh Podcast. These channels don't actually get that many views compared to others, but they clearly have a very high "audience quality", judging by how much trading firms are willing to pay to advertise there. We actually thought a decent, though imperfect, metric for audience quality would just be a person's income divided by the world average of ~$12k. This meant the average American would have an audience quality of 7. Austin and I thought this might be a bit too controversial and that it doesn't capture exactly what we mean (we care about attracting a poor MIT CS student more than a mid-level real estate developer in Miami), but it's a decent approximation.
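For what it's worth, here is a minimal sketch of that income-based proxy; the ~$12k world-average figure comes from the paragraph above, while the US average income below is an assumption chosen to roughly reproduce the quoted 7:

```python
# Sketch of the income-based audience-quality proxy we considered.
# WORLD_AVG_INCOME (~$12k) is from the comment above; the US average
# income is an assumption, picked to reproduce the quoted ~7.
WORLD_AVG_INCOME = 12_000

def audience_quality(avg_viewer_income: float) -> float:
    """Audience quality as a multiple of world-average income."""
    return avg_viewer_income / WORLD_AVG_INCOME

print(round(audience_quality(84_000)))  # 7 -- the average-American figure
```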
Audience quality is roughly something like "the people we care most about reaching," and thus "people who can go on to work on technical AI safety" seems very important.
Rob wasn't the only one mentioned; the next most popular were Cognitive Revolution and AI in Context (people often said "Aric"), since I asked people to name anyone they listen to / would consider an AI safety YouTuber, etc.
On 4, I greatly encourage people to input their own weights; I specifically put that in the doc, and part of the reason for doing this project was to get people talking about cost-effectiveness in AI safety.
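As a purely hypothetical illustration of what plugging in your own weights could look like (the factor names and all numbers below are invented, not taken from the doc):

```python
# Hypothetical example of re-scoring with your own weights.
# Factor names and all values are invented for illustration;
# the real factors and numbers live in the linked doc.
factors = {"viewer_minutes": 6.0, "audience_quality": 7.0, "counterfactual": 3.0}
weights = {"viewer_minutes": 0.5, "audience_quality": 0.35, "counterfactual": 0.15}

score = sum(weights[k] * factors[k] for k in factors)
print(f"weighted score = {score:.2f}")  # 5.90
```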
On my bias: Like all human beings, I'm flawed and have biases, but I did my best to just look at the data objectively, in what I thought was the best way possible. I appreciate that you talked to others regarding my intentions.
I'll happily link to my comments on Manifund (1, 2, 3) that you may be referring to, so people can see the full comments, and perhaps emphasize some points I wrote.
@ I want to quickly note that it's a bit unfair for me to call you out on this specifically; rather, this is a thing I find with many AI safety projects. It just came up high on Manifund when I logged on for other reasons, and I saw donations from people I respect.
FWIW, I don't want to single you out; I have this kind of critique of many, many people doing AI safety work, but this just seems like a striking example of it.
I didn't mean my comments to say "you should return this money". Lots of grants/spending in the EA ecosystem I consider to be wasteful, ineffective, etc. And again, apologies for singling you out over a gripe I have with EA funding.
Many people can tell you that I have a problem with the free-spending, lavish, and often wasteful culture on the longtermist side of EA. I think I made it pretty clear that I was using this RFP as an example because other regrantors gave to it.
This project with Austin was planned before you posted your RFP on Manifund (I can provide proof if you'd like).
I wasn't playing around with the weights to make you come out lower. I assure you, my bias is usually against projects I perceive to be "free-spending".
I think it's good/natural to try to create separation between evaluators/projects, though.
For context, you asked me for data for something you were planning (at the time) to publish that same day. There's no way to get watch time easily on TikTok (which is why I had to manually add things up on a computer), and I was not on my laptop, so I couldn't do it when you messaged me. You didn't follow up to clarify that watch time was actually the key metric in your system and that you actually needed that number.
Good to know that the 50 people were 4 Safety people and 46 people who hang out at Mox and Taco Tuesday. I understand you're trying to reach the MIT graduate working in AI who might somehow transition to AI Safety work at a lab / Constellation. I know that Dwarkesh & Nathan are quite popular with that crowd, and I have a lot of respect for what Aric (& co) did, so the data you collected make a lot of sense to me. I think I can start to understand why you gave a lower score to Rational Animations or other stuff like AIRN.
I'm now modeling you as trying to answer something like "how do we cost-effectively feed AI Safety ideas to the kind of people who walk in at Taco Tuesday, who have the potential to be good AI Safety researchers". Given that, I can now better understand how you ended up giving higher scores to Cognitive Revolution and Robert Miles.