“AI: Forecasters on the community forecasting platform Metaculus think that artificial intelligent systems that are better than humans at all relevant tasks will be created in 2042.”
How do you get this from the question’s operationalization?
Is there a description of the desk-reject policy and/or statistics on how many applications were desk rejected?
It does not really seem to address the reasoning from my second paragraph. You say:
“Similarly, if people refused to consume any goods or services that were associated with net-positive greenhouse gas emissions, then those industries would rapidly decarbonize or go out of business.”,
but it seems to me that this would be way more costly for individuals than giving up on meat, in addition to leading to way larger economic damage in the short to medium term (without enough time for investments into replacement technologies to pay off).
There seems to be a clear disanalogy in that if every individual stopped eating meat tomorrow, factory farming would be history very quickly. On the other hand, if everyone tried very hard to reduce their personal CO2 emissions, the effect seems more limited (unless people are really serious about it, in which case this would probably lead to major economic damage).
The key difference seems to be that CO2 emissions are embedded in our current societies and economies in such a deep way that we can only get out via long-term investment in replacement infrastructure (such as renewables, electric cars, public transportation, etc.), which is not necessarily influenced that strongly by individual consumption. On the other hand, meat eating is exclusively about the sum of personal demand, even though measures to reduce supply via policy or to decrease demand via investment in viable substitutes would still be highly valuable. (I imagine that I might change my mind if this line of thought were convincingly refuted.)
While I also disagree with the top level post, this seems overly hostile.
Thank you!
5% does sound very alarming to me, and is definitely a lot higher than I would have said at the beginning of the crisis (without having thought about it much, then).
Also, beyond the purely personal, are there any actions that could be taken by individuals right now that would have a positive impact on humanity’s chances to recover, conditional on nuclear war?
Some (probably naive) ideas:
- Downloading and printing vital information about how to rebuild food supply and other vital infrastructure (so that it can easily be accessed despite varying degrees of infrastructure collapse); I guess ALLFED’s articles might be a good starting point (even though I could not quickly find any distilled strategy document/user guide for their research)?
- Increasing the likelihood that you will be able to distribute this information: ensure survival, build local networks, practice leadership skills, etc.
- Making sure others do the same: While there already seem to be a lot of preppers, I do not know whether their culture emphasizes strategies for rebuilding a flourishing civilization over mere survival. “Altruistic prepping” might be a relatively neglected niche (or not, these are just off-the-cuff thoughts...)
Putin seems to have ordered deterrence forces (which include nuclear arms) to be on high alert, roughly an hour ago. https://www.reuters.com/world/europe/biden-says-russian-attack-ukraine-unfolding-largely-predicted-2022-02-24/
Can someone weigh in on how unprecedented this is? Some media coverage has compared the severity of the current situation to the Cuban Missile Crisis, which would be extremely alarming if remotely true.
Miscalibration might cut both ways…
On the one hand, it seems quite plausible for forecasts like this to usually be underconfident about the likelihood of the null event, but on the other hand, recent events should probably have substantially increased forecasters’ entropy for questions about geopolitical events in the next few days and weeks.
(This risk is a greater risk than the risk of >50% of the public advocating for unethical policies out of self-interest, because in expectation, unethical policies in the self-interest of “>50% of the public” would be good for more people than unethical policies in the self-interest of experts)
This seems to rest on a number of hidden assumptions, both about the relative ability of experts vs. the public to assess the effects of policies and about the distribution of potential policies: While constitutions are not really a technocratic constraint on public opinion, one of their functions appears to be to protect minorities from a majority blatantly using policies to suppress them; in a world where the argument fully went through, this function would not be necessary.
The fact that ‘technocracy’ gets named so infrequently by EAs may be a sign that many are advocating for more technocracy without realising it or without realising that the term exists, along with pre-existing criticism of the idea.
While this might certainly be true, the negative connotations of the term “technocracy” might play an important role here as well: Someone who is aware of the concept and its criticisms might nevertheless avoid the term in order to prevent knee-jerk reactions, similar to how someone arguing for more “populist” positions might not use that term, depending on the audience.
While I am not sure I agree with the strong language regarding urgent priorities, and would also like to find more neutral terms for both sides, I agree that a better understanding of the balance between expert-driven policy and public opinion would be quite useful. I could imagine that which one is better depends strongly on the specific details of a particular policy problem, and that there might be ways of productively integrating parts of both sides: While I do think that Futarchy is unlikely to work, some form of “voting on values” combined with relying on expertise to predict how policies would affect those values still appears appealing, especially if experts’ incentives can be designed to clearly favor prediction accuracy while avoiding issues with self-fulfilling prophecies.
Do you have thoughts on how potentially rising inflation could affect emission pathways and the relative cost of renewables? I have heard the argument that associated rises in the cost of capital could be pretty bad, because most costs associated with renewables are capital costs, while fuel costs dominate for fossil energy.
Huh? I did not like the double-page style for the non-mobile pdf, as it required some manual rescaling on my PC.
And the mobile version has the main table cut between two pages in a pretty horrible way. I think I would have much preferred a single pdf in the mobile/single page style that is actually optimized for that style, rather than this.
Maybe I should have used the HTML version instead?
More detailed action points on safety from page 32:
The Office for AI will coordinate cross-government processes to accurately assess long term AI safety and risks, which will include activities such as evaluating technical expertise in government and the value of research infrastructure. Given the speed at which AI developments are impacting our world, it is also critical that the government takes a more precise and timely approach to monitoring progress on AI, and the government will work to do so.
The government will support the safe and ethical development of these technologies as well as using powers through the National Security & Investment Act to mitigate risks arising from a small number of potentially concerning actors. At a strategic level, the National Resilience Strategy will review our approach to emerging technologies; the Ministry of Defence will set out the details of the approaches by which Defence AI is developed and used; the National AI R&I Programme’s emphasis on AI theory will support safety; and central government will work with the national security apparatus to consider narrow and more general AI as a top-level security issue.
I don’t think I get your argument for why the approximation should not depend on the downstream task. Could you elaborate?
I am also a bit confused about the relationship between spread and resiliency: a larger spread of forecasts does not seem to necessarily imply weaker evidence. For a relatively rare event about which some forecasters could acquire insider information, a large spread might even give you stronger evidence. Imagine the question is about the future enactment of a quite unusual government policy, and one of your forecasters is a high-ranking government official. Then, if all of your forecasters are relatively well calibrated and have sufficient incentive to report their true beliefs, a 90% forecast by the government official combined with a 1% forecast from everyone else should likely shift your beliefs a lot more towards the policy being enacted than a 10% forecast from everyone.
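A minimal sketch (with made-up numbers and helper names of my own) of how these two scenarios look to two common identity-blind pooling rules; neither rule can tell the well-informed insider apart from the rest, which is part of why a large spread does not seem straightforwardly interpretable as weaker evidence:

```python
import numpy as np

def pool_mean_prob(ps):
    """Arithmetic mean of the individual probabilities."""
    return float(np.mean(ps))

def pool_geo_mean_odds(ps):
    """Geometric mean of odds, mapped back to a probability."""
    ps = np.asarray(ps)
    mean_log_odds = np.mean(np.log(ps / (1 - ps)))
    return float(1 / (1 + np.exp(-mean_log_odds)))

# Scenario A: one hypothetical well-placed insider at 90%, nine outsiders at 1%.
insider_case = [0.90] + [0.01] * 9
# Scenario B: everyone at 10% (same arithmetic mean, much smaller spread).
uniform_case = [0.10] * 10

for name, ps in [("insider", insider_case), ("uniform", uniform_case)]:
    print(name, round(pool_mean_prob(ps), 3), round(pool_geo_mean_odds(ps), 3))
```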
This seems to connect to the concept of $\phi$-means: If the utility for an option is proportional to $\phi(p)$, then the expected utility of your mixture model is equal to the expected utility using the $\phi$-mean of the experts’ probabilities $p_1, \dots, p_n$, defined as $\phi^{-1}\left(\frac{1}{n}\sum_{i=1}^{n}\phi(p_i)\right)$, as the $\phi$ in the utility calculation cancels out the $\phi^{-1}$. If I recall correctly, all aggregation functions that fulfill some technical conditions on a generalized mean can be written as a $\phi$-mean.
In the first example, $\phi$ is just linear, such that the $\phi$-mean is the arithmetic mean. In the second example, $\phi(p)$ is the expected lifespan, proportional to $\frac{1}{p}$, which yields the harmonic mean. As such, the geometric mean would correspond to the mixture model if and only if utility was logarithmic in $p$, as the geometric mean is the $\phi$-mean corresponding to the logarithm.
For a binary event with “true” probability $p$, the expected log-score for a forecast of $q$ is $p\log(q) + (1-p)\log(1-q)$, which equals $\frac{1}{2}\log(q(1-q))$ for $p=\frac{1}{2}$. So the geometric mean of odds would yield the correct utility for the log-score according to the mixture model, if all the events we forecast were essentially coin tosses (which seems like a less satisfying synthesis than I hoped for).
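As a minimal sketch of the $\phi$-mean idea (the helper name phi_mean and the example probabilities are my own):

```python
import numpy as np

def phi_mean(ps, phi, phi_inv):
    """Quasi-arithmetic (phi-)mean: phi_inv of the average of phi(p_i)."""
    return phi_inv(np.mean([phi(p) for p in ps]))

ps = [0.2, 0.5, 0.9]  # made-up expert probabilities

# phi(p) = p (linear)      -> arithmetic mean of probabilities
arithmetic = phi_mean(ps, lambda p: p, lambda x: x)
# phi(p) = 1/p (e.g. expected lifespan) -> harmonic mean
harmonic = phi_mean(ps, lambda p: 1 / p, lambda x: 1 / x)
# phi(p) = log(p)          -> geometric mean of probabilities
geometric = phi_mean(ps, np.log, np.exp)
# phi(p) = log-odds(p)     -> geometric mean of odds
geo_odds = phi_mean(ps,
                    lambda p: np.log(p / (1 - p)),
                    lambda x: 1 / (1 + np.exp(-x)))

# The defining property: if utility is proportional to phi(p), then the
# mixture model's expected utility, mean(phi(p_i)), equals phi(phi-mean).
print(arithmetic, harmonic, geometric, geo_odds)
```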
Further questions that might be interesting to analyze from this point of view:
Is there some kind of approximate connection between the Brier score and the geometric mean of odds that could explain the empirical performance of the geometric mean on the Brier score? (There might very well not be anything, as the mixture model might not be the best way to think about aggregation).
What optimization target (under the mixture model) does extremization correspond to? Edit: As extremization is applied after the aggregation, it cannot be interpreted in terms of mixture models (if all forecasters give the same prediction, any $\phi$-mean has to have that value, but extremization yields a more extreme prediction).
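A tiny illustration of that last point, assuming the common variant of extremization that scales the pooled log-odds by a constant factor $d$:

```python
import numpy as np

def extremize(p, d=1.5):
    """Scale the log-odds of an already-pooled forecast by a factor d > 1."""
    log_odds = np.log(p / (1 - p))
    return 1 / (1 + np.exp(-d * log_odds))

# If all forecasters say 0.8, every phi-mean returns exactly 0.8 ...
pooled = 0.8
# ... but extremization pushes the pooled forecast further out:
print(extremize(pooled))  # ~0.89
```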
Note: After writing this, I noticed that UnexpectedValue’s comment on the top-level post essentially points to the same concept. I decided to still post this, as it seems more accessible than their technical paper while (probably) capturing the key insight.
Edit: Replaced “optimize” by “yield the correct utility for” in the third paragraph.
I wanted to flag that many PhD programs in Europe might require you to have a Master’s degree, or to essentially complete the coursework for a Master’s degree during your PhD (as seems to be the case in the US), depending on the kind of undergraduate degree you hold. Obviously, the arguments regarding funding might still partially hold in that case.
Do you have a specific definition of AI Safety in mind? From my (biased) point of view, it looks like a large fraction of the work that is explicitly branded “AI Safety” is done by people who are at least somewhat adjacent to the EA community. But this becomes a lot less true if you widen the definition to include all work that could be called “AI Safety” (so anything that could conceivably help avoid any kind of dangerous malfunction of AI systems, including small-scale and easily fixable problems).
Relatedly, what is the likelihood that future iterations of the fellowship might be less US-centric, or include Visa sponsorship?
The job posting states:
“All participants must be eligible to work in the United States and willing to live in Washington, DC, for the duration of their fellowship. We are not able to sponsor US employment visas for participants; US permanent residents (green card holders) are eligible to apply, but fellows who are not US citizens may be ineligible for placements that require a security clearance.”
So my impression would be that it would be pretty difficult to participate for non-US citizens who do not already live in the US.
I think this strongly depends on how much weight you expect forecasters on Metaculus to put on the actual operationalization rather than the question’s “vibe”. I personally expect quite a bit of weight on the exact operationalization, so I am generally not very happy with how people have been talking about this specific forecast (the term “AGI” often seems to invoke associations that are not backed by the forecast’s operationalization), and would prefer a more nuanced statement in the report.
(Note that you might believe that the gap between the resolution criteria of the question and more colloquial interpretations of “AGI” is very small, but this would seem to require an additional argument on top of the Metaculus forecast.)