CS, AIS, PoliSci @ UC Chile.
Milan Weibel
Left-progressive online people seem to be consolidating on an anti-AI position, mostly derived from resistance to the presumed economic impacts of AI art, badness-by-association inherited from the big tech / tech billionaires / "techbro" cluster, and, on the academic side, concern about algorithmic bias and the like. However, they seem to be failing at extrapolation: "AI bad" gets misgeneralized into skepticism about current and future AI capabilities.
Left-marxist people seem to be thinking a bit more clearly about this (i.e. extrapolating, applying any economic model at all, looking a bit into the tech). See an example here, or a summary here. However, the labs are based in the US, a country where associating with marxists is a very bad idea if you want your policies to get implemented.
These two leftist stances are mostly orthogonal to concerns about AI x-risk and catastrophic misuse. However, a lot of activists believe that the public's attention is zero-sum. I suspect that is the main reason coalition-building with the preceding two groups has not happened much. However, I think it is still possible.
About the American right: some actors have largely succeeded in marrying China-hawkism with AI-boosterism. I expect this association to be very sticky, but it may be counteracted by reactionary impulses coming from spooked cultural conservatives.
Weird off-the-cuff question but maybe intentionally inducing something like experimenter demand effects would be a worthwhile intervention? After figuring out a way of not making recipients feel cheated or patronized, of course.
Probably not, or to a much lesser extent.
I would expect those to be the same person if AI turns out to not be a huge deal, which for me is about 25% of futures.
While I agree that strong founder effects are likely to apply if SpaceX and/or NASA succeed in establishing a Mars colony, I expect that colony to be Earth-dependent for decades, and to be quite vulnerable to being superseded by other actors.
To put my model in more concrete terms: I expect whoever controls cislunar space in 2050 to have more potential for causal influence over the state of Mars in 2100 than whoever has put more people on Mars by 2040.
I think it would be a major win for animal welfare if the plant-based foods industry could transition soy-based products to low-isoflavone and execute a successful marketing campaign to quell concerns about phytoestrogens (without denigrating higher-isoflavone soy products).
I think it would be really hard (maybe even practically impossible) to market isoflavone-reduced products without hurting demand for non-isoflavone-reduced products as a side effect.
If the plant-based food industry started producing and marketing isoflavone-reduced soy products, I am quite confident that it would counterfactually lower total demand for soy products in the short term, and I am very uncertain about the sign of impact over the long term.
Hi Juliana! Thank you for your response, it indeed answers my question quite clearly.
Short review of our TensorTrust-based AI safety university outreach event
I love how thorough this post is. However, I'm not sure why you chose to research the production of vitamin D in an ASRS over other nutrients that Pham et al. 2022 found would be deficient given adequate ASRS responses, such as vitamins E and K. Are the effects of vitamin D deficiency worse, or is it maybe more feasible to produce than vitamins E and K?
However, endorsing this view likely requires fairly speculative claims about how existing risks will nearly disappear after the time of perils has ended.
A note on this: the first people to study the notion of existential risk in an academic setting (Bostrom, Ord, Sandberg, etc.) explicitly state in their work many of those assumptions.
They chiefly revolve around the eventual creation of advanced AI, which would enable the automation of both surveillance and manufacturing; the industrialization of outer space; and eventually the interstellar expansion of Earth-originated civilization.
In other words, they assume that both:
1. The creation of safe AGI is feasible.
2. Extremely robust existential security will follow, conditional on (1.).
Proposed mechanisms for (2.) include interstellar expansion and automated surveillance.
Thus, the main crux on the value of working on longtermist interventions is the validity of assumptions (1.) and (2.). In my opinion, finding out how likely they are to be true is very important and quite neglected. I think that scrutinizing (2.) is both more neglected and more tractable than examining (1.), and I would love to see more work on it.
Why not print the pdf?
I think it is very likely that the top American AI labs are receiving substantial help from the NSA et al. in implementing their "administrative, technical, and physical cybersecurity protections". No need to introduce Crowdstrike as a vulnerability.
The labs get fined if they don't implement such protections, not if they get compromised.
Humans could use AI propaganda tools against other humans. Autonomous AI actors may have access to better or worse AI propaganda capabilities than those used by human actors, depending on the concrete scenario.
I guess this somewhat depends on how good you expect AI-augmented persuasion/propaganda to be. Some have speculated it could be extremely effective. Others are skeptical. Totalitarian regimes provide an existence proof of the feasibility of controlling populations in the medium term using a combination of pervasive propaganda and violence.
Contra hard moral anti-realism: a rough sequence of claims
Epistemic and provenance note: This post should not be taken as an attempt at a complete refutation of moral anti-realism, but rather as a set of observations and intuitions that may or may not give one pause as to the wisdom of taking a hard moral anti-realist stance. I may clean it up to construct a more formal argument in the future. I wrote it on a whim as a Telegram message, in direct response to the claim
> "you can't find 'values' in reality".
Yet, you can find valence in your own experiences (that is, you just know from direct experience whether you like the sensations you are experiencing or not), and you can assume other people are likely to have a similar enough stimulus-valence mapping. (Example: I'm willing to bet 2k USD on my part against a single dollar of yours that if I waterboard you, you'll want to stop before 3 minutes have passed.)[1]
However, since we humans are bounded imperfect rationalists, trying to explicitly optimize valence is often a dumb strategy. Evolution has made us not into fitness-maximizers, nor valence-maximizers, but adaptation-executers.
"Values" originate as (thus are) reifications of heuristics that reliably increase long-term valence in the real world (subject to memetic selection pressures, among them social desirability of utterances, adaptiveness of behavioral effects, etc.).
If you find yourself terminally valuing something that is not someone's experienced valence, then either one of these propositions is likely true:
1. A nonsentient process has at some point had write access to your values.
2. What you value is a means to improving somebody's experienced valence, and so are you now.
Crossposted from LessWrong.
[1] In retrospect, making this proposition was a bit crass on my part.
In a certain sense, an LLM's token embedding matrix is a machine ontology. Semantically similar tokens have similar embeddings in the latent space. However, different models may have learned different associations when their embedding matrix was trained. Every forward pass starts colored by ontological assumptions, and these may have alignment implications.
For instance, we would presumably not want a model to operate within an ontology that associates the concept of AI with the concept of evil, particularly if it is then prompted to instantiate a simulacrum that believes it is an AI.
Has someone looked into this? That is, the alignment implications of different token embedding matrices? I feel like it would involve calculating a lot of cosine similarities and doing some evals.
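To make the cosine-similarity part concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the two tiny embedding matrices, the three-token vocabulary, and the helper names are made up, and a real comparison would load the actual embedding matrices of the models being studied (and deal with multi-token concepts).

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(embeddings, vocab, token_a, token_b):
    """Cosine similarity between the embedding rows of two tokens."""
    return cosine_similarity(embeddings[vocab[token_a]], embeddings[vocab[token_b]])

# Hypothetical toy vocabulary shared by both models (token -> row index).
vocab = {"AI": 0, "evil": 1, "helpful": 2}

# Hypothetical embedding matrices, one row per token.
# In model_a the rows for "AI" and "evil" point in nearly the same direction.
model_a = np.array([
    [1.0, 0.0, 0.0],   # AI
    [0.9, 0.1, 0.0],   # evil
    [0.0, 1.0, 0.0],   # helpful
])
# In model_b the rows for "AI" and "evil" are orthogonal.
model_b = np.array([
    [1.0, 0.0, 0.0],   # AI
    [0.0, 0.0, 1.0],   # evil
    [0.8, 0.6, 0.0],   # helpful
])

sim_a = association(model_a, vocab, "AI", "evil")
sim_b = association(model_b, vocab, "AI", "evil")

print(f"model_a AI~evil: {sim_a:.3f}")
print(f"model_b AI~evil: {sim_b:.3f}")
```

Under this toy setup, model_a places "AI" much closer to "evil" than model_b does. Whether a gap like that in a real model translates into behavioral differences is exactly what the evals would need to check.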
Milan Weibel's Quick takes
Intriguing. Looking forward to the live demo.
PSA: The form accepts a maximum of 10 files, that is, 5 design proposals maximum (because each proposal requires uploading both a .png and a .svg file).
There's some of this: see this Gwern post for the classic argument.
LLMs seem by default less agentic than the previous end-to-end RL paradigm. Maybe the rise of LLMs was an exercise in deliberate differential technological development. I'm not sure about this; it is personal speculation.