CS, AIS, PoliSci @ UC Chile.
Milan Weibel
In a certain sense, an LLM's token embedding matrix is a machine ontology. Semantically similar tokens have similar embeddings in the latent space. However, different models may have learned different associations when their embedding matrix was trained. Every forward pass starts colored by ontological assumptions, and these may have alignment implications.
For instance, we would presumably not want a model to operate within an ontology that associates the concept of AI with the concept of evil, particularly if it is then prompted to instantiate a simulacrum that believes it is an AI.
Has someone looked into this? That is, the alignment implications of different token embedding matrices? I feel like it would involve calculating a lot of cosine similarities and doing some evals.
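A minimal sketch of the kind of measurement this would involve. The token names and the random matrix below are stand-ins of my own; a real experiment would pull rows from an actual model's embedding layer:

```python
import numpy as np

# Toy stand-in for a model's token embedding matrix: 5 tokens, 8 dims.
# In a real study, these rows would come from the model's input embeddings.
rng = np.random.default_rng(0)
vocab = ["AI", "evil", "good", "robot", "tree"]
E = rng.normal(size=(len(vocab), 8))

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Compare "AI" against every other token's embedding.
i = vocab.index("AI")
for j, tok in enumerate(vocab):
    if j != i:
        print(tok, round(cosine_similarity(E[i], E[j]), 3))
```

Comparing these similarity profiles across models (e.g. is "AI" closer to "evil" in model A than in model B?) would be one crude way to surface differing ontological assumptions.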
Milan Weibel's Quick takes
Intriguing. Looking forward to the live demo.
PSA: The form accepts a maximum of 10 files, that is, 5 design proposals maximum (because each proposal requires uploading both a .png and a .svg file).
Just for the sake of clarity: I think the word "schism" is inaccurate here because it carries false connotations of conflict.
Hi Jack!
Have you considered booking a call with 80,000 Hours career advising? They can help you analyse the factors behind your career plans, and put you in contact with people working in the areas that interest you.
You could also contact CLR and CRS. If you show knowledge of and interest in their work, they may be eager to help. You can't be sure if you'll get a reply, and that may seem intimidating, but remember that the cost is minimal, the EV is high, and how you feel about not getting a reply is at least partly under your control.
While this last point is not specifically focused on s-risks, a very cheap, very valuable action you can take is subscribing to the AI Safety opportunities update emails at AI Safety Training. Many hackathons advertised there are beginner-friendly.
Side note: calling a world-modelling disagreement implied by differences in cause prioritisation a "schism" is, in my opinion, unwarranted, and risks (with low probability, but very negative value) becoming a self-fulfilling prophecy.
A more pessimistic counterargument: Safely developing AGI is so hard as to be practically impossible. I do not believe this one, but some pessimistic sectors within AIS do. It combines well with the last counterargument you list (that the timelines where things turn out OK are all ones where we stop / radically slow down the development of AI capabilities). If you are confident that aligning AGI is for all practical purposes impossible, then you focus on preventing the creation of AGI and on improving the future of the timelines where AGI has been successfully avoided.
EDIT: Other commenters have pointed out reasons why eliminating debt that was sold very cheaply is unlikely to much affect the lives of recipients. Still, if the debt relieved did in fact significantly help the beneficiaries, it could turn out to be very effective. However, we won't know until RIP releases recipient outcomes data.
TL;DR: About as cost-effective as GiveWell's top charities, IF my assumption about outcomes is broadly right. $14.16 to provide debt relief to one person. If one assumes a lifespan increase of 0.2% (less than two months) as the effect (by preventing healthcare avoidance), it comes out to $7080 per death-equivalent-in-lifespan averted. I recommend looking further into it, particularly with respect to outcomes.
Hi Layla, welcome to the Forum! Thanks for posting!
This looks like an interesting opportunity. Within the cause area of health in the US, RIP seems to have chosen a big and tractable problem, and to be triaging their beneficiaries according to the relevant metrics.
Here is my attempt to get a rough idea of RIP's cost-effectiveness.
RIP claims that it has "helped 5,492,948 individuals and families" and has relieved $8,520,147,644 of medical debt. The average debt relieved per recipient is thus $8,520,147,644 / 5,492,948 = $1551. If, as you say, "every $100 donated clears $10,000 in medical debt", then the cost per recipient is $15.51 (!!!).
I was initially skeptical of this calculation, but it checks out. In its 2021 year-end report, RIP says that it relieved the debt of 1,312,697 people during the year, and in its 2021 financial statement it declares total expenses of $18,587,272. So the cost per recipient is $18,587,272 / 1,312,697 = $14.16.
It's hard to estimate the benefit from medical debt reduction. Let's say, for the sake of simplicity, that the avoidance of medical treatment and the mental health problems derived from struggling with medical debt make people live 0.2% shorter lives (1.92 months if starting out with an 80-year lifespan), and that the debt relief provided eliminates that effect. It follows that preventing 0.002 death-equivalents costs $14.16, and thus preventing one death-equivalent unit of lifespan reduction costs $7080. This is about as cost-effective as GiveWell's most recommended charities.
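The arithmetic above, written out as a quick sanity check. Note that the 0.2% lifespan effect is my assumption, not a figure from RIP:

```python
# Figures from RIP's public claims and 2021 filings.
total_debt_relieved = 8_520_147_644      # USD, claimed total debt relieved
total_recipients = 5_492_948             # claimed individuals and families helped
avg_debt_per_recipient = total_debt_relieved / total_recipients   # ~ $1551

expenses_2021 = 18_587_272               # USD, 2021 financial statement
recipients_2021 = 1_312_697              # 2021 year-end report
cost_per_recipient = expenses_2021 / recipients_2021              # ~ $14.16

# Assumption (mine): medical debt shortens lives by 0.2%, and relief undoes it.
lifespan_effect = 0.002
cost_per_death_equivalent = cost_per_recipient / lifespan_effect  # ~ $7080

print(round(avg_debt_per_recipient),
      round(cost_per_recipient, 2),
      round(cost_per_death_equivalent))
```

The whole estimate is linear in the assumed lifespan effect, so halving it to 0.1% would double the cost per death-equivalent to roughly $14,160.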
This would be huge if true. However, my priors advise me against getting too hopeful. It should be hard to find a charity about as cost-effective as GiveWell's top charities. RIP has been assessed by Charity Navigator, and does a fair bit of marketing, so it would be weird if no EA had picked this up before. This gives me reason to believe that I am overestimating the positive effects of debt relief.
To find out whether RIP is really so effective, it would be great to have numbers on the welfare outcomes of debt relief. I found this report on RIP's site, which, while a potentially useful qualitative source, makes no effort to quantify outcomes.
Chilean AIS Hackathon Retrospective
An aspirationally comprehensive typology of future locked-in scenarios
ChatGPT understands, but largely does not generate Spanglish (and other code-mixed) text
Interesting. I agree that second- or third-order effects, such as the good done later by people you have helped, are an important consideration. Maximising such effects could be an underexplored effective giving strategy, and the organization you refer to looks like a group of people trying to do that. However, to really assess an organization's effectiveness, especially if it focuses on educational or social interventions, some empirical evidence is needed.
Does SENG follow up on the outcomes of aid recipients?
How do their outcomes compare with those of similar people in similar situations who didn't receive help?
What programs does SENG run?
How much does each cost per recipient helped?
Having thought more about this, I suppose you can divide opinions into two clusters and be pointing at something real. That's because people's views on different aspects of the issue correlate, often in ways that make sense. For instance, people who think AGI will be achieved by scaling up current (or very similar to current) neural net architectures are more excited about practical alignment research on existing models.
However, such clusters would be quite broad. My main worry is that identifying two particular points as prototypical of them would narrow their range. People would tend to let their opinions drift closer to the point closest to them. This need not be caused by tribal dynamics. It could be something as simple as availability bias. This narrowing of the clusters would likely be harmful, because the AI safety field is quite new and we've still got exploring to do. Another risk is that we may become too focused on the line between the two points, neglecting other potentially more worthwhile axes of variation.
If I were to divide current opinions into two clusters, I think that Scott's two points would in fact fall in different clusters. They would probably even be not too far off their centers of mass. However, I strongly object to pretending the clusters are points, and then getting tribal about it. I think labeling clusters could be useful, if we made it clear that they are still clusters.
On the paths to understanding AI risk without accepting weird arguments, getting people worried about ML unexplainability may be worth exploring, though I suspect most people would think you were pointing to algorithmic bias and the like.
As a factual question, I'm not sure if people's opinions on the shape of AI risk can be divided into two distinct clusters, or even distributed along a spectrum (that is, that factor analysis on the points of opinion-space would find a good general factor), though I suspect it may quite weakly be the case. For instance, I found myself agreeing with six of the statements on one side of Scott's dichotomy and two on the other.
As a public epistemic health question, I think issuing binary labels is harmful for further progress in the field, especially if they borrow terminology from religious groups and the author identifies with one of the proposed camps in the same post in which he raises the distinction. See the comment by xarkn on LW.
Even if the range of current opinions could be well-described by a single general factor, we should certainly use less divisive terminology for such a spectrum and be mindful that truth may well lie orthogonal to it.
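A toy illustration of the factor-analysis framing above, on entirely synthetic data of my own construction (a real check would use actual survey responses on AI-risk questions):

```python
import numpy as np

# Synthetic "opinion space": 50 respondents answering 8 AI-risk questions.
# By construction, one latent factor drives the answers, plus noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(50, 1))            # hypothetical general factor
loadings = rng.normal(size=(1, 8))           # how each question loads on it
opinions = latent @ loadings + 0.5 * rng.normal(size=(50, 8))

# Fraction of variance captured by the first principal component:
# high if a single "general factor" describes the opinion space well.
X = opinions - opinions.mean(axis=0)
eigvals = np.linalg.eigvalsh(np.cov(X.T))[::-1]   # descending order
explained = eigvals / eigvals.sum()
print(f"variance explained by first component: {explained[0]:.2f}")
```

Here the first component dominates because the data was generated with one factor; real opinion data could just as easily come out multi-dimensional, which is the empirical question at stake.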
Un equilibrio inadecuado (Spotify / Apple Podcasts / Google Podcasts)
Interviews in Spanish on EA topics. I particularly enjoyed the episode with Andrés Gómez Emilsson from Qualia Research Institute. Sadly, no new content since October 2021.
Contra hard moral anti-realism: a rough sequence of claims
Epistemic and provenance note: This post should not be taken as an attempt at a complete refutation of moral anti-realism, but rather as a set of observations and intuitions that may or may not give one pause as to the wisdom of taking a hard moral anti-realist stance. I may clean it up to construct a more formal argument in the future. I wrote it on a whim as a Telegram message, in direct response to the claim:
> "you can't find 'values' in reality".
Yet, you can find valence in your own experiences (that is, you just know from direct experience whether you like the sensations you are experiencing or not), and you can assume other people are likely to have a similar enough stimulus-valence mapping. (Example: I'm willing to bet 2k USD on my part against a single dollar of yours that if I waterboard you, you'll want to stop before 3 minutes have passed.)[1]
However, since we humans are bounded imperfect rationalists, trying to explicitly optimize valence is often a dumb strategy. Evolution has made us not into fitness-maximizers, nor valence-maximizers, but adaptation-executers.
"Values" originate as (and thus are) reifications of heuristics that reliably increase long-term valence in the real world (subject to memetic selection pressures, among them the social desirability of utterances, the adaptiveness of behavioral effects, etc.).
If you find yourself terminally valuing something that is not someoneâs experienced valence, then either one of these propositions is likely true:
A nonsentient process has at some point had write access to your values.
What you value is a means to improving somebodyâs experienced valence, and so are you now.
Crossposted from LessWrong.
In retrospect, proposing this bet was a bit crass on my part.