Exploring how cognitive science can improve AI safety, governance and prioritization.
I’d be excited to intern for any research project.
Always happy to chat!
The idea of existential risk cuts against the oppression/justice narrative, in that it could kill everyone equally. So they have to oppose it.
That seems like an extremely unnatural thought process. Climate change is the perfect analogy—in these circles, it’s salient both as a tool of oppression and an x-risk.
I think far more selection of attitudes happens through paying attention to more extreme predictions, rather than through thinking/communicating strategically. Also, I’d guess people who spread these messages most consciously imagine a systemic collapse, rather than a literal extinction. As people don’t tend to think about longtermist consequences, the distinction doesn’t seem that meaningful.
AI x-risk is weirder and more terrifying, and it goes against the heuristics that “technological progress is good”, “people have always feared new technologies they didn’t understand” and “the powerful draw attention away from their power”. Some people for whom AI x-risk is hard to accept happen to overlap with AI ethics. My guess is that the proportion is similar in the general population; it’s just that some people in AI ethics feel particularly strongly & confidently about these heuristics.
Btw I think climate change could pose an x-risk in the broad sense (incl. 2nd-order effects & astronomical waste), just one that we’re very likely to solve (i.e. the tail risks, energy depletion, biodiversity decline or the social effects would have to surprise us).
Great to see real data on web interest! Over the past few weeks I investigated the same topic myself, taking a psychological perspective & paying attention to the EU AI Act, and reached the same conclusion (just published here).
Sorry, I don’t have any experience with that.
I recently made RatSearch for this purpose. You can also try the GPT bot version (more information here).
Recently, I made RatSearch for googling within EA-adjacent websites. Now you can try the GPT bot version! (ChatGPT Plus required)
The bot is instructed to interpret what you want to know in relation to EA and search for it on the Forums. If it fails, it searches through the whole web, while prioritizing the orgs listed by EA News.
Cons: ChatGPT uses Bing, which isn’t entirely reliable when it comes to indexing less-visited websites.
Pros: It’s fun for brainstorming EA connections/perspectives, even when you just type a raw phrase like “public transport” or “particle physics”.
Neutral: I have yet to test whether it works better when you explicitly limit the search using the site: operator (see the sketch below). Try AltruSearch 2: it seems better at digging deeper within the EA ecosystem, while AltruSearch 1 seems better at digging wider.
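For anyone curious what such a restriction could look like, here’s a minimal Python sketch that glues a raw phrase to the site: operator. The domain list and the search URL format below are just illustrative assumptions, not what either bot actually uses.

```python
# Minimal sketch: building a site-restricted query with the site: operator.
# The domain list is a hypothetical example, not the bot's actual list.
from urllib.parse import quote_plus

EA_SITES = [
    "forum.effectivealtruism.org",
    "lesswrong.com",
    "alignmentforum.org",
]

def site_restricted_query(phrase: str) -> str:
    """Combine a raw phrase with site: filters for a web search engine."""
    sites = " OR ".join(f"site:{domain}" for domain in EA_SITES)
    return f"{phrase} ({sites})"

query = site_restricted_query("public transport")
print(query)
# An assumed plain-search URL one could open in a browser:
print("https://www.bing.com/search?q=" + quote_plus(query))
```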
Update (12/8): The link now redirects to an updated version with very different instructions. You can still access the older version here.
My intention was to make any content published by OpenAI accessible.
Yes, OpenAI’s domain name is in the list because they have a blog.
Thanks, I’ve changed it up.
I’ve just put together a collection of related resources. Fossil fuel depletion is the only mineral resource suggested to have longtermist significance in WWOTF. Metals can be efficiently recycled for long enough that I expect us to develop AGI/nanotechnology before their depletion could start to become problematic. Recycling uranium would be quite advantageous, but I’d be skeptical regarding its tractability, and it seems we’ll get by with renewable energy.
I’ve just put together a post collecting related articles here.
Update: I’m pleased to learn Yudkowsky seems to have suggested a similar agenda in a recent interview with Dwarkesh Patel (timestamp) as his greatest source of predictable hope about AI. It’s a rather fragmented bit but the gist is: Perhaps people doing RLHF get a better grasp on what to aim for by studying where “niceness” comes from in humans. He’s inspired by the idea that “consciousness is when the mask eats the shoggoth” and suggests, “maybe with the right bootstrapping you can let that happen on purpose”.
I see a very important point here: human intelligence isn’t misaligned with evolution in a random direction; it is misaligned in the direction of maximizing positive qualia. Therefore, it seems very likely that consciousness played a causal role in the evolution of human moral alignment, and such a causal role should be possible to study.
Suggestion: Integrated search across LessWrong, EA Forum, Alignment Forum and perhaps Progress Forum posts.
If Big Tech finds these kinds of salaries cost-effective to solve their problems, I would consider it a strong argument in favor of this project.
I imagine Elon Musk could like this project given that he believes in small effective teams of geniuses.
I’d say “polymaths” is a good label for the kind of people I’d expect to make progress, like Yudkowsky, Bostrom, Hanson and von Neumann.
Edit: This may be fame-selection (engineers don’t often get credit, particularly in teams) or self-selection (interest in math+society).
The Manhattan and Enigma projects seem like examples where this kind of strategy just worked out. Some considerations that come to mind:
There could be selection effects.
From what I can find, members of these teams weren’t lured in by a lot of money. However, the salience of the AI threat in society is tiny compared to that of WWII, and large incentives could compensate for that.
I’ve read that money can sometimes decrease the intrinsic motivation that drives exploration & invention; however, these findings are being rebutted by newer studies. Apart from that, my guess would be that getting those teams together is the key part, and if a lot of money can facilitate that, great.
A wild idea that might help in case a similar phenomenon operates in the sub-population of geniuses, and which could make this project more appealing to donors: restrict a portion of these salaries so that the recipients could only spend them on socially beneficial uses.
I got access to Bing Chat. It seems:
- It only searches through archived versions of websites (it doesn’t retrieve today’s news articles, and it accessed an older version of my Wikipedia user page).
- During archiving, it only downloads the content one can see without any engagement with the website (tested on Reddit “see spoiler” buttons, which reveal new content in the page code; it could retrieve info from posts that gained less attention but weren’t hidden behind the spoiler button).
I.e., it’s still in a box of sorts, unless it’s much more intelligent than it pretends to be.
Edit: A recent ACX post argues that text-predicting oracles might be safer, as their ability to form goals is super limited, but it provides two models of how even they could be dangerous: by simulating an agent, or via a human who decides to take bad advice like “run the paperclip maximizer code”. Scott implies that thinking it would spontaneously form goals is an extreme view, linking a post by Veedrac. The best argument there seems to be that it only has memory equivalent to 10 human seconds. I find this convincing for current models, but it also seems to limit the intelligence of these systems, so I’m afraid that for future models, the incentives are aligned with reducing this safety valve.
For me, the easiest-to-imagine model of what an AI takeover could look like is the one depicted in Black Mirror: Shut Up and Dance (the episodes are fully independent stories). It’s probably just meant to show scary things humans can do with current technology, but such schemes could be trivial for a superintelligence with future technology.
Looking forward to the sequel!
I’d be particularly interested in any takes on the probability that civilization will be better equipped to deal with the alignment problem in, say, 100 years. My impression is that there’s an important and not well-examined balance between:
- Decreasing runaway AI risk & systemic risks by slowing down AI
- Increasing the time of perils:
  - Possibly increasing its intensity by giving malicious actors more time to catch up in destructive capabilities
  - But also possibly increasing the time for reflection on defense before a worse time of perils
- Possibly decreasing the risk of an aligned AI with bad moral values (conditional on this risk being lower in the year 2123)
- Possibly increasing the risk of astronomical waste (conditional on this risk being higher if AI is significantly slowed down)