William the Kiwi

Karma: 42

Hi I’m William and I am new to the Effective Altruism community.

William comes from a country in the Pacific called New Zealand. He was educated at the University of Otago where he received a first class honours degree in chemistry. He is currently traveling through Europe to learn more about different cultures and ideas.

William the Kiwi 14 Mar 2023 21:22 UTC
4 points
0 ∶ 0
on: Open Thread: January — March 2023
Hi there everyone, I’m William the Kiwi and this is my first post on EA forums. I have recently discovered AI alignment and have been reading about it for around a month. This seems like an important but terrifyingly under invested in field. I have many questions but in the interest of speed I will involve Cunningham’s Law and post my current conclusions.

My AI conclusions:
1. Corrigiblity is mathematically impossible for AGI.
2. Alignment requires defining all important human values in a robust enough way that it can survive near-infinite amounts of optimisation pressure exerted by a superintelligent AGI. Alignment is therefore difficult.
3. Superintelligence by Nick Bostrum is a way of communicating the antimeme “unaligned AI is dangerous” to the general public.
4. The extinction of humanity is a plausible outcome of unaligned AI.
5. Eliezer Yudkowsky seems overly pessimistic but likely correct about most things he says.
6. Humanity is likely to produce AGI before it produces fully aligned AI.
7. To incentivize responses to this post I should offer a £1000 reward for a response that supports or refutes each of these conclusions and provides evidence for it.
I am currently visiting England and would love to talk more about this topic with people, either over the Internet or in person.

William the Kiwi 15 Mar 2023 10:35 UTC
3 points
0 ∶ 0
in reply to: Robi Rahman’s comment on: Open Thread: January — March 2023
Hi Robi Rahman, thanks for the welcome.

I do not know if has a predefined utility function, or if the functions simply have similar forms. If there is a utility function that provides utility for the AI to shutdown if some arbitrary “shutdown button” is pressed, then there exists a state where the “shutdown button” is being pressed at a very high probability (e.g. an office intern is in the process of pushing the “shutdown button”) that provides more expected utility than the current state. There is therefore an incentive for the AI to move towards that state (e.g. by convincing the office intern to push the “shutdown button”). If instead there was negative utility in the “shutdown button” being pressed, the AI is incentivized to prevent the button from being pressed. If instead the AI had no utility function for whether the “shutdown button” was pressed or not, but there somehow existed a code segment that caused the shutdown process to happen if the “shutdown button” was pressed, then there existed a daughter AGI that has slightly more efficient code if this code segment is omitted. An AGI that has a utility function that provides utility for producing daughter AGIs that are more efficient versions of itself, is incentivized to produce such a daughter that has the “shutdown button” code segment removed.

There is a more detailed version of this description in https://intelligence.org/files/Corrigibility.pdf

I could be wrong about my conclusion about corrigiblity (and probably am), however it is my best intuition at this point.

William the Kiwi 15 Mar 2023 10:42 UTC
3 points
0 ∶ 0
on: You Should Write a Forum Bio
Hi I’m new to EA and have just written a bio. Thank you Aaron for encouraging me to do so.

William the Kiwi 12 Oct 2023 15:48 UTC
5 points
0 ∶ 0
on: Timelines are short, p(doom) is high: a global stop to frontier AI development until x-safety consensus is our only reasonable hope
I agree with most of your conclusions in this post. I feel uncomfortable. I’ll write more once I have processed some of my emotions and can think in a more clear manner.

William the Kiwi 16 Oct 2023 23:24 UTC
3 points
1 ∶ 0
in reply to: Chris Leong’s comment on: Timelines are short, p(doom) is high: a global stop to frontier AI development until x-safety consensus is our only reasonable hope
Hi Chris thanks for reaching out. Obviously things with the world aren’t ok, it seems insane that every country is staring down a massive national security risk and they haven’t done much about it.

How is movement building going?

William the Kiwi 16 Oct 2023 23:26 UTC
1 point
1 ∶ 0
in reply to: Greg_Colbourn’s comment on: Timelines are short, p(doom) is high: a global stop to frontier AI development until x-safety consensus is our only reasonable hope
Yea, thanks for the talk Greg, it was informative.

William the Kiwi 17 Oct 2023 1:38 UTC
1 point
0 ∶ 0
in reply to: titotal’s comment on: Timelines are short, p(doom) is high: a global stop to frontier AI development until x-safety consensus is our only reasonable hope
I would agree that “utopia in our lifetime” or “extinction” seems like a false dichotomy. What makes you say that you predict the bulk of the probability lies somewhere in the middle?

William the Kiwi 17 Oct 2023 1:52 UTC
5 points
1 ∶ 0
in reply to: Greg_Colbourn’s comment on: Timelines are short, p(doom) is high: a global stop to frontier AI development until x-safety consensus is our only reasonable hope
GPT4 is clearly above the median human when it comes to a range of exams. Do we have examples of GPT4′s comparison to the median human in non-exam like conditions?

William the Kiwi 17 Oct 2023 1:54 UTC
3 points
1 ∶ 0
in reply to: Greg_Colbourn’s comment on: Timelines are short, p(doom) is high: a global stop to frontier AI development until x-safety consensus is our only reasonable hope
I would agree, relying on pre-GPT4 estimates seems flawed.

William the Kiwi 17 Oct 2023 1:59 UTC
−1 points
1 ∶ 0
in reply to: Isaac Dunn’s comment on: Timelines are short, p(doom) is high: a global stop to frontier AI development until x-safety consensus is our only reasonable hope
What part of Greg writing comes across as over confident?

The theoretical computational limit of the Solar System is 1.47x10^49 bits per second.

William the Kiwi17 Oct 2023 2:52 UTC

12 points

7 comments1 min readEA link

William the Kiwi 17 Oct 2023 15:42 UTC
3 points
0 ∶ 0
in reply to: Vasco Grilo’s comment on: The theoretical computational limit of the Solar System is 1.47x10^49 bits per second.
Yea, I found him to be a fascinating person when I talked to him at EAGx Warsaw.

I’m initially sceptical of getting 40% of the mass-energy out of, well, anything. Perhaps I would benefit from reading more on black holes.

However I would in principle agree with the idea that if black holes are feasible power outputers, this would increase the theoretical maximum computation rate.

William the Kiwi 17 Oct 2023 16:30 UTC
1 point
0 ∶ 0
in reply to: Mo Putera’s comment on: The theoretical computational limit of the Solar System is 1.47x10^49 bits per second.
Hi Mo, thanks for the feedback.
1. Good thought, I’ve cross-posted it to my account there.
2. This post was spurred by a conversation I had about the upper limit of AI intelligence and the fact that it was likely very far above all humans combined. This is meant as, like you said, a pretty unobjectionable support for my then assumed conclusion. The conversion was heavily influenced by Cotra’s Bioanchors report.
3. I was estimating the brains computation ability very roughly. I guessed that there would be more detailed estimations already done, but would take time to read through and understand their premises. I’ll read through the document when I have some time.
4. These two look interesting to read.
5. Anders Sandberg is an interesting person. I speculated someone had done calculations similar to mine, I’m not surprised that he is one of such people.

William the Kiwi 18 Oct 2023 11:42 UTC
1 point
0 ∶ 0
in reply to: Mo Putera’s comment on: The theoretical computational limit of the Solar System is 1.47x10^49 bits per second.
An efficient idea, good thinking.

William the Kiwi 8 Feb 2024 9:18 UTC
5 points
1 ∶ 3
in reply to: Remmelt’s comment on: Why I think it’s net harmful to do technical safety research at AGI labs
“This distinction between ‘capabilities’ research and ‘safety’ research is extremely fuzzy, and we have a somewhat poor track record of predicting which areas of research will be beneficial for safety work in the future. This suggests that work that advances some (and perhaps many) kinds of capabilities faster may be useful for reducing risks.”
This seems like a absurd claim. Are 80k actually making it?
EDIT: the claim is made by Benjamin Hilton, one of 80k’s analysts and the person the OP is replying too.

William the Kiwi 8 Feb 2024 10:09 UTC
8 points
1 ∶ 1
on: Why I think it’s net harmful to do technical safety research at AGI labs
Of the four reasons you listed, reason 4 (safety washing) seems the most important. Safety-washing, alongside the related ethics-washing and green-washing are effective techniques that industry uses to increase peoples perception of the industry. Lizka wrote a post on this. These techniques are used by many industries, particularly by industries that produce significant externalities such as the oil industry. These techniques are used because they work, because they give people an out. It is easier to think about the shiny flowers on an ad than it is to think about the reality of an industry killing people.
Safety-washing of AI is harmful as it gives people an out, a chance to repeat the line “well at least they are allegedly doing some safety stuff”, which is a convenient distraction from the fact that AI labs are knowingly developing a technology that can cause human extinction. This distraction causes otherwise safety-conscious people to invest in or work in an industry that they would reconsider if they had access to all the information. By pointing out this distraction, we can help people make more informed decisions.

William the Kiwi 8 Feb 2024 10:18 UTC
2 points
0 ∶ 0
in reply to: Owen Cotton-Barratt’s comment on: Why I think it’s net harmful to do technical safety research at AGI labs
I would agree that this is a good summary:
Improving the quality/quantity of output from safety teams within AI labs has a (much) bigger impact on perceived safety of the lab than it does on actual safety of the lab. This is therefore the dominant term in the impact of the team’s work. Right now it’s negative.
If perception of safety is higher than actual safety, it will lead to underinvestment of future safety, which increases the probability of failure of the system.

William the Kiwi 8 Feb 2024 10:30 UTC
2 points
0 ∶ 0
in reply to: Remmelt’s comment on: Why I think it’s net harmful to do technical safety research at AGI labs
I would agree with Remmelt here. While upskilling people is helpful, if those people then go on to increase the rate of capabilities gain by AI companies, this is reducing the time the world has available to find solutions to alignment and AI regulation.

While, as a rule, I don’t disagree with an industries increasing their capabilities, I do disagree with this when those capabilities knowingly lead to human extinction.

William the Kiwi

The the­o­ret­i­cal com­pu­ta­tional limit of the So­lar Sys­tem is 1.47x10^49 bits per sec­ond.

The theoretical computational limit of the Solar System is 1.47x10^49 bits per second.