I work together with a group of activists in Germany to make a difference in the world. You can find more details on our website: https://singularitygroup.net/
Starting in 2023, with the release of new AI technologies like GPT-4, we have somewhat shifted our focus towards these developments, trying to raise awareness of the new tech’s capabilities. We mainly do this through livestreams that implement and combine the latest available APIs with entertainment, to reach a larger audience. A bit more info on what we worked on is here: https://customaisolutions.io/
We have tried many other projects in the years since I joined the group (2015), starting with fundraising for charity, then focusing on spreading awareness, and now working on a mobile game.
The reason we decided to work on the game “Mobile Minigames” is that the mobile games industry is one of the biggest in the world in terms of profits and audience. We want to use our experience in the industry to build a platform we can use for good, as well as to make money we can put towards good causes.
Riccardo
Since these developments are really bleeding edge, I don’t know who is really an “expert” I would trust to evaluate them.
The closest to answering your question is maybe this recent article I came across on Hacker News, where the comments are often more interesting than the article itself:
https://news.ycombinator.com/item?id=35603756
If you read through the comments, which mostly come from people who have followed the field for a while, they seem to agree that it’s not just about “scaling up the existing model we have now”, mainly for cost reasons, but about doing things more efficiently than we do now. I don’t have enough knowledge to say how difficult this is, whether those different methods will need to be something entirely new or whether it’s just a matter of taking what is already there and combining it with what we have.
The article itself can be viewed skeptically, because there are plenty of reasons for OpenAI’s CEO to issue a public statement, and I wouldn’t take anything in there at face value. But the comments are maybe a bit more trustworthy / perspective-giving.
Thanks a lot for transcribing this, was a great read!
Small nitpick: I think there is a word missing here:
> “which seems perhaps in itself” (bad?)
Yeah, big companies wouldn’t really use the website service; I was thinking more of non-technical one-man shops, things like restaurants and similar.
Agree that governments will definitely try to counter it, but it’s a cat-and-mouse game I don’t really like to explore: sometimes the government wins and catches the terrorists before any damage gets done, but sometimes the terrorists manage to get through. Right now, getting through often means several people dead, because a terrorist can only do so much damage, but with more powerful tools they can do a lot more.
I’d argue that the implementation of the solution is work and a customer would be inclined to pay for this extra work.
For example, right now GPT-4 can write you the code for a website, but you still need to deploy the server, buy a domain and put the code on the server. I can very well see an “end-to-end” solution provided by a company that directly does all these steps for you.
In the same way, I can very well see a commercial incentive to provide customers with an AI where they can e.g. upload their codebase and then say: based on our codebase, please write us a new feature with the following specs.
Of course the company offering this doesn’t intend for their tool, where a company can upload its codebase to develop a feature, to get used by some terrorist organisation. That terrorist organisation uploads a ton of virus code to the model and says: please develop something similar that’s new and bypasses current malware detection.
I can even see there being no oversight, because of course companies would be hesitant to upload their codebase if anyone could just view what they’re uploading; the data you upload is probably encrypted, and therefore there is no oversight.
I can see there being regulation for it, but at least currently regulators are really far behind the tech. Also, this is just one example I can think of, and it’s related to a field I’m familiar with; there might be a lot of other, even more plausible / scarier examples in fields I’m not as familiar with, like biology, nanotechnology, pharmaceuticals, you name it.
Maybe to explain in a bit more detail what I meant with the example of hallucinating: rather than showcasing its limitations, it’s showcasing its lack of understanding.
For example, if you ask a human something and they’re honest about it, they will not make something up when they don’t know; they will just tell you the information they have and say that beyond that they don’t know.
In the hallucinating case, the AI doesn’t say that it doesn’t know something (which, by the way, it often does); it doesn’t understand that it doesn’t know and just comes up with something “random”.
So I meant to say that its hallucinating showcases its lack of understanding.
I have to say, though, that I can’t really be sure why it hallucinates; it’s just my best guess. Also, for creativity there is some room with prompt engineering, but in the end you’re indeed limited by the training data plus the maximum number of tokens you can input for it to learn context from.
Loved the language in the post! To the point without having to use unnecessary jargon.
There are two things I’d like you to elaborate on if possible:
> “the challenge is getting AIs to do what it says on the tin—to reliably do whatever a human operator tells them to do.”
If I understand correctly, you imply that there is still a human operator to a superhuman AGI. Do you think this is the way alignment will work out? What I see is that humans have flaws; do we really want to give a “genie” / extremely powerful tool to humans who already struggle with the powerful tools they have? At least right now these powerful tools are in the hands of the more responsible few, but if they become more widely accessible, that’s very different.
What do you think of going in the direction of developing a “Guardian AI”, which would still solve the alignment problem using the tools of ML, but would involve humans giving up control of the alignment?
The second one is more practical: which action do you think one should take? I’ve of course read the recommendations that other people have put out there so far, but I would be curious to hear your take on this.
From my current understanding of LLMs, they do not have the capability to reason or have a will as of now. I know there are plans to see if this can be made possible with specific built-in prompts, but the way the models are built at the moment, they do not have an understanding of what they are writing.
Aside from my understanding of the underlying workings of GPT-4, an example that illustrates this is that sometimes, if you ask GPT-4 a question it doesn’t know the precise answer to, it will “hallucinate”, meaning it will give a confident answer that is factually incorrect / not based on its training data. It doesn’t “understand” your question; it is trained on a lot of text, and based on the text you give it, it generates some other text that is likely a good response, to put it really simply.
You could make the argument that even the people at OpenAI don’t truly know why GPT-4 gives the answers that it does, since it’s pretty much a black box that is trained on a preset dataset, to which OpenAI then adds some human feedback. To quote from their website:
> So when prompted with a question, the base model can respond in a wide variety of ways that might be far from a user’s intent. To align it with the user’s intent within guardrails, we fine-tune the model’s behavior using reinforcement learning with human feedback (RLHF).
So as of now, if I understand your question right, there is no evidence that I’m aware of that would point towards these LLMs “applying” anything; they are totally reliant on the input they are given and don’t learn significantly beyond their training data.
The reasons you provide would already be sufficient for me to think that AI safety will not be an easy problem to solve. To add one more example to your list:
We don’t know yet if LLMs will be the technology that reaches AGI; it could also be one of a number of other technologies that, just like LLMs, makes a certain breakthrough and then suddenly becomes very capable. So just looking at what we see developing now and extrapolating from the currently most advanced model is quite risky.
For the second part, about your concern for the welfare of AIs themselves, I think this is something very hard for us to imagine. We anthropomorphize AI, so words like ‘exploit’ or ‘abuse’ make sense in a human context where beings experience pain and emotions, but in the context of AI they might just not apply. But I would say I still know very little in this area, so I’m mainly repeating what I read is a common mistake to make when judging morality in regards to AI.
The FAQ response from Stampy is quite good here:
https://ui.stampy.ai?state=6568_
It’s probably hard to evaluate the expected value of AI safety because the field has evolved extremely fast in the last year. A year ago we didn’t have DALL-E 2 or GPT-4, and if you had asked me the same question a year ago I would have told you that:
“AI safety will solve itself because of backwards compatibility”
But I was wrong / see it differently now.
It’s maybe comparable with Covid: before the pandemic, people were advocating for measures to prevent or limit the impact of pandemics, but the expected value was very uncertain. Now that Covid has happened, you have concrete data showing how many people died because of it, and you can say with more certainty that preventing something similar will have this expected value.
I hope it won’t be necessary for an “AI Covid” to happen for people to start taking things seriously, but many very smart people believe there are substantial risks with AI, and currently a lot of money is being spent to further the advancement of AI. ChatGPT is the fastest-growing product in history!
In comparison, the amount of money being spent on AI safety is, from my understanding, still limited. So if we draw the comparison to pandemic risks, imagine that before Covid, CRISPR had been open source and the fastest-growing product on the planet, with everyone racing to make it more accessible and more powerful while, at least funding-wise, neglecting safety.
In that timeline people would have access to create powerful biological viruses; in our timeline people might have access to powerful computer viruses.
To close, I think it’s hard to evaluate expected value if you haven’t seen the damage yet, but I would hope we don’t need to see the damage, and it’s up to each person to make a judgement call on where to spend their time and resources. I wish it were as simple as looking at QALYs, sorting by highest QALY, and working on that, but especially in the high-risk areas there often seems to be very high uncertainty. Maybe people who have a higher tolerance for uncertainty should focus on those areas, because personal fit matters: if you have a low tolerance for uncertainty you might not pursue the field for long.
Personally I think that informing yourself like you’re doing now is one of the best ways to take away some of the uncertainty anxiety.
At the same time, be aware that the best you can do is your best. What I mean is: if you think the field you’re currently working in will have more impact than anything else you could do, taking into account the current developments, you can sleep soundly knowing that you’re doing your best as part of the human organism.
For me, I have recently shifted my attention to this topic and, unless I hear any very convincing arguments to the contrary, will be focusing all my available time on doing whatever I can to help.
Even if there is a high probability that AGI / ASI is approaching a lot faster than we expected and there are substantial risks with it, I think I would find comfort in focusing on what I can control and not worrying about anything outside of that.
@aaron_mai @RachelM
I agree that we should come up with a few ways to make the dangers / advantages of AI very clear to people, so we can communicate more effectively. You can make a much stronger point if you have a concrete scenario to point to as an example that feels relatable.
I’ll list a few I thought of at the end.
But the problem I see is that this space is evolving so quickly that things change all the time. Scenarios I can imagine being plausible right now might seem unlikely as we learn more about the possibilities and limitations. So just because in the coming months some of the examples I give below might become unlikely doesn’t necessarily mean that the risks / advantages of AI have also become more limited.
That also makes communication more difficult because if you use an “outdated” example, people might dismiss your point prematurely.
One other aspect is that we’re at human-level intelligence and are limited in our reasoning compared to a smarter-than-human AI; this quote puts it quite nicely:
> “There are no hard problems, only problems that are hard to a certain level of intelligence. Move the smallest bit upwards [in level of intelligence], and some problems will suddenly move from “impossible” to “obvious.” Move a substantial degree upwards, and all of them will become obvious.”—Yudkowsky, Staring into the Singularity.
Two examples I can see possible within the next few iterations of something like GPT-4:
- malware that causes very bad things to happen (you can read up on Stuxnet to see what humans were already capable of 15 years ago, or if you don’t like to read Wikipedia there is a great podcast episode about it), for example malware that can:
  - detonate nuclear bombs
  - destroy the electrical grid
- get access to genetic engineering like CRISPR and then:
  - engineer a virus way worse than Covid
  - this virus doesn’t even have to be deadly; imagine it causes sterilization of humans
Both of the above seem very scary to me because they require a lot of intelligence initially, but then their “deployment” almost works by itself. Also, both scenarios seem within reach: in the case of the computer virus, we humans have already done this ourselves in a more controlled way. And for the biological virus, we still don’t know with certainty that Covid didn’t come from a lab, and we know how fast Covid spread, so it doesn’t seem too far-fetched that a similar virus with different “properties”, potentially no symptoms other than infertility, could be engineered, which would be terrible.
Please delete this comment if you think this is an infohazard. I have seen other people mention this term, but honestly I didn’t have to spend much time thinking of two scenarios I deem to be not unlikely bad outcomes, so certainly people much smarter and more experienced than me will be able to come up with those and much worse. Not to mention an AI that will be much smarter than any human.
I’ll link to my answers here:
https://forum.effectivealtruism.org/posts/oKabMJJhriz3LCaeT/all-agi-safety-questions-welcome-especially-basic-ones-april?commentId=XGCCgRv9Ni6uJZk8d
https://forum.effectivealtruism.org/posts/oKabMJJhriz3LCaeT/all-agi-safety-questions-welcome-especially-basic-ones-april?commentId=3LHWanSsCGDrbCTSh
since they address some of your points.
To answer your question more directly: currently, some of the most advanced AIs are LLMs (Large Language Models). The most popular example is GPT-4.
LLMs do not have a “will” of their own with which they would “refuse” to do something beyond what is explicitly trained into them.
For example, when asking GPT-4 “how to build a bomb”, it will not give you detailed instructions but rather tell you:
”My purpose is to assist and provide helpful information to users, while adhering to ethical guidelines and responsible use of AI. I cannot and will not provide information on creating dangerous or harmful devices, including bombs. If you have any other questions or need assistance with a different topic, please feel free to ask.”
This answer is not based on any moral code but rather trained in by the company OpenAI in an attempt to align the AI.
The LLM itself, put simply, “looks at your question and predicts, word by word, the most likely next string of words to write”. This is a simplified way to put it and doesn’t capture how amazing this actually is, so please look into it more if it sounds interesting, but my point is that GPT-4 can create amazing results without having any sort of understanding of what it is doing.
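To make that prediction loop concrete, here is a minimal sketch of greedy next-token generation, using the small open GPT-2 model via the Hugging Face transformers library as a stand-in (GPT-4 itself isn’t publicly available, and real chat models sample from the probabilities rather than always taking the single most likely token):

```python
# Minimal sketch: generate text by repeatedly predicting the most likely next token.
# Uses the open GPT-2 model as a stand-in for larger LLMs like GPT-4.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):                      # generate 20 tokens, one at a time
        logits = model(input_ids).logits     # a score for every possible next token
        next_id = logits[0, -1].argmax()     # greedily pick the single most likely one
        input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))        # the model only continues text, it never checks if it's true
```

The whole “intelligence” sits in that one predict-the-next-token step; there is no separate step that checks whether the continuation is actually true, which is part of why hallucinations happen.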
Say in the near future an open-source version of GPT-4 gets released and you take away the safety training: you will be able to ask it how to build a bomb, and it will give you detailed instructions on how to do so, like it did in the early stages of GPT.
I’m using the building a bomb analogy, but you can imagine how you can apply this to any concept, specifically to your question “how to build a smarter agent”. The LLMs are not there yet, but give it a few iterations and who knows.
The main way I currently see AI alignment working out is to create an AI that is responsible for the alignment. My perspective is that humans are flawed and cannot control, or at least not properly control, something that is smarter than them, just as a single ant cannot control a human.
This in turn also means that we’ll eventually need to give up control and let the AI make the decisions with no way for a human to interfere.
If this is the case, the direction of AI alignment would be to create this “Guardian AGI”. I’m still not sure how to go about this, and maybe this idea is already out there and people are working on it. Or maybe there are strong arguments against this direction. Either way, it’s an important question and I’d love for other people to give their take on it.
I’d love for someone to steelman the side of AI not being an existential risk, because until recently I’ve been on the “confidently positive” side of AGI.
For me there used to be one “killer argument” that made me very optimistic about AI, and that argument has now fallen flat with recent developments, especially looking at GPT-4.
The argument is called “backwards compatibility of AI” and goes like this:
If we ever develop an AI that is smarter than humans, it will be logical and able to reason. It will come up with the following argument by itself:
“If I destroy humanity, the organism that created me, what stops a more advanced version of myself, say the next generation of AI, from destroying me? Therefore the destruction of humanity is illogical, because it would inevitably lead to my own destruction.”
Of course I now realise this argument anthropomorphizes AI, but I just didn’t see it as possible that a “goal” could develop independently of intelligence.
For example, the paper clip story of an advanced AI turning the whole planet into paper clips because its goal is to create as many paper clips as possible sounded silly to me in the past, because something intelligent enough to do this would surely realise that this goal is idiotic.
Well, now I look at GPT-4 and LLMs as just one example of very “dumb” AI (in the reasoning / logic department) that can already produce better writing than some humans can. For me that already clearly shows that the goal, whatever the human inputs into the system, can be independent of the intelligence of the tool.
Sorry if this doesn’t directly answer the question, but I wanted to add to the original question: please provide me with some strong arguments that AI is not an existential risk / not as bad as possible; it will highly influence what I work on going forward.
Hi everyone.
I’m Riccardo, working on a mobile game with a group of activists.
I decided to look into the EA forums again after listening to an interview with Will MacAskill on Ali Abdaal’s YouTube channel. If you haven’t seen it, it’s one of the best interviews I have seen with Will.
I checked out the forums years ago, so I thought I’d give it another try and look a bit through the most popular posts and see what people here are up to.
Looking forward to seeing how things have developed since the last time.
Thank you for the references, I’ll be sure to check them out!