Hi, I’m Steve Byrnes, an AGI safety / AI alignment researcher in Boston, MA, USA, with a particular focus on brain algorithms. See https://sjbyrnes.com/agi.html for a summary of my research and sorted list of writing. Physicist by training. Email: steven.byrnes@gmail.com. Leave me anonymous feedback here. I’m also at: RSS feed, Twitter, Mastodon, Threads, Bluesky, GitHub, Wikipedia, Physics-StackExchange, LinkedIn
Steven Byrnes
Since “number of individual donations” (ideally high) and “average size of donations” (ideally low) seem to be frequent talking points among candidates and the press, and also relevant to getting into debates (I think), it seems like there may well be a good case for giving a token $1 to your preferred candidate(s). Very low cost and pretty low benefit. The same could be said for voting. But compared to voting, token $1 donations are possibly more effective (especially early in the process), and definitely less time-consuming.
This blog post suggests (based on Google Search Trends) that other coronavirus infections have typically gone down steadily over the course of March and April. (Presumably the data is dominated by the northern hemisphere.)
Update: this blog post is a much better-informed discussion of warm weather.
Again, this remark seems explicitly to assume that the AI is maximising some kind of reward function. Humans often act not as maximisers but as satisficers, choosing an outcome that is good enough rather than searching for the best possible outcome. Often humans also act on the basis of habit or following simple rules of thumb, and are often risk averse. As such, I believe that to assume that an AI agent would be necessarily maximising its reward is to make fairly strong assumptions about the nature of the AI in question. Absent these assumptions, it is not obvious why an AI would necessarily have any particular reason to usurp humanity.
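To make the maximiser / satisficer distinction in the passage above concrete, here is a minimal sketch. The function names, the toy option list, and the `good_enough` threshold are all illustrative assumptions, not anything from the quoted text.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def maximize(options: Iterable[T], reward: Callable[[T], float]) -> T:
    """Maximiser: examine every option and return the highest-reward one."""
    return max(options, key=reward)


def satisfice(
    options: Iterable[T],
    reward: Callable[[T], float],
    good_enough: float,  # hypothetical aspiration level
) -> Optional[T]:
    """Satisficer: return the first option whose reward clears the threshold.

    Returns None if no option is good enough.
    """
    for option in options:
        if reward(option) >= good_enough:
            return option
    return None


# Toy example: each option's reward is just its own value.
options = [3, 7, 5, 9, 4]
best = maximize(options, reward=lambda x: x)
acceptable = satisfice(options, reward=lambda x: x, good_enough=6)
```

With these toy inputs the maximiser keeps searching until it has seen everything, while the satisficer stops at the first option that clears the bar.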
Imagine that, when you wake up tomorrow morning, you will have acquired a magical ability to reach in and modify your own brain connections however you like.
Over breakfast, you start thinking about how frustrating it is that you’re in debt, and feeling annoyed at yourself that you’ve been spending so much money impulse-buying in-app purchases in Farmville. So you open up your new brain-editing console, look up which neocortical generative models were active the last few times you made a Farmville in-app purchase, and lower their prominence, just a bit.
Then you take a shower, and start thinking about the documentary you saw last night about gestation crates. ‘Man, I’m never going to eat pork again!’ you say to yourself. But you’ve said that many times before, and it’s never stuck. So after the shower, you open up your new brain-editing console, and pull up that memory of the gestation crate documentary and the way you felt after watching it, and set that memory and emotion to activate loudly every time you feel tempted to eat pork, for the rest of your life.
Do you see the direction that things are going? As time goes on, if an agent has the power of both meta-cognition and self-modification, any one of its human-like goals (quasi-goals which are context-dependent, self-contradictory, satisficing, etc.) can gradually transform itself into a utility-function-like goal (which is self-consistent, all-consuming, maximizing)! To be explicit: during the little bits of time when one particular goal happens to be salient and determining behavior, the agent may be motivated to “fix” any part of itself that gets in the way of that goal, until bit by bit, that one goal gradually cements its control over the whole system.
Moreover, if the agent does gradually self-modify from human-like quasi-goals to an all-consuming utility-function-like goal, then I would think it’s very difficult to predict exactly what goal it will wind up having. And most goals have problematic convergent instrumental sub-goals that could make them into x-risks.
...Well, at least, I find this a plausible argument, and don’t see any straightforward way to reliably avoid this kind of goal-transformation. But obviously this is super weird and hard to think about and I’m not very confident. :-)
(I think I stole this line of thought from Eliezer Yudkowsky but can’t find the reference.)
Everything up to here is actually just one of several lines of thought that lead to the conclusion that we might well get an AGI that is trying to maximize a reward.
Another line of thought is what Rohin said: We’ve been using reward functions since forever, so it’s quite possible that we’ll keep doing so.
Another line of thought is: We humans actually have explicit real-world goals, like curing Alzheimer’s and solving climate change etc. And generally the best way to achieve goals is to have an agent seeking them.
Another line of thought is: Different people will try to make AGIs in different ways, and it’s a big world, and (eventually by default) there will be very low barriers-to-entry in building AGIs. So (again by default) sooner or later someone will make an explicitly-goal-seeking AGI, even if thoughtful AGI experts pronounce that doing so is a terrible idea.
A nice short argument that a sufficiently intelligent AGI would have the power to usurp humanity is Scott Alexander’s Superintelligence FAQ Section 3.1.
I thought “taking tail risks seriously” was kinda an EA thing...? In particular, we all agree that there probably won’t be a coup or civil war in the USA in early 2021, but is it 1% likely? 0.001% likely? I won’t try to guess, but it sure feels higher after I read that link (including the Vox interview) … and plausibly high enough to warrant serious thought and contingency planning.
At least, that’s what I got out of it. I gave it a bit of thought and decided that I’m not in a position that I can or should do anything about it, but I imagine that some readers might have an angle of attack, especially given that it’s still 6 months out.
Thanks for writing this up!!
Although I have not seen the argument made in any detail or in writing, I and the Future of Life Institute (FLI) have gathered the strong impression that parts of the effective altruism ecosystem are skeptical of the importance of the issue of autonomous weapons systems.
I’m aware of two skeptical posts on EA Forum (by the same person). I just made a tag Autonomous Weapons where you’ll find them.
Just a little thing, but my impression is that CPUs and GPUs and FPGAs and analog chips and neuromorphic chips and photonic chips all overlap with each other quite a bit in the technologies involved (e.g. cleanroom photolithography), as compared to quantum computing which is way off in its own universe of design and build and test and simulation tools (well, several universes, depending on the approach). I could be wrong, and you would probably know better than me. (I’m a bit hazy on everything that goes into a “real” large-scale quantum computer, as opposed to 2-qubit lab demos.) But if that’s right, it would argue against investing your time in quantum computing, other things equal. For my part, I would put like <10% chance that the quantum computing universe is the one that will create AGI hardware and >90% that the CPU/GPU/neuromorphic/photonic/analog/etc universe will. But who knows, I guess.
I’m a physicist at a US defense contractor, I’ve worked on various photonic chip projects and neuromorphic chip projects and quantum projects and projects involving custom ASICs among many other things, and I blog about safe & beneficial AGI as a hobby … I’m happy to chat if you think that might help, you can DM me :-)
My understanding is that (1) to deal with the paperwork etc. for grants from governments or government-like bureaucratic institutions, you need to be part of an institution that’s done it before; (2) if the grantor is a nonprofit, they have regulations about how they can use their money while maintaining nonprofit status, and it’s very easy for them to forward the money to a different nonprofit institution, but may be difficult or impossible for them to forward the money to an individual. If it is possible to just get a check as an individual, I imagine that that’s the best option. Unless there are other considerations I don’t know about.
Btw Theiss is another US organization in this space.
Theiss was very much active as of December 2020. They’ve just been recruiting so successfully through word-of-mouth that they haven’t gotten around to updating the website.
I don’t think healthcare and taxes undermine what I said, at least not for me personally. For healthcare, individuals can buy health insurance too. For taxes, self-employed people need to pay self-employment tax, but employees and employers both have to pay payroll tax which adds up to a similar amount, and then you lose the QBI deduction (this is all USA-specific), so I think you come out behind even before you account for institutional overhead, and certainly after. Or at least that’s what I found when I ran the numbers for me personally. It may be dependent on income bracket or country so I don’t want to over-generalize...
That’s all assuming that the goal is to minimize the amount of grant money you’re asking for, while holding fixed after-tax take-home pay. If your goal is to minimize hassle, for example, and you can just apply for a bit more money to compensate, then by all means join an institution, and avoid the hassle of having to research health care plans and self-employment tax deductions and so on.
I could be wrong or misunderstanding things, to be clear. I recently tried to figure this out for my own project but might have messed up, and as I mentioned, different income brackets and regions may differ. Happy to talk more. :-)
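For the claim above that self-employment tax and combined employee-plus-employer payroll tax "add up to a similar amount", here is a rough sketch of just that one piece of the arithmetic. The rates are my understanding of the standard US figures (7.65% FICA on each side, 15.3% self-employment tax applied to 92.35% of net earnings), and the sketch deliberately ignores the Social Security wage cap, income tax, the QBI deduction, and institutional overhead, so treat it as an illustration rather than tax advice.

```python
# Assumed US rates; check current rules before relying on these.
FICA_RATE_EACH = 0.0765      # employee share; the employer pays the same again
SE_TAX_RATE = 0.153          # self-employment tax rate
SE_EARNINGS_FACTOR = 0.9235  # share of net earnings subject to SE tax


def payroll_tax_via_institution(budget: float) -> float:
    """Total payroll tax when `budget` must cover wage plus employer FICA.

    Assumption: the whole budget goes to salary and employer-side FICA,
    with no institutional overhead.
    """
    wage = budget / (1 + FICA_RATE_EACH)
    return 2 * FICA_RATE_EACH * wage


def self_employment_tax(budget: float) -> float:
    """Self-employment tax when `budget` is received directly as net earnings."""
    return SE_TAX_RATE * SE_EARNINGS_FACTOR * budget


budget = 100_000.0  # hypothetical grant size
via_institution = payroll_tax_via_institution(budget)
as_individual = self_employment_tax(budget)
print(f"payroll tax via institution: {via_institution:,.2f}")
print(f"self-employment tax:         {as_individual:,.2f}")
```

Under these assumptions both figures come to roughly 14% of the budget, so any difference in the overall comparison would have to come from the other factors (QBI deduction, overhead, and so on).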
For what it’s worth, I generally downvote a post only when I think “This post should not have been written in the first place”, and relatedly I will often upvote posts I disagree with.
If that’s typical, then the “controversial” posts you found may be “the most meta-level controversial” rather than “the most object-level controversial”, if you know what I mean.
That’s still interesting though.
Just one guy, but I have no idea how I would have gotten into AGI safety if not for LW … I had a full-time job and young kids and not-obviously-related credentials. But I could just come out of nowhere in 2019 and start writing LW blog posts and comments, and I got lots of great feedback, and everyone was really nice. I’m full-time now, here’s my writings, I guess you can decide whether they’re any good :-P
I feel like that guy’s got a LOT of chutzpah to not-quite-say-outright-but-very-strongly-suggest that the Effective Altruism movement is a group of people who don’t care about the Global South. :-P
More seriously, I think we’re in a funny situation where maybe there are these tradeoffs in the abstract, but they don’t seem to come up in practice.
Like in the abstract, the very best longtermist intervention could be terrible for people today. But in practice, I would argue that most if not all current longtermist cause areas (pandemic prevention, AI risk, preventing nuclear war, etc.) are plausibly a very good use of philanthropic effort even if you only care about people alive today (including children).
Or, in the abstract, AI risk and malaria are competing for philanthropic funds. But in practice, a lot of the same people seem to care about both, including many of the people that the article (selectively) quotes. …And meanwhile most people in the world care about neither.
I mean, there could still be an interesting article about how there are these theoretical tradeoffs between present and future generations. But it’s misleading to name names and suggest that those people would gleefully make those tradeoffs, even if it involves torturing people alive today or whatever. Unless, of course, there’s actual evidence that they would do that. (The other strong possibility is, if actually faced with those tradeoffs in real life, they would say, “Uh, well, I guess that’s my stop, this is where I jump off the longtermist train!!”).
Anyway, I found the article extremely misleading and annoying. For example, the author led off with a quote where Jaan Tallinn says directly that climate change might be an existential risk (via a runaway scenario), and then two paragraphs later the author is asking “why does Tallinn think that climate change isn’t an existential risk?” Huh?? The article could have equally well said that Jaan Tallinn believes that climate change is “very plausibly an existential risk”, and Jaan Tallinn is the co-founder of an organization that does climate change outreach among other things, and while climate change isn’t a principal focus of current longtermist philanthropy, well, it’s not like climate change is a principal focus of current cancer research philanthropy either! And anyway it does come up to a reasonable extent, with healthy discussions focusing in particular on whether there are especially tractable and neglected things to do.
So anyway, I found the article very misleading.
(I agree with Rohin that if people are being intimidated, silenced, or cancelled, then that would be a very bad thing.)
Hmm, I guess I wasn’t being very careful. Insofar as “helping future humans” is a different thing than “helping living humans”, it means that we could be in a situation where the interventions that are optimal for the former are very-sub-optimal (or even negative-value) for the latter. But it doesn’t mean we must be in that situation, and in fact I think we’re not.
I guess if you think: (1) finding good longtermist interventions is generally hard because predicting the far-future is hard, but (2) “preventing extinction (or AI s-risks) in the next 50 years” is an exception to that rule; (3) that category happens to be very beneficial for people alive today too; (4) it’s not like we’ve exhausted every intervention in that category and we’re scraping the bottom of the barrel for other things … If you believe all those things, then in that case, it’s not really surprising if we’re in a situation where the tradeoffs are weak-to-nonexistent. Maybe I’m oversimplifying, but something like that I guess?
I suspect that if someone had an idea about an intervention that they thought was super great and cost effective for future generations and awful for people alive today, well they would probably post that idea on EA Forum just like anything else, and then people would have a lively debate about it. I mean, maybe there are such things... Just nothing springs to my mind.
It’s possible much of that supposed additional complexity isn’t useful
Yup! That’s where I’d put my money.
It’s a foregone conclusion that a real-world system has tons of complexity that is not related to the useful functions that the system performs. Consider, for example, the silicon transistors that comprise digital chips: “the useful function that they perform” is a little story involving words like “ON” and “OFF”, but “the real-world transistor” needs three equations involving 22 parameters, to a first approximation!
By the same token, my favorite paper on the algorithmic role of dendritic computation has them basically implementing a simple set of ANDs and ORs on incoming signals. It’s quite likely that dendrites do other things too besides what’s in that one paper, but I think that example is suggestive.
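As a toy illustration of what "a simple set of ANDs and ORs on incoming signals" could look like, here is a minimal sketch. It is not taken from the paper; the two-level structure (AND within a dendritic branch, OR across branches) and all names are assumptions made purely for illustration.

```python
from typing import Sequence


def branch_active(synapses: Sequence[bool]) -> bool:
    """AND: a branch responds only if it has synapses and all are active."""
    return len(synapses) > 0 and all(synapses)


def neuron_fires(branches: Sequence[Sequence[bool]]) -> bool:
    """OR: the neuron fires if at least one branch is active."""
    return any(branch_active(branch) for branch in branches)


# Hypothetical neuron with two branches of two synapses each.
fires = neuron_fires([[True, True], [False, True]])    # first branch fully active
silent = neuron_fires([[True, False], [False, True]])  # no branch fully active
```

The point of the sketch is only that the "useful function" can be stated in a few lines of boolean logic, even though a biophysical model of the same neuron would need far more detail.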
Caveat: I’m mainly thinking of the complexity of understanding the neuronal algorithms involved in “human intelligence” (e.g. common sense, science, language, etc.), which (I claim) are mainly in the cortex and thalamus. I think those algorithms need to be built out of really specific and legible operations, and such operations are unlikely to line up with the full complexity of the input-output behavior of neurons. I think the claim “the useful function that a neuron performs is simpler than the neuron itself” is always true, but it’s very strongly true for “human intelligence” related algorithms, whereas it’s less true in other contexts, including probably some brainstem circuits, and the neurons in microscopic worms. It seems to me that microscopic worms just don’t have enough neurons to not squeeze out useful functionality from every squiggle in their neurons’ input-output relations. And moreover here we’re not talking about massive intricate beautifully-orchestrated learning algorithms, but rather things like “do this behavior a bit less often when the temperature is low” etc. See my post Building brain-inspired AGI is infinitely easier than understanding the brain for more discussion kinda related to this.
Addendum: In the other direction, one could point out that the authors were searching for “an approximation of an approximation of a neuron”, not “an approximation of a neuron”. (insight stolen from here.) Their ground truth was a fancier neuron model, not a real neuron. Even the fancier model is a simplification of real life. For example, if I recall correctly, neurons have been observed to do funny things like store state variables via changes in gene expression. Even the fancier model wouldn’t capture that. As in my parent comment, I think these kinds of things are highly relevant to simulating worms, and not terribly relevant to reverse-engineering the algorithms underlying human intelligence.
Let’s say a human writes code more-or-less equivalent to the evolved “code” in the human genome. Presumably the resulting human-brain-like algorithm would have valence, right? But it’s not a mesa-optimizer, it’s just an optimizer. Unless you want to say that the human programmers are the base optimizer? But if you say that, well, every optimization algorithm known to humanity would become a “mesa-optimizer”, since they tend to be implemented by human programmers, right? So that would entail the term “mesa-optimizer” kinda losing all meaning, I think. Sorry if I’m misunderstanding.
Have you read https://www.cold-takes.com/where-ai-forecasting-stands-today/ ?
I do agree that there are many good reasons to think that AI practitioners are not AI forecasting experts, such as the fact that they’re, um, obviously not—they generally have no training in it and have spent almost no time on it, and indeed they give very different answers to seemingly-equivalent timelines questions phrased differently. This is a reason to discount the timelines that come from AI practitioner surveys, in favor of whatever other forecasting methods / heuristics you can come up with. It’s not per se a reason to think “definitely no AGI in the next 50 years”.
Well, maybe I should just ask: What probability would you assign to the statement “50 years from today, we will have AGI”? A couple examples:
If you think the probability is <90%, and your intention here is to argue against people who think it should be >90%, well I would join you in arguing against those people too. This kind of technological forecasting is very hard and we should all be pretty humble & uncertain here. (Incidentally, if this is who you’re arguing against, I bet that you’re arguing against fewer people than you imagine.)
If you think the probability is <10%, and your intention here is to argue against people who think it should be >10%, then that’s quite a different matter, and I would strongly disagree with you, and I would be very curious how you came to be so confident. I mean, a lot can happen in 50 years, right? What’s the argument?
...And even if it could miraculously be prevented from actually causing any local negative weather events in other countries, it would certainly be perceived to do so, because terrible freak droughts/floods/etc. will continue to happen as always, and people will go looking for someone to blame, and the geoengineering project next door will be an obvious scapegoat.
Like how the US government once tried to use cloud-seeding (silver iodide) to weaken hurricanes, and then one time a hurricane seemed to turn sharply and hit Georgia right after being seeded, and everyone blamed the cloud-seeding, and sued, and shut the program down, …even though it was actually a coincidence! (details) (NB: I only skimmed the Wikipedia article, I haven’t checked anything)