#201 – Why your robot butler isn’t here yet (Ken Goldberg on The 80,000 Hours Podcast)

We just published an interview: Ken Goldberg on why your robot butler isn’t here yet. Listen on Spotify or click through for other audio options, the transcript, and related links. Below are the episode summary and some key excerpts.
Episode summary
Perception is quite difficult with cameras: even if you have a stereo camera, you still can’t really build a map of where everything is in space. It’s just very difficult. And I know that sounds surprising, because humans are very good at this. In fact, even with one eye, we can navigate and we can clear the dinner table.
But it seems that we’re building in a lot of understanding and intuition about what’s happening in the world and where objects are and how they behave. For robots, it’s very difficult to get a perfectly accurate model of the world and where things are. So if you’re going to go manipulate or grasp an object, a small error in that position will maybe have your robot crash into the object, a delicate wine glass, and probably break it. So the perception and the control are both problems.
— Ken Goldberg
In today’s episode, host Luisa Rodriguez speaks to Ken Goldberg — robotics professor at UC Berkeley — about the major research challenges still ahead before robots become broadly integrated into our homes and societies.
They cover:
Why training robots is harder than training large language models like ChatGPT.
The biggest engineering challenges that still remain before robots can be widely useful in the real world.
The sectors where Ken thinks robots will be most useful in the coming decades — like homecare, agriculture, and medicine.
Whether we should be worried about robot labour affecting human employment.
Recent breakthroughs in robotics, and what cutting-edge robots can do today.
Ken’s work as an artist, where he explores the complex relationship between humans and technology.
And plenty more.
Producer: Keiran Harris
Audio engineering: Dominic Armstrong, Ben Cordell, Milo McGuire, and Simon Monsour
Content editing: Luisa Rodriguez, Katy Moore, and Keiran Harris
Transcriptions: Katy Moore
Highlights
Moravec’s paradox
Ken Goldberg: Moravec’s paradox can simply be summed up as: What’s hard for humans, like lifting heavy objects, is easy for robots; and what’s easy for humans, like stacking some blocks on a table or cleaning up after a dinner party, is very hard for robots. That’s the paradox, and it’s been true for 35 years, since Hans Moravec — who’s still around; he’s based in Pittsburgh — observed this, and correctly labelled it as a paradox. Because it is counterintuitive, right? Why should we have this differential? But it’s still true today. So the paradox itself is undeniable, I think.
But why this paradox holds is a very interesting question. And you raised one possible explanation about evolution: that humans have had the benefit of millions of years of evolution. We’ve evolved these reactions and these sensory capabilities that give us the fundamental substrate that lets us perform these tasks. And you’re right that spoken language is much more recent, comparatively.
I can tell you from one perspective: let’s take it from the dimensionality perspective. This means how many degrees of freedom do you have in a system? So a one-dimensional system is you’re just moving along a line, two dimensions is you’re moving in a plane, and then three dimensions is you’re moving in space.
But then for robots, you also have to think about an object in space: it has its position in space. So here’s my glasses, and they’re in a position in space, but then there’s also their orientation in space. So that’s three more degrees of freedom. So there’s six degrees of freedom for an object moving in space. That’s what we typically talk about with robots. That’s why robots need at least six joints to be able to achieve an arbitrary position and orientation in space. So that’s six degrees of freedom.
But now if you add all the nuances, let’s say, of a hand with fingers, you now have maybe 20 degrees of freedom, right? So each time you do that, you think of a higher-dimensional space, and each one of those is much bigger. It grows exponentially with the number of degrees of freedom. That is an exponential. There’s no doubt about it. That’s a real exponential.
And if you look at language, it’s actually a beautiful example of this, because it’s really linear: it’s one-dimensional. Language is a sequence, a string of words. And the set of words is relatively small: something like 20,000 words are typically used in, say, English. So you’re now looking at all the different combinations of those 20,000 words in a linear sequence. So there’s many — a huge, vast number — but it’s much smaller than, say, the combination of motions that can occur in space. Because those are many infinities greater than the number of sentences that can be put together.
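A rough back-of-the-envelope sketch of that growth (an editorial illustration, not from the episode; the 100-steps-per-joint resolution is an assumption chosen purely to make the arithmetic visible):

```python
# Editorial sketch: exponential growth of a discretised configuration space.
# BINS_PER_DOF = 100 is an assumed resolution, chosen only for illustration.

BINS_PER_DOF = 100

for dof in (1, 2, 3, 6, 20):
    configs = float(BINS_PER_DOF) ** dof
    print(f"{dof:>2} DoF -> {configs:.0e} discrete configurations")

# 1 DoF -> 1e+02 ... 6 DoF -> 1e+12 ... 20 DoF -> 1e+40: every added
# joint multiplies the space by another factor of 100. And the real
# configuration space is continuous, so these finite counts only
# understate the problem.
```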
Successes in robotics to date
Luisa Rodriguez: In general, what are the kind of characteristics of tasks that robots have been able to master?
Ken Goldberg: I would start by saying there’s two areas that have been very successful for robots in the past decade. The first is quadrotors — that is, drones. That was a major set of breakthroughs that were really in the hardware and control areas, and that allowed these flying robots, drones, to be able to hover stably in space. And then once you could get them to hover, you could start to control their movement very precisely. And that has been a remarkable set of developments. And now you see spectacular results of this. If you’ve seen some of these drone sky exhibitions — where they’re in formations, moving around in three dimensions — it’s incredible.
Luisa Rodriguez: Yeah, they’re incredible.
Ken Goldberg: Yeah. And that has also been very useful for inspection and for photography. They’re extremely widely used in Hollywood, and in home sales — typically, drones fly around and give you these aerial views. So that’s been a really big development, and there’s a lot of beautiful technology behind that.
Another one, interestingly, is also quadrupeds, which are four-legged robots. And those are the dogs, like you see from Boston Dynamics, which was really a pioneer there, but now there’s many of those. In fact, there’s a Chinese company, Unitree, that sells one for about $2,000 on eBay — and it’s amazing, because they’ve pretty much gotten very similar functionality. It’s not as robust, and it’s smaller and more lightweight, but it has the ability to walk over very complex terrain. You can take it outside and it will climb over rubble and rocks and things like that very well. And it just works out of the box.
Luisa Rodriguez: Cool!
Ken Goldberg: That was, again, the result of a number of things: new advances in motors and the hardware, but also in the control. And here, there were a lot of nuances in being able to control the legs, and learning played a key role there. In particular, this technique called model predictive control — which is an older technique, but, combined with deep learning, was able to address that problem and basically get these systems to be able to walk over very complex terrain and adapt, and even jump over things. And you can probably see that they can fall, roll down some stairs, and get up and keep going. So they can roll over, and they can, in some cases, do a backflip, which is really incredible.
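As a minimal sketch of the model predictive control loop Ken mentions (an editorial illustration on a toy 1-D system, not the learned controllers used on real quadrupeds): sample many candidate control sequences, roll each out through a dynamics model, execute only the first action of the best one, then replan.

```python
import numpy as np

# Editorial sketch: random-shooting MPC on a toy 1-D double integrator
# (state = [position, velocity], control = force). Real legged-robot
# controllers pair MPC with learned models; this only shows the
# plan / execute-first-action / replan loop.

DT, HORIZON, N_SAMPLES = 0.05, 20, 256
TARGET = np.array([1.0, 0.0])  # reach position 1.0 with zero velocity

def step(state, u):
    pos, vel = state
    return np.array([pos + vel * DT, vel + u * DT])

def plan(state, rng):
    """Sample control sequences, simulate each, return the best first action."""
    best_cost, best_u0 = np.inf, 0.0
    for _ in range(N_SAMPLES):
        us = rng.uniform(-1.0, 1.0, HORIZON)
        s, cost = state, 0.0
        for u in us:
            s = step(s, u)
            cost += np.sum((s - TARGET) ** 2)
        if cost < best_cost:
            best_cost, best_u0 = cost, us[0]
    return best_u0

rng = np.random.default_rng(0)
state = np.array([0.0, 0.0])
for _ in range(200):   # execute only the first action, then replan
    state = step(state, plan(state, rng))
print(state)           # should land near [1.0, 0.0]
```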
Why perception is a big challenge for robotics
Ken Goldberg: Perception is quite difficult with cameras: even if you have a stereo camera, you still can’t really build a map of where everything is in space. It’s just very difficult. And I know that sounds surprising, because humans are very good at this. In fact, even with one eye, we can navigate and we can clear the dinner table.
But it seems that we’re building in a lot of understanding and intuition about what’s happening in the world and where objects are and how they behave. For robots, it’s very difficult to get a perfectly accurate model of the world and where things are. So again, if you’re going to go manipulate or grasp an object, a small error in that position will maybe have your robot crash into the object, a delicate wine glass, and probably break it. So the perception and the control are both problems.
Then the other one that’s more subtle and I think really interesting is in physics. The nuance there is that you can imagine, just take up a pencil or pen, put it in front of you on a flat table, and then just push it with one finger forward. What will happen is it’ll move for a minute, and then it’ll rotate away from your finger as you do it. Now, why does that happen? It turns out that it really depends on the very microscopic details of the surface of your table and the shape of your pencil. Those are essentially making contacts and breaking contacts as you’re moving it along, as you’re pushing it.
The nature of those contacts is impossible to perceive, because they’re underneath the pencil. So you, by looking down, can’t see what’s going on, so you really can’t predict how that pencil is going to move. And this is an example of really an undecidable problem. I mean, it’s unsolvable. And I like to say you don’t have to go into quantum physics to find the very difficult problems that inherently cannot be solved. It’s not a matter of getting better sensors or better physics models, but we just can never predict that. We’ll never be able to, because it depends on these conditions which we can’t perceive.
So we have to find ways to compensate for all three of these factors: the control errors, the perception errors, and the physics errors, I call it. And humans do it. We have an existence proof. We know it can be done. And humans have the same limitations. So how do we do it? That’s the million-dollar question, or billion-dollar question.
Luisa Rodriguez: Yeah. No kidding. Well, that made me want to ask: you said it’s an unsolvable problem, and maybe I’m just kind of misunderstanding, but do you mean that it’s unsolvable with the perception we have now for robotics? Because I don’t know exactly where the pencil I push on the table is going to end up, but I do manipulate pencils and make pretty good predictions about what would happen if I nudged it forward.
Ken Goldberg: Yeah. So you do. But what I mean is that you cannot predict exactly where that pencil is going to move if you start to push it.
Luisa Rodriguez: So humans can’t do it as well, this thing that you’re talking about?
Ken Goldberg: Right, right. No one can. I’m saying that no model in the universe can solve this problem, because it depends on what’s happening at the really almost submicroscopic level that is basically going to influence how that pencil responds to being pushed. It’s friction, but in a very nuanced way. And in friction, we have this model we all learn in college or in high school called Coulomb friction. It’s a reasonable approximation, but it’s a rough approximation to the real world, and it doesn’t really describe what happens when you push an object across a surface, how it’s going to move. So this is known to be a very nuanced and subtle problem, and it’s right in front of us.
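For reference, an editorial note rather than a quote: the Coulomb model taught in school bounds the tangential friction force by the normal force, and in quasi-static pushing the net friction on the object integrates the support-pressure distribution under it, which is exactly the quantity no camera can see. A minimal sketch in standard notation:

```latex
% Coulomb friction: tangential force limited by the normal force
\|f_t\| \le \mu \, f_n

% Quasi-static pushing: the friction force on the pencil sums
% contributions over the hidden contact patch S, weighted by the
% unknown support-pressure distribution p(x):
f = -\int_{S} \mu \, p(x) \, \hat{v}(x) \, \mathrm{d}A
```

Here \hat{v}(x) is the local sliding direction. Because p(x) depends on surface details too small to perceive, the integral, and hence how the pencil rotates, cannot be predicted exactly.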
And you’re right: we solve it. “Solve it” means not to predict it exactly; what we do is we compensate for the error by the scooping motion, where we move our fingers in a way around the object. And we have a word for that: we call it “caging.” Caging is where you put your fingers around an object in such a way that it is caged, meaning that it can’t escape. It can move around, it can rattle around inside that cage, but it can’t get out of the cage. Once you put your fingers in a cage around the object, now you start to close your fingers, there’s nowhere for the object to go, so it generally will end up in your [hand]. You’ll be able to pick it up.
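A minimal sketch of that caging idea in the simplest possible setting (an editorial illustration with assumed geometry, not a general algorithm): a disc of radius r surrounded by point fingers is caged when no gap between angularly adjacent fingers is wide enough for the disc to slip through.

```python
import math

# Editorial sketch: caging a disc of radius r with point fingers.
# Heuristic check, valid when the fingers already surround the disc:
# the disc cannot escape if every gap between angularly adjacent
# fingers is narrower than the disc's diameter. General caging tests
# for arbitrary shapes are far more involved than this.

def disc_is_caged(fingers, center, r):
    cx, cy = center
    # order the fingers by angle around the disc's current centre
    ring = sorted(fingers, key=lambda f: math.atan2(f[1] - cy, f[0] - cx))
    # check every gap between neighbours, wrapping around the ring
    return all(math.dist(a, b) < 2 * r
               for a, b in zip(ring, ring[1:] + ring[:1]))

# Three fingers in a tight triangle around a unit disc at the origin:
print(disc_is_caged([(0.0, 1.1), (-0.95, -0.55), (0.95, -0.55)],
                    (0.0, 0.0), 1.0))   # True: no gap admits the disc
```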
Why low fault tolerance makes some skills extra hard to automate
Luisa Rodriguez: Why do you think we won’t have fully autonomous robot surgeons in the next 30 or 40 years?
Ken Goldberg: The issue here is fault tolerance. I’m glad you brought it up, because this is why self-driving cars are particularly complicated and challenging: a small fault, a small error, could be quite disastrous, as you know. If you’re driving on a cliff, a small error and you go over the side. Or you bump into a stroller, run over a kid. So driving is very challenging because of that, in contrast to logistics — because in logistics, if you drop a package, it’s no big deal. In fact, it happens all the time; they expect it to happen fairly often. So if something like 1% of packages get dropped, it’s OK, that’s not a big deal. You can live with it.
But driving is not very fault tolerant; in surgery, even less so. You have to be really careful because you don’t want to puncture an organ or something, or sew two things together that shouldn’t be sewn together, right? So there’s a huge consequence.
The other thing is perception, because inside the body it’s very challenging: oftentimes there’s blood, or if it’s a laparoscopic surgery, you’re constantly, essentially, trying to avoid the blood and staunch the blood so that you can see what’s going on.
And this is where, just as you were describing watching someone crack an egg, surgeons have developed this really good intuition — because they know what the organs are, they know what they should look like, how they’re positioned, how thick or rough they are, let’s say, and what their surfaces and their materials are like.
So they have very good intuition behind that, so they can operate. Sometimes you cut a blood vessel and the whole volume fills with blood, and now you have to find that blood vessel and clamp it, so that you can stop the blood. And that’s like reaching into a sink filled with murky water and finding the thing, right? Surgeons are very good at that, and it’s a lot of nuance.
So the perception problem is extremely difficult, because everything is deformable. Deformable materials are particularly difficult for robots. We talked about cracking an egg or clearing a dinner table: generally, all those objects are rigid. But when you start getting into deformable things — like cables or fabrics or bags, or a human body, right now — all of a sudden, everything is just bending and movable in very complex ways. And that’s very hard to model, simulate, or perceive.
Luisa Rodriguez: Right. Yeah, I’m just finding it fascinating how the categories of things that are really troublesome, thorny problems for robots are just not what I’d expect. I mean, the fact that we’re making progress on suturing, but it gets really complicated as soon as an organ… You know, you could move it and it’s hard to predict how it’s going to look when it moves or where it’s going to be. It is just unexpected and really interesting.
Ken Goldberg: Absolutely. And as you’re saying this, I’m thinking, going back to the kitchen — you know, kitchen workers in restaurants — there’s so much nuance going on there, if you’re chopping vegetables or you’re unpacking things. Let’s say every single chicken breast is slightly different. So being able to manipulate those kinds of things, and then clean surfaces, and wipe things, and stir — there’s so many complex nuances.
So I think it’s going to be a long time before we have really fully automated kitchen systems. And the same is true for plumbers, carpenters, and electricians. Anyone who’s basically doing these kinds of manual tasks, like fixing a car: those tasks require a vast amount of nuance. So those jobs are going to be very difficult to automate.
How might robot labour affect the job market?
Ken Goldberg: So here’s where I also think there’s a huge amount of fear out there, and it’s really important to reassure workers that this is not imminent and that what they do is very valuable and safe from automation. I think everyone’s been saying for years that we’re going to have lights-out factories, and that humans will just sit around and have all this leisure time — maybe even like Wall-E or something, right?
But no, we’re very far from that. The fact is that there’s so many nuances to what humans do in jobs — and AI is a whole other realm, office workers and what they do. And I think that there’s certainly many aspects of jobs that can be automated. For example, transcribing this interview is a perfect example: in the past it had to be done by a human; now you’ve got a machine to do most of it, then you tune it. You still have to fine-tune it, but you get a lot of the basic elements done. We have a lot of tools that make certain aspects of our job easier, and then, because every other aspect of our job needs more attention than we have time for, we can spend time on that.
The same is true for so many of these things we’re talking about. So gardeners, et cetera: there’ll still be a need for the gardener to be doing the more subtle things, but maybe some machine will be out there doing the lawn. We’re getting closer to automated lawn mowers. And for workers, I think certain things in the kitchen may be automated. And obviously, we have dishwashers — we have certain automation that we’ve had for a long time already. But that doesn’t mean we don’t need human workers. So I think we’re going to need them for things that are much more subtle.
So I’m sort of optimistic about the job market. I think this demographic shift is the biggest factor: we have a shortage of human workers, people who are of working age. So I don’t think there’s going to be the kind of unemployment that people are talking about or fearing for the foreseeable future.
Luisa Rodriguez: Yeah. If in, let’s say, 10 or 20 years, robots are super widespread and are causing real job displacement, what do you think played out differently to how you expected?
Ken Goldberg: OK, so let’s say one of these breakthroughs we’re talking about happens, and all of a sudden then robots are capable of learning from YouTube videos and repeating anything they watch. Or maybe you demonstrate, this is how I want to chop these vegetables, and now it’s able to repeat that reliably. So if we got those breakthroughs, then you could imagine that you’d have these robots. Another factor we haven’t even touched on, which is not as interesting, is just the cost, the financials of doing these. But let’s say that gets finessed, too.
So now all of a sudden, you have these robots, and they’re actually pretty capable, and we’re seeing them increasingly being put to use and actually doing something useful. Then I think it will be interesting. I think that would change our perceptions of them. My own sense is that we would find new work for humans to do, that we would basically shift toward other things that are more subtle, let’s say maybe it’s healthcare and things like that. We have a shortage of humans that can do those things, and also teaching and childcare. And there’s a lot of things where we are just still shorthanded. So I think that people will find new jobs, but some of these things might be automated.
I guess the extreme form of this is that you have a robot that can do anything that a human can do and you just have them doing it all. And then what? We hang out, we can spend time playing music and writing poetry and doing all the fun stuff. And it’s an interesting prospect. Maybe we’ll drive ourselves crazy because we’ll have so much free time, you know?