By Matthew Loper, Ph.D. Student
As a roboticist, the question I hear most often is, “where is my robot?” People are fascinated by the humanoid mechanical golems that roam across their TV screens, and in their eyes, it’s my duty to provide one.
I could answer their question in three ways. I could tell them to buy a robotic vacuum, which cleans floors but offers no human interaction. I could suggest a hobbyist kit that burdens them with implementing (or inventing!) vision and localization algorithms. Or, if pressed, I could admit that their robot is still in a research laboratory.
And that’s the real problem: there is a large gap between commercially-available robots and those in research settings. Robots in the lab can recognize faces, follow people, spot gestures, parse speech, and learn simple tasks from demonstration. Robots in the home are toys by comparison. The problem is not a lack of effort; companies like iRobot (maker of the Roomba robotic vacuum) and Ugobe (maker of the Pleo toy) are at the forefront of consumer robotics, and researchers would like nothing better than to see their creations in every home. But the gap remains, and it’s worth asking why.
The issue with smart robots designed for the lab is that many can’t work anywhere else. We need robotic solutions that follow the write-once, run-anywhere principle of the Java programming language. But instead of an abstraction layer that is software-to-software, we need one that is software-to-world. Both perception and manipulation have to be robust to changing environments. It’s a daunting task for any roboticist.
As part of the Brown Robotics group, I’ve recently had the privilege of working with iRobot on this problem. Their “PackBot” platform is physically robust: it can run indoors or outdoors and can even climb stairs. But it has traditionally been remote-controlled (a.k.a. “teleoperated”). We want to move away from teleoperation, toward hands-free teamwork between a person and a robot. For us, this meant building perception mechanisms that allow person-following, gesture recognition, and voice-based operation under varying environmental conditions (such as lighting and terrain).
Good perception starts at the sensors, and we chose to do away with one of the most popular sensors in robotics: the color camera. Any camera fan is familiar with the many issues of white balance, F-stops, and unwieldy specular highlights. You are at the mercy of the lighting available to you—unless you control the light yourself.
Our chosen sensor does exactly that. By projecting light that is invisible to the human eye onto a scene and measuring the phase of the light that returns, a CSEM SwissRanger gives us a distance (instead of a color) at every pixel of our image. Rather than something like a color photograph, we get something more like a 3D relief. This frees us from any dependency on color.
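To make that concrete, here is a minimal sketch of how a phase-shift time-of-flight measurement turns into a depth value. The modulation frequency and the 176x144 frame size are assumptions based on SwissRanger-class sensors, and the function name is mine, not part of any sensor API.

```python
import numpy as np

C = 299_792_458.0   # speed of light (m/s)
F_MOD = 20e6        # assumed modulation frequency; ~20 MHz gives roughly 7.5 m of range

def phase_to_distance(phase):
    """Map a per-pixel phase shift (radians, 0 to 2*pi) to distance in meters.

    The sensor emits amplitude-modulated light; the phase lag of the reflection
    encodes the round-trip travel time, so distance is proportional to phase.
    """
    unambiguous_range = C / (2.0 * F_MOD)          # distance at which the phase wraps around
    return (np.asarray(phase) / (2.0 * np.pi)) * unambiguous_range

# A full frame of phase measurements becomes a full frame of depths:
depth_image = phase_to_distance(np.random.uniform(0, 2 * np.pi, size=(144, 176)))
```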
An important part of our work is how we balance the two input channels: speech and gestures should each be used according to context. Speech works well when a person isn’t in the robot’s direct line of sight, but fails when the robot is moving (due to motor noise). Gestures work better for managing the robot’s following behavior. Using sensors according to their strengths is essential for effective interaction.
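As a rough illustration of that context-dependence (the rule and the names below are hypothetical, not the arbitration logic our system actually uses), one could gate each modality on what the sensors can currently deliver:

```python
from dataclasses import dataclass

@dataclass
class Context:
    person_in_view: bool   # is the person inside the depth sensor's field of view?
    robot_moving: bool     # motor noise degrades speech recognition while driving

def usable_modalities(ctx):
    """Prefer gestures whenever the person is visible; accept speech only when
    the robot is stationary, so motor noise doesn't drown out the microphone."""
    modalities = []
    if ctx.person_in_view:
        modalities.append("gesture")
    if not ctx.robot_moving:
        modalities.append("speech")
    return modalities

print(usable_modalities(Context(person_in_view=False, robot_moving=False)))  # ['speech']
print(usable_modalities(Context(person_in_view=True, robot_moving=True)))    # ['gesture']
```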
Another fundamental part of our work is the recognition pipeline itself. After segmenting our depth image into blobs and classifying some of those blobs as human silhouettes, we fit a body model to the silhouettes. By tracking the body model’s pose over time, we can infer gestures. As with much research, the devil is in the details.
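A toy version of those stages might look like the sketch below. The depth threshold, the “tall blob means person” test, and the wave heuristic are simplified stand-ins of my own, not the segmentation, classification, or body-model fitting we actually use.

```python
import numpy as np
from scipy import ndimage  # simple connected-component labeling

def detect_wave(depth_frames, max_range=5.0):
    """Threshold each depth frame, label connected blobs, keep blobs with roughly
    human proportions, track each blob's topmost point over time, and report a
    'wave' gesture if that point moves around horizontally."""
    tops = []
    for depth in depth_frames:
        foreground = depth < max_range                 # 1. segment: keep nearby pixels
        blobs, n = ndimage.label(foreground)           #    and group them into blobs
        for i in range(1, n + 1):
            ys, xs = np.nonzero(blobs == i)
            height = ys.max() - ys.min() + 1
            width = xs.max() - xs.min() + 1
            if height > 2 * width and ys.size > 200:   # 2. classify: tall, large blob ~ person
                tops.append(xs[ys.argmin()])           # 3. crude "body model": topmost pixel
    # 4. temporal inference: horizontal movement of the topmost point suggests a wave
    return len(tops) > 3 and float(np.std(tops)) > 5.0
```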
But the most exciting thing for users is that the system as a whole works indoors and outdoors (mostly—direct sunlight is a problem), in hallways and open spaces, and even in complete darkness. We’ve worked hard to bring that robot out of the lab, so that eventually its grandchild might make it into your living room. I don’t have a good answer to the question, “where is my robot?” but maybe the right question isn’t “where?” but “when?” We’re working on that.
(Work done in collaboration with Nate Koenig, Sonia Chernova, Chad Jenkins and Chris Jones)