Abstract
HomeRobot (noun): An affordable compliant robot that navigates homes and manipulates a wide range of objects in order to complete everyday tasks.
Open-Vocabulary Mobile Manipulation (OVMM) is a foundational challenge for robotics research because it requires tackling and integrating four key capabilities: perception, language understanding, navigation, and manipulation, all of which robots will need in order to be useful assistants in human environments. To drive research in this area, we introduce the HomeRobot OVMM challenge, where an agent navigates household environments to grasp novel objects and place them on target receptacles. HomeRobot has two components: a simulation component, which uses a large and diverse curated object set in new, high-quality multi-room home environments; and a real-world component, where we provide a software stack for the low-cost Hello Robot Stretch to encourage replication of real-world experiments across labs. We implement both a reinforcement learning baseline and a heuristic (model-based) baseline and show evidence of sim-to-real transfer.
Real-world success cases
Sim success cases
Analysis: Comparing baselines
– Perception: Ground-truth vs. DETIC
Instruction: “Move the cell phone from the chest of drawers to the counter panel.”
Conclusion: As expected, access to ground-truth (GT) semantics improves results.
– Finding object: RL vs. Heuristic policy
Instruction: “Move the teapot from the cabinet to the chair.”
Conclusion: The RL policy appears to be better at finding the object than the heuristic policy.
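Both instructions above follow the same pattern, "Move the (object) from the (start receptacle) to the (goal receptacle)". As a minimal sketch of how such an instruction could be split into its three parts, the snippet below uses a regular expression. The names `OVMMGoal` and `parse_instruction` are hypothetical and chosen for illustration; they are not part of the HomeRobot software stack, and the baselines may represent goals differently.

```python
import re
from typing import NamedTuple, Optional


class OVMMGoal(NamedTuple):
    """Hypothetical container for the three parts of an instruction."""
    object_name: str
    start_receptacle: str
    goal_receptacle: str


# Matches "Move [the] X from [the] Y to [the] Z[.]", case-insensitively.
# Assumption: X does not contain " from " and Y does not contain " to ".
_PATTERN = re.compile(
    r"^\s*move\s+(?:the\s+)?(?P<obj>.+?)"
    r"\s+from\s+(?:the\s+)?(?P<start>.+?)"
    r"\s+to\s+(?:the\s+)?(?P<goal>.+?)\s*\.?\s*$",
    re.IGNORECASE,
)


def parse_instruction(text: str) -> Optional[OVMMGoal]:
    """Return the parsed goal, or None if the text does not fit the pattern."""
    # Drop surrounding whitespace and straight or curly double quotes.
    cleaned = text.strip().strip("“”\"")
    match = _PATTERN.match(cleaned)
    if match is None:
        return None
    return OVMMGoal(match["obj"], match["start"], match["goal"])


if __name__ == "__main__":
    print(parse_instruction(
        "Move the cell phone from the chest of drawers to the counter panel."
    ))
    print(parse_instruction("Move the teapot from the cabinet to the chair"))
```

A pattern this simple only covers instructions phrased exactly like the two examples; object or receptacle names that themselves contain "from" or "to" would need a more careful parser.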