The editor of Downcodes has learned that NVIDIA's research team has developed a new neural network called HOVER (Humanoid Versatile Controller). The network has only 1.5 million parameters, yet it can coordinate both the locomotion and the manipulation of humanoid robots. Its efficient training method and broad capabilities have drawn attention, and its arrival marks a significant step forward in humanoid robot control, opening up new possibilities for future robotics.
Jim Fan, senior research manager at NVIDIA, said: "Not all foundation models need to be huge. The 1.5M-parameter neural network we trained is designed to control the body of a humanoid robot." He further explained that HOVER captures the subconscious processing behind human movement, so that robots can perform complex tasks without cumbersome programming. He noted, "Humans require a lot of subconscious processing when they walk, maintain balance, and flexibly control their limbs."
HOVER was trained in NVIDIA's Isaac simulation platform, which can run physics simulation 10,000 times faster than real time.
Jim Fan revealed that the model received the equivalent of one year of training in the virtual environment, which took only about 50 minutes of real time on a single GPU. He said that a network trained this way transfers smoothly to real-world robots without the need for fine-tuning.
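The two figures quoted above (a 10,000x simulation speedup and roughly 50 minutes for one simulated year) can be checked against each other with simple arithmetic. The snippet below is only a back-of-the-envelope sketch; the function name and the 365-day year are illustrative assumptions, not anything taken from NVIDIA's tooling.

```python
# Back-of-the-envelope check: how long does one simulated year take
# in wall-clock time if simulation runs 10,000x faster than real time?
# Assumption: a 365-day year; names here are illustrative only.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 simulated minutes
SPEEDUP = 10_000                  # speedup factor quoted in the article


def wall_clock_minutes(simulated_minutes: float, speedup: float) -> float:
    """Convert simulated time to real (wall-clock) time."""
    return simulated_minutes / speedup


minutes = wall_clock_minutes(MINUTES_PER_YEAR, SPEEDUP)
print(f"{minutes:.1f} minutes")  # 52.6 minutes
```

The result, about 52.6 minutes, is in line with the "about 50 minutes" figure Fan cited.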
HOVER can respond to a variety of high-level motion commands: head and hand poses from XR devices (such as Apple's Vision Pro), full-body poses from motion capture or RGB cameras, joint angles from exoskeletons, and root velocity commands from joysticks. Fan emphasized that HOVER provides a unified interface for controlling robots with different input devices, which makes it easier to collect teleoperation data for training.
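To make the idea of a unified interface more concrete, here is a minimal sketch of how commands from different devices could be packed into one fixed-size vector plus a mask marking which entries are active. This is an assumption for illustration only: the `MotionCommand` class, the `pack` function, the field names, and the slot sizes are all hypothetical and are not taken from HOVER's code or paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class MotionCommand:
    """Hypothetical command container; each device fills only some fields."""
    head_hand_pose: Optional[Sequence[float]] = None  # e.g. from an XR headset
    full_body_pose: Optional[Sequence[float]] = None  # e.g. from mocap or an RGB camera
    joint_angles: Optional[Sequence[float]] = None    # e.g. from an exoskeleton
    root_velocity: Optional[Sequence[float]] = None   # e.g. from a joystick


# Illustrative slot sizes only; real dimensions depend on the robot.
SLOT_SIZES = {
    "head_hand_pose": 9,
    "full_body_pose": 12,
    "joint_angles": 19,
    "root_velocity": 3,
}


def pack(command: MotionCommand) -> Tuple[List[float], List[int]]:
    """Flatten a command into a fixed-size vector and a 0/1 mask.

    Missing fields are zero-filled and masked out, so every input
    device produces a vector of the same length.
    """
    values: List[float] = []
    mask: List[int] = []
    for name, size in SLOT_SIZES.items():
        data = getattr(command, name)
        if data is None:
            values.extend([0.0] * size)
            mask.extend([0] * size)
        else:
            if len(data) != size:
                raise ValueError(f"{name} expects {size} values, got {len(data)}")
            values.extend(float(x) for x in data)
            mask.extend([1] * size)
    return values, mask


# Usage: a joystick supplies only a root velocity command.
joystick = MotionCommand(root_velocity=[0.5, 0.0, 0.1])
values, mask = pack(joystick)
print(len(values), sum(mask))  # 43 3
```

In this sketch, a downstream controller would always see the same input shape regardless of the device, which is one plausible way to read the "unified interface" Fan describes.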
In addition, HOVER can be paired with an upstream vision-language-action (VLA) model, converting that model's motion commands into low-level motor signals at high frequency. HOVER is compatible with any humanoid robot that can be simulated in Isaac, allowing users to easily bring the robot to life.
Earlier this year, NVIDIA also announced a project called GR00T, a general-purpose foundation model designed for humanoid robots. Robots driven by GR00T (Generalist Robot 00 Technology) can understand natural language and imitate movements by observing human actions, allowing them to quickly learn coordination, dexterity, and other skills needed to interact effectively in the real world.
Paper URL: https://arxiv.org/pdf/2410.21229
The arrival of HOVER brings new hope to the field of humanoid robot control. Its efficient training method and versatile command interface suggest that future robots will be more intelligent and more human-like in how they move. This breakthrough could greatly promote the application of humanoid robots across many fields. We look forward to more exciting developments in the future!