That is not true. The routine is preprogrammed, but there is adaptability. If there weren't, the robot would fall over within the first 5 seconds. The movement in the routine we saw requires continuous adjustment. You can't just record the movement as you would a video game animation; real physics gets in the way, and you end up flat on your back trying to do a jump or a backflip.
If you think I am wrong, sure, I could be, but have a look at Atlas: https://www.youtube.com/watch?v=oe1dke3Cf7I
The robot's motion is not preprogrammed at all. See how much smoother the motion is?
That's because Boston Dynamics uses an approach that calculates and accounts for the dynamics of motion, just like Unitree.
Kawasaki's approach is clearly to use overwhelming torques in an effort to cancel out all the dynamics and produce fully controlled movement, exactly what an old man does, or a robotic arm in a factory. Honestly, it's embarrassing: it looks like Kawasaki has made no progress in the last 30 years; their robots still move like it's 1996.
Have a look at https://underactuated.csail.mit.edu/intro.html for a more in-depth explanation of the difference between the two approaches.
There are two main ways to accomplish what the kung-fu robot does.
First, you train a reinforcement learning policy for balancing, walking, and a range of dynamic movements; then you record the movement you want to perform using motion capture; then you play the recorded trajectory back through the policy.
Second, you train a reinforcement learning policy for balancing and walking, but bake the recorded movement directly into the policy.
Okay, I lied, there is also a third way: use model predictive control, build a balancing objective by hand, and then replay the recorded trajectory. I don't think this method would be as successful for the shown choreography, but it's what Boston Dynamics did for a long time.
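The first method can be sketched in a few lines. This is a toy, not anyone's actual implementation: a hand-written PD term stands in for the trained policy, and a unit-inertia double integrator stands in for the robot; all gains and trajectory details are made up.

```python
import numpy as np

def stub_policy(joint_pos, joint_vel, ref_pos, kp=100.0, kd=20.0):
    """Stand-in for an RL policy: output joint torques that pull the
    robot toward the current reference frame while damping velocity.
    A trained policy would also handle balance and contacts."""
    return kp * (ref_pos - joint_pos) - kd * joint_vel

def play_back(reference, dt=0.01):
    """Replay a recorded trajectory through the policy on a toy
    'robot' with unit inertia per joint."""
    pos = reference[0].copy()
    vel = np.zeros_like(pos)
    for ref in reference:
        tau = stub_policy(pos, vel, ref)  # policy queried every step
        vel += tau * dt                   # integrate toy dynamics
        pos += vel * dt
    return pos

# A 2-joint "choreography": ramp both joints to 1.0 rad, then hold.
ramp = np.linspace([0.0, 0.0], [1.0, 1.0], 300)
hold = np.tile([1.0, 1.0], (100, 1))
final = play_back(np.vstack([ramp, hold]))
```

The key point is that the policy is queried at every control step with the live state, so it can correct disturbances; a pure animation playback has no such feedback loop.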
In all of these cases you will still be limited to a pre-recorded task description. Is this really that hard to understand? Do you really think someone taught the robot the choreography in Chinese, or by performing the movements in front of the humanoid's camera the way you would teach a real human? Or that the robot came up with the choreography on its own? Because that's the conclusion you have to draw if you deny the methods I described above.
What you do is map the dynamics of your system and solve them; that solution is a program that can produce torque inputs at the joints to move the system the way you want.
You then create a sequence of desirable intermediate and end states, and the program does its best to achieve them.
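A minimal sketch of that recipe for a single pendulum joint, assuming a known inertia-plus-gravity model (this is the classic computed-torque idea; every parameter here is illustrative):

```python
import numpy as np

I, M, L, G = 0.1, 1.0, 0.5, 9.81   # inertia, mass, length, gravity

def computed_torque(q, qd, q_des, kp=40.0, kd=12.0):
    """The 'solved dynamics' as a program: command a feedback
    acceleration toward the desired state, plus exact gravity
    compensation from the pendulum model."""
    acc_cmd = kp * (q_des - q) - kd * qd
    return I * acc_cmd + M * G * L * np.sin(q)  # model-based torque

def run(waypoints, steps_per=200, dt=0.005):
    """Drive the joint through a sequence of intermediate states."""
    q, qd = 0.0, 0.0
    for q_des in waypoints:
        for _ in range(steps_per):
            tau = computed_torque(q, qd, q_des)
            qdd = (tau - M * G * L * np.sin(q)) / I  # true dynamics
            qd += qdd * dt
            q += qd * dt
    return q

final_angle = run([0.5, 1.2, 0.8])  # intermediate and end states
```

Because the controller knows the model, the torque exactly cancels gravity and the closed loop reduces to a well-damped servo toward each waypoint.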
The difference between Atlas and the Kawasaki robots is that, to achieve those states, the Kawasaki robots use a program that attempts to stop all inertial rotations and movements in order to maintain full control of the robot's movements at all times.
Atlas and the Chinese robots, by contrast, leverage inertia and gravity to achieve their movements. Again, you do that by solving a large set of equations; no ML required.
The GP described a system of prerecorded motions, like a video game animation. If you try that, with no controller adjusting to the real-time environment, you just tip over and keep playing the prerecorded motions. We saw that with the Russian robot last year.
You can use a real human performing the choreography as a way of capturing the desired intermediate states; that is the step that might require ML.
I think this might no longer be true. I don't think this year's dance routine would have been possible without RL, given how crappy robots were 2-3 years ago.
The actual concern here is that there are too many cuts. If the whole table-moving sequence were uncut and fully autonomous, it would mean they have the most advanced humanoid robot software in the world.
It would mean the robot can autonomously find the correct grasping locations on the table for both arms, which requires it to have a model of the table. The robot needs to know at what height to hold the table to keep it level, compensate for the human pulling on it while keeping its own balance, and autonomously follow the direction the human is pulling.
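For the "compensate for the human pulling" part specifically, one standard tool is admittance control: the force sensed at the hands is mapped to a commanded velocity, so the robot yields along the pull while a stiffness term holds the table at a level target height. A toy sketch with made-up gains:

```python
import numpy as np

def admittance_step(hand_pos, force, target_height,
                    damping=20.0, stiffness=50.0, dt=0.01):
    """Map sensed force to motion: yield along the pull (v = F/d),
    plus a vertical spring holding the hands at target height."""
    vel = force / damping
    vel[2] += stiffness * (target_height - hand_pos[2]) / damping
    return hand_pos + vel * dt

# Human pulls the table with 5 N in +x for 2 seconds.
pos = np.array([0.0, 0.0, 0.9])        # hands start at 0.9 m height
pull = np.array([5.0, 0.0, 0.0])
for _ in range(200):
    pos = admittance_step(pos, pull, target_height=0.9)
# Hands drift along the pull direction; height stays at 0.9 m.
```

This only covers the hand motion; the hard part on a humanoid is doing this while the whole-body balance controller absorbs the reaction forces.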
Of course, since there were many cuts, we don't really know whether that's true. We also don't know if teleoperation is involved or not.
The Chinese robot dancing is cool because it shows what the hardware is capable of, but it doesn't really show anything on the software side. Contact with objects is hard in robotics, and the kung-fu choreography avoids it for obvious reasons.
What remains is all those quirky little one-off processes that aren't very amenable to "robot arm" automation, aren't worth the process-design effort to make them amenable to it, and are currently solved by human labor.
Thus, you design new solutions to target that open niche.
Humans aren't perfect at anything, but they are passable at everything. Universal worker robots attempt to replicate that.
"A drop-in replacement for simple human labor" is a very lucrative thing, assuming one could pull it off. And that favors humanoid hulls.
Not that the form factor is the bottleneck there, not really. The problem of universal robots is fundamentally an AI problem. Today we could build a humanoid body mechanically capable of over 90% of the industrial tasks performed by humans, but not the AI that would actually make it do them.
The success of large multipurpose AI models trained on web-scale data pushed a lot of people towards "cracking general purpose robot AI might be possible within a decade".
Whether transfer learning from human VR/teleop data is the best way to do it remains uncertain; there are many approaches to training and data collection. That said, transfer learning from web-scale data, teleoperation, and "RL IRL" are all common, usually at different ends of the training pipeline.
Tesla got the memo earlier than most, because Musk is a mad bleeding edge technology demon, but many others followed shortly before or during the public 2022 AI boom.
And that reframes the economics entirely. You don't need the robot to be better than a human at any given task. You need the total cost of ownership to be lower than the combined cost of salary, benefits, turnover, and training. That's a much easier bar to clear once the AI catches up to the body.
The interesting question is whether the AI problem gets solved generally (one model that can do everything) or whether we end up with task-specific AI in a general-purpose body: basically the robot arm paradigm wearing a humanoid suit.
If you can get 5 specialist models that use the same robot body, you can also get one generalist model with more capacity and fold the specialists into it. If you have the in-house training pipeline that made those specialists, apply it to the generalist instead, the way we give general-purpose AIs coding-specific training. If you don't, take the specialists as-is and distill from them.
If you do it right, transfer learning might even give you a model that generalizes better and beats the specialists at their own game, because your "special" tasks have partial subtask overlap, which gets you stronger training on the shared parts and more diverse environments. Robotics AI is training-data starved as a rule.
It's the same lesson we learned with LLM specialists: invest in a specialist model and watch the next generation of generalists, with better data and training, crush it.
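The "fold the specialists into a generalist" step can be illustrated with toy linear policies: collect state-action pairs from each specialist, then fit one task-conditioned student on the pooled data. Real systems distill neural networks, but the recipe is the same; all policies and numbers here are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical specialist policies: action = W @ state.
specialists = [np.array([[1.0, 0.0], [0.0, 2.0]]),   # "grasping"
               np.array([[0.5, 1.0], [1.0, 0.0]])]   # "walking"

def task_input(state, task_id, n_tasks=2):
    """Task-conditioned student input: the extra capacity that lets
    one model represent every specialist."""
    x = np.zeros(2 * n_tasks)
    x[2 * task_id: 2 * task_id + 2] = state
    return x

# Distillation dataset: query each specialist on its own states.
X, Y = [], []
for task_id, W in enumerate(specialists):
    for _ in range(200):
        s = rng.normal(size=2)
        X.append(task_input(s, task_id))
        Y.append(W @ s)                    # teacher's action = label

# One generalist student fit to imitate all teachers at once.
W_student, *_ = np.linalg.lstsq(np.array(X), np.array(Y), rcond=None)
```

With noiseless linear teachers the student recovers both specialists exactly; with neural networks you get an approximation, which is where the extra capacity and shared-subtask transfer matter.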
What about doing dishes? That could be done with one arm. Maybe not easily or economically yet, but it could be.
There is plenty that has not been seen through.
Laundry folding machines are not in wide distribution.
Robots to put away laundry?
Etc. There are lots of mundane tasks like these.
It's the flexibility and adaptability with minimal training that's required.
https://youtu.be/SRZ9E48B6aM?si=K_wwvu97agBZpFTa