Heng Yang (@hankyang94)
2024-10-14 | ❤️ 118 | 🔁 23
Robots in our group are finally moving! Excited to share the first RobotLearning project from my group, led by Han Qi and @hcy1n:
Control-oriented Clustering of Visual Latent Representation
In classical robotics, “perception” and “control” are separated with an explicit state representation.
In end-to-end vision-based control, what’s going on in the visual latent representation space?
-
Focusing on the planar pushing task, we found an elegant geometric structure in the visual latent space: visual features are clustered according to “control-oriented” goals. Moreover, such clustering has a rigid structure known as “neural collapse” (NC, first found in the image classification literature).
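For readers who want to probe this kind of structure themselves, here is a minimal sketch of two standard neural-collapse diagnostics from the classification literature: the ratio of within-cluster to between-cluster variability, and the pairwise cosines between globally centered cluster means. The function name `neural_collapse_stats` and the integer cluster labels are hypothetical; how the "control-oriented" labels are assigned is not specified here, and this is not necessarily the metric used in the project.

```python
import numpy as np

def neural_collapse_stats(features, labels):
    """Illustrative NC diagnostics (hypothetical helper, not the project's code).

    features: (N, D) array of visual latent features.
    labels:   (N,) integer array of cluster labels (assumed to be given).
    Returns (within/between variability ratio, off-diagonal cosines of
    the globally centered cluster means).
    """
    classes = np.unique(labels)
    global_mean = features.mean(axis=0)
    # One mean vector per cluster, shape (K, D).
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    centered = means - global_mean

    # Within-cluster variability: average squared distance to the own-cluster mean.
    within = np.mean([
        np.sum((features[labels == c] - means[i]) ** 2, axis=1).mean()
        for i, c in enumerate(classes)
    ])
    # Between-cluster variability: average squared norm of the centered means.
    between = np.mean(np.sum(centered ** 2, axis=1))

    # Pairwise cosines between centered cluster means.
    unit = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    cosines = unit @ unit.T
    off_diag = cosines[~np.eye(len(classes), dtype=bool)]
    return within / between, off_diag
```

Under perfect collapse with K balanced clusters, the ratio tends to 0 and every off-diagonal cosine approaches -1/(K-1), which is the simplex equiangular tight frame geometry usually associated with neural collapse.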
-
How does this understanding help? We show that pretraining the vision encoder with control-oriented clustering significantly increases test-time performance in the low-data regime (e.g., 50 or 100 demos). For example, with 100 demos, the baseline succeeds 5 out of 10 times, while the NC-pretrained policy succeeds 8 out of 10 times.
Check out more about the project: https://computationalrobotics.seas.harvard.edu/ControlOriented_NC/