Mirror, Mirror on the Wall: Can VLM Agents Tell Who They Are at All?

arXiv:2605.08816v1 (announce type: new)

Abstract: In the animal kingdom, mirror self-recognition is a canonical probe of higher-order cognition, emerging only in some species. We ask whether an analogous functional capability emerges in embodied vision-language model (VLM) agents: can they recognize themselves in a mirror? We introduce a controlled 3D benchmark where a first-person VLM agent must infer a hidden body attribute from its reflection and select the matching target, while avoiding…

cs.AI updates on arXiv.org · May 12
