Can Large Vision-Language Models Correct Grounding Errors By Themselves?

Publication
CVPR 2025