V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs (github.com/penghao-wu)
2 points | saeedesmaili | 2 years ago