Thudm/AgentBench: A Comprehensive Benchmark to Evaluate LLMs as Agents (github.com/THUDM)
1 point | freediver | 3 years ago