Agent-evals: Overlap, boundary, and metacognitive scoring for coding agents (thinkwright.ai)
1 point by oceanwaves 4 months ago