Last week I benchmarked my open-source Deep Research tool against ChatGPT, Perplexity, and Gemini. I caught OpenAI fabricating 4-5 citations, and Gemini understated real hazard ratios by 30-40%.
So I built ASK Mode: every answer is automatically verified against a second round of sources, and each claim is marked [OK], [??], or [NO].
- ~400 verified answers for $1
- 2-3 minutes per query
- No RLHF nannying: it answers what you ask
- Full verification report with every response
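To give a feel for what a per-claim verification report could look like, here is a minimal Python sketch. It is not taken from the repo: the names `Verdict`, `CheckedClaim`, `judge`, and `render_report` are hypothetical, the source counts are placeholders, and the meanings assigned to [OK], [??], and [NO] (supported, unclear, contradicted) are my assumption about the markers.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    # Assumed meanings of the three markers; the post does not define them.
    OK = "[OK]"      # second-round sources support the claim
    UNSURE = "[??]"  # no clear support or contradiction
    NO = "[NO]"      # second-round sources contradict the claim


@dataclass
class CheckedClaim:
    text: str
    supporting: int     # hypothetical count of sources backing the claim
    contradicting: int  # hypothetical count of sources disputing it


def judge(claim: CheckedClaim) -> Verdict:
    """Toy majority rule for illustration, not the project's actual logic."""
    if claim.contradicting > claim.supporting:
        return Verdict.NO
    if claim.supporting > claim.contradicting:
        return Verdict.OK
    return Verdict.UNSURE


def render_report(claims: list[CheckedClaim]) -> str:
    """Return one line per claim, prefixed with its marker."""
    return "\n".join(f"{judge(c).value} {c.text}" for c in claims)


if __name__ == "__main__":
    demo = [
        CheckedClaim("Claim A", supporting=3, contradicting=0),
        CheckedClaim("Claim B", supporting=1, contradicting=1),
        CheckedClaim("Claim C", supporting=0, contradicting=2),
    ]
    print(render_report(demo))
```

A real verifier would need to extract claims from the answer and match them against retrieved sources; the sketch only covers the final labeling and formatting step.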
The gap between chat (unverified, stale training data) and deep research (20+ minutes) needed filling.
Benchmark proof: https://veritas--test-neocities-org.translate.goog/?_x_tr_sl...
GitHub: https://github.com/IamLumae/Project-Lutum-Veritas