Turns out the only way, currently, to find out at scale whether LLMs cite you is to reverse engineer the prompts users might be asking about your brand and track the ChatGPT/Gemini/etc. responses to them.
I've built a pipeline that:
1. crawls your website and your competitors' websites
2. analyzes top search keywords from search engines
3. retrieves similar conversations from WildChat (~1M ChatGPT conversations)
4. generates the most likely user prompts for your brand
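
To make step 3 concrete, here is a minimal sketch of what "retrieve similar conversations" can look like. It is illustrative only and not the repo's actual code: `tokenize`, `cosine`, and `retrieve_similar` are made-up names, the conversations are assumed to be already loaded as plain strings, and plain bag-of-words cosine similarity stands in for whatever retrieval method you prefer (embeddings would be the obvious upgrade).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_similar(query, conversations, top_k=3):
    """Return up to top_k conversations ranked by similarity to the query.

    Assumption: `conversations` is a list of plain strings, e.g. the first
    user turn of each conversation. Conversations with zero overlap are dropped.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c))), c) for c in conversations]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:top_k] if score > 0]


if __name__ == "__main__":
    # Toy data standing in for real conversations.
    conversations = [
        "what is the best crm for a small business",
        "how do I bake sourdough bread",
        "compare crm tools for startups",
    ]
    for match in retrieve_similar("best crm software", conversations, top_k=2):
        print(match)
```

The query here would come from steps 1 and 2 (crawled page text or a top search keyword), and the retrieved conversations then serve as style examples for the prompt generation in step 4.
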
The result is a set of prompts that users are likely to ask about your company or your products. Tracking them shows where LLMs suggest you and where they reference your competition.
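
The tracking part boils down to something like the sketch below. Again, this is a simplified illustration rather than the repo's code: `count_mentions` and `mention_share` are hypothetical helpers, the brand names are placeholders, and the LLM responses are assumed to be already collected as strings (the API calls are left out on purpose).

```python
import re
from collections import Counter


def count_mentions(responses, brands):
    """Count how many responses mention each brand.

    Matching is case-insensitive and on whole words. Note: the \\b word
    boundary will not work for names that end in punctuation (e.g. "C++").
    """
    counts = Counter({brand: 0 for brand in brands})
    for text in responses:
        for brand in brands:
            pattern = rf"\b{re.escape(brand)}\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                counts[brand] += 1
    return counts


def mention_share(responses, brands):
    """Fraction of responses that mention each brand (0.0 if no responses)."""
    counts = count_mentions(responses, brands)
    total = len(responses)
    return {brand: (counts[brand] / total if total else 0.0) for brand in brands}


if __name__ == "__main__":
    # Placeholder responses and brand names, for illustration only.
    responses = [
        "Acme and Globex are both solid options.",
        "I would go with Globex.",
        "Neither stands out.",
    ]
    print(mention_share(responses, ["Acme", "Globex"]))
```

Running this per prompt (or per prompt cluster) rather than over everything at once is what shows where you get suggested and where a competitor does.
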
Repo:
https://github.com/syntropicsignal-ai/ai-visibility-audit
I'd love feedback on:
- whether this methodology makes sense
- alternative datasets to WildChat
- better ways to estimate prompt distributions