I built an MCP (Model Context Protocol) server that wraps the Grok CLI instead of calling the API directly. It provides three tools for Claude Code and other MCP clients: simple queries, multi-turn chat, and code generation. This allows me to use Grok models in Claude Code.
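For anyone curious what "wrapping the CLI" looks like in practice, here is a minimal sketch of the core pattern, not the project's actual code. It assumes the CLI takes the prompt as its last argument and writes the answer to stdout. The names `run_cli`, `CliError`, and `FAKE_CLI` are hypothetical, and a Python one-liner stands in for the Grok CLI so the snippet runs without it installed.

```python
import subprocess
import sys
from typing import Sequence


class CliError(RuntimeError):
    """Raised when the wrapped CLI exits with a non-zero status."""


def run_cli(base_cmd: Sequence[str], prompt: str, timeout: float = 60.0) -> str:
    """Run the CLI once with the prompt as the final argument; return its stdout."""
    result = subprocess.run(
        [*base_cmd, prompt],   # argument list, so no shell quoting issues
        capture_output=True,   # collect stdout and stderr instead of inheriting them
        text=True,             # decode output to str
        timeout=timeout,       # raises subprocess.TimeoutExpired if the CLI hangs
    )
    if result.returncode != 0:
        raise CliError(result.stderr.strip() or f"exit code {result.returncode}")
    return result.stdout.strip()


# Stand-in "CLI" (assumption): a Python one-liner that upper-cases its argument.
# A real server would put the Grok CLI's actual command and flags here instead.
FAKE_CLI = [sys.executable, "-c", "import sys; print(sys.argv[1].upper())"]

print(run_cli(FAKE_CLI, "hello"))  # prints "HELLO"
```

Each MCP tool then becomes a thin function around a helper like this, registered with whatever tool decorator the MCP framework provides.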
Why wrap the CLI instead of using the API?
The CLI wrapper approach has some interesting benefits:
• Future-proof: When xAI adds OAuth to the CLI, the wrapper picks it up automatically without code changes
• Pricing flexibility: If xAI introduces fixed monthly pricing for CLI usage (like OpenAI's $20/month for Codex), you benefit immediately rather than paying per token
• Simpler maintenance: ~400 lines vs 1500+ for a full API client, with no API versions or breaking changes to track.
• Org-friendly: Some teams prefer audited CLI tools over API libraries for security/compliance
Honest tradeoffs:
• Performance overhead: ~50-200ms extra latency from process spawning
• CLI dependency: Requires Grok CLI installed
• Limited control: Can't access features not exposed by CLI
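If you want to check the spawn-overhead figure on your own machine, a rough sketch like the one below times how long it takes to start a process and wait for it to exit. The helper name `measure_spawn_ms` is hypothetical, and the baseline command is just a Python interpreter that exits immediately. Swapping in a cheap, network-free invocation of the CLI you are wrapping would give a number closer to the real overhead.

```python
import statistics
import subprocess
import sys
import time
from typing import Sequence


def measure_spawn_ms(cmd: Sequence[str], runs: int = 5) -> float:
    """Median wall-clock time in milliseconds to start `cmd` and wait for it to exit."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, capture_output=True, check=True)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Baseline (assumption): an interpreter that starts and exits right away.
baseline_cmd = [sys.executable, "-c", "pass"]
print(f"median spawn time: {measure_spawn_ms(baseline_cmd):.1f} ms")
```

The median is used rather than the mean because the first run is often slower while the OS warms its file cache.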
The sweet spot is development tools, prototyping, and internal automation where convenience matters more than milliseconds. For high-throughput production (>1000 req/min), direct API makes more sense.
Built with FastMCP, fully open source (MIT), and published to PyPI.
GitHub: https://github.com/BasisSetVentures/grok-cli-mcp
Curious to hear if others have taken similar "wrapper over direct integration" approaches for other AI CLIs.