Stitching Vision Encoders into LLMs: Clip vs. I-JEPA vs. ViT Comparison (teendifferent.substack.com)
2 points by teendifferent 4 months ago