HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
1.
▲
Replace OCR with Vision Language Models
github.com/vlm-run
125 comments
a year ago
EarlyOom
292 points
2.
▲
Run structured extraction on documents/images locally with Ollama and Pydantic
github.com/vlm-run
29 comments
a year ago
EarlyOom
170 points
3.
▲
A Node.js SDK for calling Vision Language Models
github.com/vlm-run
discuss
a year ago
EarlyOom
6 points
4.
▲
Unified Vision-Language Agents – Detect, Segment, OCR, Generate and More
github.com/vlm-run
1 comment
7 months ago
fzysingularity
5 points
5.
▲
Show HN: Visually parse an entire YouTube video frame by frame
github.com/vlm-run
discuss
a year ago
EarlyOom
5 points
6.
▲
Vlms-zero-to-hero: readings from the fundamentals to the cutting edge of VLMS
github.com/SkalskiP
discuss
a year ago
swyx
2 points
7.
▲
Experimental Optical Encoder for Qwen3-VLM-2B-Instruct
github.com/Volkopat
1 comment
8 months ago
volkopat2
1 point
8.
▲
Asn1c: The Lionet ASN.1 Compiler
github.com/vlm
discuss
8 months ago
fanf2
1 point
9.
▲
Show HN: r1_vlm – Open-Source Framework for Visual Reasoning with GRPO
github.com/groundlight
8 comments
a year ago
skumar17
5 points
10.
▲
Mlx-VLM: Fast Local VLMs and Omni Models on Apple Silicon with MLX
github.com/Blaizzy
discuss
3 months ago
salkahfi
2 points
11.
▲
Show HN: I achieved over 10% improvement on 3D vision PointCLIP
github.com/genji970
discuss
a year ago
genji970
2 points
12.
▲
Show HN: 2500 vision benchmarks / evals for Vision Language Models
github.com/Overshoot-ai
discuss
2 months ago
zakariaelhjouji
1 point
13.
▲
Show HN: Vlm in 3D PC, 16 shot scanobjectnn top1 acc: 99.91
github.com/genji970
discuss
a year ago
genji970
1 point
14.
▲
Show HN: VLMs Can Respond Twice as Fast Without Losing Quality
github.com/sergey-automation
1 comment
2 days ago
trykhlieb
2 points
15.
▲
Super fast and accurate image classification on edge devices
github.com/Paulescu
discuss
9 months ago
PauLabartaBajo
1 point
16.
▲
Show HN: Benchmarking VLMs vs. Traditional OCR
getomni.ai
40 comments
a year ago
themanmaran
146 points
17.
▲
Show HN: LoongForge-A high-performance training framework for LLM, VLM, VLA, Wan
github.com/baidu-baige
2 comments
a month ago
mindzzz
10 points
18.
▲
Show HN: Cursed Browser – a VLM reads the HTML and hallucinates the page
github.com/scosman
1 comment
a month ago
scosman
7 points
19.
▲
Cursed_browser: Web browser with a VLM as rendering engine
github.com/scosman
discuss
a month ago
misterdata
4 points
20.
▲
SketchVLM: Letting VLMs draw on images while explaining their reasoning
github.com/Brandon-Collins7
1 comment
2 months ago
taesiri
3 points
21.
▲
Show HN: Unsiloed Chunker – VLM powered semantic chunking for RAG
github.com/Unsiloed-AI
discuss
a year ago
unsiloed-ai
3 points
22.
▲
LoongForge-A high-performance training framework for LLM, VLM, DIT, VLA models
github.com/baidu-baige
discuss
a month ago
mindzzz
2 points
23.
▲
Show HN: Vision AI Checkup, an Optometrist for VLMs
visioncheckup.com
discuss
a year ago
zerojames
2 points
24.
▲
The simplest, fastest repository for training/finetuning small-sized VLMs
github.com/huggingface
discuss
a year ago
s-macke
2 points
25.
▲
Advanced Quantization Algorithm for LLMs/VLMs
github.com/intel
discuss
a year ago
XnoiVeX
2 points
26.
▲
Show HN: LLM / VLM language agent implementations
github.com/arthurcolle
discuss
a year ago
arthurcolle
2 points
27.
▲
Show HN: A VLM-powered image search engine built with Ruby on Rails
github.com/neonwatty
discuss
a year ago
neonwatty
2 points
28.
▲
Show HN: A/B test your own VLMs for document parsing (Self-hosted Arena)
github.com/Bae-ChangHyun
discuss
4 months ago
matthew624
1 point
29.
▲
Show HN: Offline AI Photo Search (local VLM and semantic search)
github.com/Pankaj4152
discuss
7 months ago
Pankaj4152
1 point
30.
▲
"Captions With Attitude" in the browser from local VLM using llama.cpp in Go
github.com/hybridgroup
discuss
7 months ago
deadprogram
1 point