HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
31.
Show HN: GEDD – A Systematic Evidence Driven LLM as a Judge Framework
github.com/aws-samples
discuss
9 days ago
balasvce2026
2 points
32.
Show HN: CoJudge – open-source, offline judge for studying LC-style problems
github.com/cojudge
discuss
8 months ago
ansliy
2 points
33.
Evaluating Large Language Models Using LLM-as-a-Judge
github.com/aws-samples
discuss
2 years ago
mooreds
2 points
34.
Coderunner – A judge for your programs, run and test your programs through Python
github.com/codeclassroom
discuss
7 years ago
bhupesh
2 points
35.
Show HN: A command line interface to UVA online judge (competitive programming)
github.com/scvalencia
discuss
10 years ago
scvalencia
2 points
36.
Show HN: Claude-relais – A plan/build/judge loop mixing Claude with Cursor
github.com/clementrog
discuss
4 months ago
crog
1 point
37.
Precision-Based Sampling of LLM Judges
sunnybak.net
discuss
a year ago
sunny-bak
1 point
38.
Show HN: Lone Arena – Self-hosted LLM human evaluation, you be the judge
github.com/Contextualist
discuss
2 years ago
Contextualist
1 point
39.
Collection of TypeScript type challenges with online judge
github.com/type-challenges
discuss
2 years ago
max-m
1 point
40.
Show HN: A self hosted online judge for meetups and workshops, written in Go
github.com/MohamedBassem
discuss
9 years ago
mohamedbassem
1 point
41.
Show HN: Minimal, self-hosted exercise tracker
github.com/bmtwl
39 comments
a year ago
DrPhish
127 points
42.
Show HN: Terminal-Bench-RL: Training long-horizon terminal agents with RL
github.com/Danau5tin
12 comments
a year ago
Danau5tin
125 points
43.
Launch HN: Confident AI (YC W25) – Open-source evaluation framework for LLM apps
27 comments
a year ago
jeffreyip
117 points
44.
Show HN: SirixDB – Bitemporal binary JSON database system and event store
github.com/sirixdb
16 comments
3 years ago
lichtenberger
109 points
45.
Launch HN: Traceloop (YC W23) – Detecting LLM Hallucinations with OpenTelemetry
72 comments
2 years ago
GalKlm
101 points
46.
Show HN: Index – New Open Source browser agent
github.com/lmnr-ai
45 comments
a year ago
skull8888888
98 points
47.
Show HN: RULER – Easily apply RL to any agent
openpipe.ai
11 comments
a year ago
kcorbitt
81 points
48.
Show HN: Torrix, self hosted, LLM Observability (no Postgres, no Redis)
github.com/torrix-ai
4 comments
a month ago
AdarshRao23
74 points
49.
Show HN: OCR Benchmark Focusing on Automation
nanonets.com
21 comments
a year ago
prats226
58 points
50.
Show HN: TensorZero – open-source data and learning flywheel for LLMs
github.com/tensorzero
2 comments
2 years ago
GabrielBianconi
49 points
51.
Show HN: Helicone (YC W23) – OSS LLM Observability and Development Platform
github.com/Helicone
7 comments
a year ago
justintorre75
29 points
52.
Show HN: Create LLM graders and run evals in JavaScript with one file
github.com/bolt-foundry
2 comments
a year ago
randall
28 points
53.
Show HN: OSS sustain guard – Sustainability signals for OSS dependencies
onukura.github.io
6 comments
6 months ago
onukura
21 points
54.
Show HN: Anytype – a local and collaborative database with API and MCP server
zhanna.any.org
discuss
a year ago
sharipova
20 points
55.
Show HN: I built an open-source AI data layer that connects any LLM to any data
github.com/bagofwords1
3 comments
9 months ago
y14
18 points
56.
Show HN: TinyFish Web Agent (82% on hard tasks vs. Operator's 43%)
tinyfish.ai
12 comments
4 months ago
gargi_tinyfish
17 points
57.
Show HN: Meta-agent: self-improving agent harnesses from live traces
github.com/canvas-org
discuss
3 months ago
essamsleiman
14 points
58.
Show HN: Ebiose – A Darwin‑Style Playground for Self‑Evolving AI Agents
github.com/ebiose-ai
3 comments
a year ago
vincent-ebiose
12 points
59.
Show HN: OpenTiger – Autonomous dev orchestration that never stops
github.com/Andyyyy64
2 comments
4 months ago
andyyyy64
11 points
60.
Show HN: Kiln – AI Boilerplate with Evals, Fine-Tuning, Synthetic Data, and Git
github.com/Kiln-AI
1 comment
a year ago
scosman
10 points