Search: github.com/judge0 | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

31.

Show HN: GEDD – A Systematic Evidence Driven LLM as a Judge Framework

github.com/aws-samples

9 days ago

2 points

32.

Show HN: CoJudge – open-source, offline judge for studying LC-style problems

github.com/cojudge

8 months ago

2 points

33.

Evaluating Large Language Models Using LLM-as-a-Judge

github.com/aws-samples

2 years ago

2 points

34.

Coderunner – A judge for your programs: run and test your programs through Python

github.com/codeclassroom

7 years ago

2 points

35.

Show HN: A command line interface to UVA online judge (competitive programming)

github.com/scvalencia

10 years ago

2 points

36.

Show HN: Claude-relais – A plan/build/judge loop mixing Claude with Cursor

github.com/clementrog

4 months ago

1 point

37.

Precision-Based Sampling of LLM Judges

a year ago

1 point

38.

Show HN: Lone Arena – Self-hosted LLM human evaluation, you be the judge

github.com/Contextualist

2 years ago

1 point

39.

Collection of TypeScript type challenges with online judge

github.com/type-challenges

2 years ago

1 point

40.

Show HN: A self hosted online judge for meetups and workshops, written in Go

github.com/MohamedBassem

9 years ago

1 point

41.

Show HN: Minimal, self-hosted exercise tracker

github.com/bmtwl

a year ago

127 points

42.

Show HN: Terminal-Bench-RL: Training long-horizon terminal agents with RL

github.com/Danau5tin

a year ago

125 points

43.

Launch HN: Confident AI (YC W25) – Open-source evaluation framework for LLM apps

a year ago

117 points

44.

Show HN: SirixDB – Bitemporal binary JSON database system and event store

github.com/sirixdb

3 years ago

109 points

45.

Launch HN: Traceloop (YC W23) – Detecting LLM Hallucinations with OpenTelemetry

2 years ago

101 points

46.

Show HN: Index – New Open Source browser agent

github.com/lmnr-ai

a year ago

98 points

47.

Show HN: RULER – Easily apply RL to any agent

a year ago

81 points

48.

Show HN: Torrix, self-hosted LLM observability (no Postgres, no Redis)

github.com/torrix-ai

a month ago

74 points

49.

Show HN: OCR Benchmark Focusing on Automation

a year ago

58 points

50.

Show HN: TensorZero – open-source data and learning flywheel for LLMs

github.com/tensorzero

2 years ago

GabrielBianconi

49 points

51.

Show HN: Helicone (YC W23) – OSS LLM Observability and Development Platform

github.com/Helicone

a year ago

29 points

52.

Show HN: Create LLM graders and run evals in JavaScript with one file

github.com/bolt-foundry

a year ago

28 points

53.

Show HN: OSS sustain guard – Sustainability signals for OSS dependencies

onukura.github.io

6 months ago

21 points

54.

Show HN: Anytype – a local and collaborative database with API and MCP server

a year ago

20 points

55.

Show HN: I built an open-source AI data layer that connects any LLM to any data

github.com/bagofwords1

9 months ago

18 points

56.

Show HN: TinyFish Web Agent (82% on hard tasks vs. Operator's 43%)

4 months ago

17 points

57.

Show HN: Meta-agent: self-improving agent harnesses from live traces

github.com/canvas-org

3 months ago

14 points

58.

Show HN: Ebiose – A Darwin‑Style Playground for Self‑Evolving AI Agents

github.com/ebiose-ai

a year ago

12 points

59.

Show HN: OpenTiger – Autonomous dev orchestration that never stops

github.com/Andyyyy64

4 months ago

11 points

60.

Show HN: Kiln – AI Boilerplate with Evals, Fine-Tuning, Synthetic Data, and Git

github.com/Kiln-AI

a year ago

10 points