Search: github.com/b1nc | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

991.

Show HN: Sigma Runtime – 550-cycle identity stability benchmark on GPT-5.2

github.com/sigmastratum

6 months ago

2 points

992.

Benchmarking LLMs on whether they can play FizzBuzz

github.com/venkatasg

6 months ago

2 points

993.

Running a 270M LLM on Android (architecture and benchmarks)

7 months ago

2 points

994.

TypeNet Benchmark for development of authentication keystroke technologies

github.com/BiDAlab

9 months ago

2 points

995.

AutoCodeBench: Large Language Models Are Automatic Code Benchmark Generators

github.com/Tencent-Hunyuan

9 months ago

2 points

996.

Show HN: Little Fluffy Clouds: Combine a bunch of small adjacent networks

github.com/kstrauser

9 months ago

2 points

997.

Behavior: Robot manipulation benchmark based on 1000 household tasks

github.com/StanfordVL

9 months ago

2 points

998.

Show HN: LLM‑Simple‑Eval – Easily Benchmark LLMs for Your Use Case

github.com/grigio

10 months ago

2 points

999.

PostgreSQL vs. ClickHouse: Learnings from building my first database benchmark

github.com/514-labs

a year ago

2 points

1000.

Show HN: New SWE-bench leaderboard compares LMs without fancy agent scaffolds

a year ago

2 points