Search: github.com/b1nc | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

961.

Show HN: mlx-chronos - benchmark MLX inference engines on Apple Silicon

github.com/igurss

5 hours ago

2 points

962.

Benchmark unlimited Claude.md files against each other

github.com/emiliolugo

10 hours ago

2 points

963.

Show HN: InferBench – Benchmark local LLM engines with one click

github.com/JoniMartin27

20 days ago

2 points

964.

BrowseComp-Plus: A More Fair and Transparent Benchmark of Deep-Research Agent

github.com/texttron

21 days ago

colonCapitalDee

2 points

965.

Show HN: AgentThreatBench – Benchmark for AI Agent Memory Security

github.com/OWASP

25 days ago

2 points

966.

Prompter – Compare and benchmark Ollama models side-by-side in your terminal

github.com/whonixnetworks

a month ago

2 points

967.

Show HN: 97% on SWE-bench Verified with subscription-token agents

github.com/kimjune01

a month ago

2 points

968.

Show HN: Verdict – model evals on your own data, not someone else's benchmark

github.com/aevyraai

2 months ago

2 points

969.

talkie-coder: From 1930 to SWE-bench

github.com/RicardoDominguez

2 months ago

2 points

970.

Open macro placement benchmark and $20k challenge (HRT-sponsored)

github.com/partcleda

3 months ago

2 points

971.

Show HN: WMB-100K – Open benchmark for AI memory systems at 100K turns

github.com/Irina1920

3 months ago

2 points

972.

Show HN: OpenClaw Arena – Benchmark models on real tasks, rank by perf and cost

3 months ago

2 points

973.

An open source benchmarking framework for IT automation

github.com/itbench-hub

3 months ago

2 points

974.

Mitata: Benchmark tooling that loves you

github.com/evanwashere

3 months ago

2 points

975.

Help me improve this benchmark for vector engines

github.com/M4iKZ

3 months ago

2 points

976.

Some critical issues with the SWE-bench-Pro environments

github.com/SWE-agent

3 months ago

2 points

977.

BetterKV – A multithreaded Rust Redis alternative, 10-30x faster in benchmarks

3 months ago

2 points

978.

Show HN: ModelSweep - Open-Source Benchmarking for Local LLMs

github.com/leonickson1

3 months ago

2 points

979.

FratBench – Social Calibration Benchmark (OAI Scores Dead Last) [pdf]

github.com/richar-wang

3 months ago

2 points

980.

TLAi+ Benchmarks for Evaluating LLMs

github.com/tlaplus

4 months ago

2 points

981.

An Nginx Engineer Took over AI's Benchmark Tool

github.com/hongzhidao

5 months ago

2 points

982.

KiteSQL: Rust-native embedded SQL with TPC-C benchmarks and WASM support

github.com/KipData

5 months ago

2 points

983.

WorkBench-Pro – PC benchmark designed for developer workflows

github.com/johanmcad

5 months ago

2 points

984.

Benchmark Comparison: JSONL vs. TOON output for JSON-render efficiency

github.com/vercel-labs

5 months ago

2 points

985.

Show HN: Rerankers – Models, benchmarks, and papers for RAG

github.com/agentset-ai

5 months ago

2 points

986.

Show HN: sc-membench for modern memory bandwidth and latency benchmarks

github.com/spareCores

5 months ago

2 points

987.

Show HN: Long-horizon LLM coherence benchmark (500 cycles)

5 months ago

2 points

988.

Epiplexity to Beat DeepMind's Alchemy Meta RL Benchmark

github.com/RandMan444

6 months ago

2 points

989.

Show HN: JSONBench, a Benchmark for Data Analytics on JSON

github.com/ClickHouse

6 months ago

2 points

990.

Stop benchmarking LLMs. Make them fight

github.com/AGI-Eval-Official

6 months ago

2 points