HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
931.
Show HN: Benchmarking language models by playing text adventures
github.com/s-macke
1 comment
3 years ago
s-macke
2 points
932.
Web Content Compression Benchmark: zlib brotli zstd zlib_ng libdeflate igzip,...
github.com/powturbo
1 comment
3 years ago
powturbo
2 points
933.
Show HN: TurboBench: Dynamic/Static web content compression benchmark
github.com/powturbo
1 comment
3 years ago
powturbo
2 points
934.
Wa-SQLite (WASM SQLite) benchmark discussion
github.com/rhashimoto
1 comment
3 years ago
yonz
2 points
935.
Is linear regression better than prophet? Zillow benchmark
github.com/Nixtla
1 comment
4 years ago
fedegr
2 points
936.
[benchmarks] MongoDB kicks MySQL's ass no matter the circumstances
1 comment
15 years ago
zeeone
2 points
937.
Server Benchmarks For: Elixir Ruby Nim Node Clojure Java Rust Python Go Crystal
github.com/costajob
1 comment
10 years ago
asp2insp
2 points
938.
Gobenchdb: store go test bench data in a database
github.com/yhat
discuss
11 years ago
emcox
2 points
939.
Show HN: Proof of concept for using HTTP headers to benchmark latency
github.com/montanaflynn
discuss
12 years ago
anonfunction
2 points
940.
A benchmarking suite for PHP implementations running real-world apps
github.com/hhvm
discuss
12 years ago
jamesgpearce
2 points
941.
Readygo, a Ruby benchmarking tool by Gary Bernhardt
github.com/garybernhardt
discuss
12 years ago
isbadawi
2 points
942.
Scala Web Frameworks Benchmark
github.com/Versal
discuss
13 years ago
jondot
2 points
943.
Show HN: InferBench – Benchmark local LLM engines with one click
github.com/JoniMartin27
discuss
20 days ago
JoniMartin
2 points
944.
BrowseComp-Plus: A More Fair and Transparent Benchmark of Deep-Research Agent
github.com/texttron
discuss
20 days ago
colonCapitalDee
2 points
945.
Show HN: AgentThreatBench – Benchmark for AI Agent Memory Security
github.com/OWASP
discuss
24 days ago
vgudur297
2 points
946.
Prompter – Compare and benchmark Ollama models side-by-side in your terminal
github.com/whonixnetworks
discuss
a month ago
whonixnetworks
2 points
947.
Show HN: 97% on SWE-bench Verified with subscription-token agents
github.com/kimjune01
discuss
a month ago
kimjune01
2 points
948.
Show HN: Verdict – model evals on your own data, not someone else's benchmark
github.com/aevyraai
discuss
2 months ago
agunapal
2 points
949.
talkie-coder: From 1930 to SWE-bench
github.com/RicardoDominguez
discuss
2 months ago
Philpax
2 points
950.
Open macro placement benchmark and $20k challenge (HRT-sponsored)
github.com/partcleda
discuss
3 months ago
anonymousmoos
2 points
951.
Show HN: WMB-100K – Open benchmark for AI memory systems at 100K turns
github.com/Irina1920
discuss
3 months ago
wontopos
2 points
952.
Show HN: OpenClaw Arena – Benchmark models on real tasks, rank by perf and cost
app.uniclaw.ai
discuss
3 months ago
skysniper
2 points
953.
An open source benchmarking framework for IT automation
github.com/itbench-hub
discuss
3 months ago
pranay01
2 points
954.
Mitata: Benchmark tooling that loves you
github.com/evanwashere
discuss
3 months ago
jcbhmr
2 points
955.
Help me improve this benchmark for vector engines
github.com/M4iKZ
discuss
3 months ago
M4iKZ
2 points
956.
Some critical issues with the SWE-bench-Pro environments
github.com/SWE-agent
discuss
3 months ago
snoopyswe
2 points
957.
BetterKV – A multithreaded Rust Redis alternative, 10-30x faster in benchmarks
discuss
3 months ago
1jmdev
2 points
958.
Show HN: ModelSweep - Open-Source Benchmarking for Local LLMs
github.com/leonickson1
discuss
3 months ago
leonickson
2 points
959.
FratBench – Social Calibration Benchmark (OAI Scores Dead Last) [pdf]
github.com/richar-wang
discuss
3 months ago
richardwang5
2 points
960.
TLAi+ Benchmarks for Evaluating LLMs
github.com/tlaplus
discuss
4 months ago
alhazrod
2 points