Search: github.com/bendc | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

961.

TLAi+ Benchmarks for Evaluating LLMs

github.com/tlaplus

4 months ago

2 points

962.

An Nginx Engineer Took over AI's Benchmark Tool

github.com/hongzhidao

5 months ago

2 points

963.

KiteSQL: Rust-native embedded SQL with TPC-C benchmarks and WASM support

github.com/KipData

5 months ago

2 points

964.

WorkBench-Pro – PC benchmark designed for developer workflows

github.com/johanmcad

5 months ago

2 points

965.

Benchmark Comparison: JSONL vs. TOON output for JSON-render efficiency

github.com/vercel-labs

5 months ago

2 points

966.

Show HN: Rerankers – Models, benchmarks, and papers for RAG

github.com/agentset-ai

5 months ago

2 points

967.

Show HN: sc-membench for modern memory bandwidth and latency benchmarks

github.com/spareCores

5 months ago

2 points

968.

Show HN: Long-horizon LLM coherence benchmark (500 cycles)

5 months ago

2 points

969.

Epiplexity to Beat DeepMind's Alchemy Meta RL Benchmark

github.com/RandMan444

6 months ago

2 points

970.

Show HN: JSONBench, a Benchmark for Data Analytics on JSON

github.com/ClickHouse

6 months ago

2 points

971.

Stop benchmarking LLMs. Make them fight

github.com/AGI-Eval-Official

6 months ago

2 points

972.

Show HN: Sigma Runtime – 550-cycle identity stability benchmark on GPT-5.2

github.com/sigmastratum

6 months ago

2 points

973.

Benchmarking LLMs on whether they can play FizzBuzz

github.com/venkatasg

6 months ago

2 points

974.

Running a 270M LLM on Android (architecture and benchmarks)

7 months ago

2 points

975.

TypeNet Benchmark for development of authentication keystroke technologies

github.com/BiDAlab

9 months ago

2 points

976.

AutoCodeBench: Large Language Models Are Automatic Code Benchmark Generators

github.com/Tencent-Hunyuan

9 months ago

2 points

977.

Behavior: Robot manipulation benchmark based on 1000 household tasks

github.com/StanfordVL

9 months ago

2 points

978.

Show HN: LLM-Simple-Eval – Easily Benchmark LLMs for Your Use Case

github.com/grigio

10 months ago

2 points

979.

PostgreSQL vs. ClickHouse: Learnings from building my first database benchmark

github.com/514-labs

a year ago

2 points

980.

Show HN: New SWE-bench leaderboard compares LMs without fancy agent scaffolds

a year ago

2 points

981.

Show HN: VDBbench 1.0: open-source benchmarking for VectorDBs

github.com/zilliztech

a year ago

2 points

982.

MAIR: A Benchmark for Evaluating Instructed Retrieval

github.com/sunnweiwei

a year ago

2 points

983.

Show HN: Comprehensive Benchmark Suite for Story Visualization

github.com/ViStoryBench

a year ago

2 points

984.

Show HN: Benchmarks agree with the complexity analysis of the TopoSort algorithm

github.com/williamw520

a year ago

2 points

985.

Show HN: I built an open-source benchmark that evaluates LLMs through gameplay

a year ago

2 points

986.

QuickBench: A Zero-Dependency Linux Benchmark for CPU, Memory, and Storage

github.com/bearstech

a year ago

2 points

987.

Elimination Game Benchmark: Social Reasoning, Strategy, and Deception in LLMs

github.com/lechmazur

a year ago

2 points

988.

Latest Benchmarks Show 10x Faster Prefix Queries vs. Etcd

2 years ago

2 points

989.

C++ Showing std::swap faster than XOR trick to swap numbers via naive benchmark

github.com/vladov3000

2 years ago

2 points

990.

Benchmarks Comparing PyTorch and MLX on Apple Silicon GPUs

github.com/LucasSte

2 years ago

2 points