HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
1.
Evaluatly is now open source and free
github.com/evaluatly
1 comment
6 years ago
gpnt
1 point
2.
Evals in 2025: going beyond simple benchmarks to build models people can use
github.com/huggingface
8 comments
9 months ago
jxmorris12
80 points
3.
LLM Evaluation Guidebook
github.com/huggingface
discuss
2 years ago
erinys
2 points
4.
HuggingFace/evaluate: A library for easily evaluating ML models and datasets
github.com/huggingface
discuss
4 years ago
occamschainsaw
2 points
5.
Why Neutralinojs Is Better? Comparing with Electron and Node Webkit
github.com/neutralinojs
discuss
8 years ago
delvincasper
2 points
6.
Triilman25/evaluation-machine-for-classification-models
github.com/triilman25
discuss
a year ago
triilman
1 point
7.
Show HN: Pixeebot – a GitHub App that fixes your Sonar findings (Java/Python)
pixee.ai
discuss
2 years ago
nahsra
10 points
8.
Show HN: Neuron – Cognitive Multi-Agent Architecture for Reasoning
discuss
10 months ago
machinemusic
8 points
9.
Show HN: Auto LLM Ranker – Describe a task in English and get ranked models
github.com/gauravvij
discuss
3 months ago
gauravvij137
3 points
10.
Q Evaluation Harness: open-source evals for LLMs on q/kdb+
github.com/KxSystems
discuss
10 months ago
erfan_mhi
2 points
11.
Evaluate Selections in Sublime Text
github.com/jbrooksuk
2 comments
13 years ago
jbrooksuk
2 points
12.
Evaluating Large Language Models Using LLM-as-a-Judge
github.com/aws-samples
discuss
2 years ago
mooreds
2 points
13.
OpenFF – Automated estimation of physical properties
github.com/openforcefield
discuss
5 years ago
alex_hirner
2 points
14.
Show HN: IR_evaluation – Information retrieval evaluation metrics in pure Python
github.com/plurch
2 comments
a year ago
plurch
1 point
15.
Show HN: EleutherAI / Lm-Evaluation-Harness
github.com/EleutherAI
discuss
a month ago
marvinified
1 point
16.
Language Model Evaluation Harness
github.com/EleutherAI
discuss
3 years ago
tosh
1 point
17.
Nextdoor's Cloud Security Posture Management (CSPM) Evaluation Matrix
github.com/Nextdoor
discuss
3 years ago
scapecast
1 point
18.
Show HN: Little tool to evaluate your cryptocurrency trades on Poloniex
github.com/enricobacis
discuss
9 years ago
enricobacis
1 point
19.
Show HN: Freeact – A Lightweight Library for Code-Action Based Agents
github.com/gradion-ai
5 comments
a year ago
cstub
122 points
20.
Deprecating A/B tests with offline policy evaluation
discuss
5 years ago
econti
1 point
21.
Show HN: I designed a ChatGPT prompt evaluator to ruin your fun;)
github.com/alignedai
1 comment
4 years ago
buildaligned
8 points
22.
Show HN: TypeScript type-level math expression parser and evaluator
github.com/dqbd
discuss
3 years ago
dqbd
3 points
23.
Show HN: CLI tool to analyze your Vector Embeddings!
github.com/dakshjain-1616
1 comment
4 months ago
gauravvij137
2 points
24.
Keyboard Layout Evaluation
github.com/bclnr
1 comment
4 years ago
Egoist
2 points
25.
Evaluation Code – GPT-5 on Multimodal Medical Reasoning
github.com/wangshansong1
discuss
10 months ago
Topfi
2 points
26.
Show HN: Filtering "Who's Hiring" with LLMs – native desktop app in Rust/egui
github.com/exlee
discuss
3 months ago
xlii
1 point
27.
Show HN: LLM Evaluator for "Who is hiring" threads
github.com/exlee
discuss
4 months ago
xlii
1 point
28.
Job postings evaluator against your resume (Chrome extension)
github.com/alikh31
discuss
5 months ago
alikhoramshahi
1 point
29.
Policy Evaluation in Grid World
github.com/elliotvilhelm
discuss
2 years ago
monadicmonad
1 point
30.
Tracking an LLM Evaluator Using Comet
github.com/dair-ai
discuss
3 years ago
omarsar
1 point