HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
Request
301.
eval("quire".replace(/^/,"re"))(moduleName)
github.com/protobufjs
2 comments
2 years ago
yonom
3 points
302.
Homomorphic Evaluation of the AES Circuit
github.com/shaih
1 comment
11 years ago
johlo
3 points
303.
Show HN: Atlas – local-first memory that re-evaluates beliefs when facts change
github.com/RichSchefren
1 comment
11 days ago
RichSchefren
3 points
304.
RAG Eval Comparing Vertex/Bedrock/Azure/OpenAI
github.com/colon-md
1 comment
a month ago
colon-md
3 points
305.
Mcpbr: Stop guessing and evaluate your MCP server against standard benchmarks
github.com/greynewell
1 comment
5 months ago
captradeoff
3 points
306.
Rogue: Open-source AI agent evaluation framework
github.com/qualifire-dev
1 comment
8 months ago
drorivryQF
3 points
307.
AWorld: Build, evaluate and train General Multi-Agent Assistance with ease
github.com/inclusionAI
1 comment
10 months ago
gfortaine
3 points
308.
15 AI Coding Agents evaluated with the same prompt
github.com/The-Focus-AI
1 comment
a year ago
combray
3 points
309.
NoLiMa: Long-Context Evaluation Beyond Literal Matching
github.com/adobe-research
1 comment
a year ago
consumer451
3 points
310.
I built an ethical evaluation engine for scoring sys. alignment, not efficiency
github.com/luminaAnonima
1 comment
a year ago
luminaAnonima
3 points
311.
A novel open-source framework for evaluating conversational agents
github.com/plurai-ai
1 comment
a year ago
nirdiamant
3 points
312.
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale
github.com/microsoft
1 comment
2 years ago
francedot
3 points
313.
Cedar – open-source policy language and evaluation engine
github.com/cedar-policy
1 comment
3 years ago
max2
3 points
314.
Show HN: Evaluate Deep Learning models directly in a database with PyNeuraLogic
github.com/LukasZahradnik
1 comment
4 years ago
LukasZahradnik
3 points
315.
Show HN: Wielder – Write and evaluate Clojure code in your Obsidian documents
github.com/victorb
1 comment
4 years ago
diggan
3 points
316.
Show HN: Oyster, an interactive Perl eval server
github.com/gatlin
1 comment
15 years ago
gatlin
3 points
317.
Koila: Prevent PyTorch's out of memory error with lazy evaluation
github.com/rentruewang
1 comment
5 years ago
b06901038
3 points
318.
Simple Safe Sandboxed Extensible Expression Evaluator for Python
github.com/danthedeckie
1 comment
8 years ago
wilsonfiifi
3 points
319.
Show HN: ClojureCalc, a libreoffice Calc Add-In to evaluate clojure expressions
github.com/beothorn
discuss
11 years ago
beothorn
3 points
320.
Rouge.js: Recall-Oriented Understudy for Gisting Evaluation Metric
github.com/kenlimmj
discuss
11 years ago
kenlimmj
3 points
321.
Show HN: Synthetic corporate dataset generator for AI agent evaluation
github.com/aeriesec
discuss
12 days ago
jflynt76
3 points
322.
Cisco Foundry Security Spec: Open specification for agentic security evaluation
github.com/CiscoDevNet
discuss
a month ago
cpard
3 points
323.
Show HN: Nexa-gauge – Cache/cost-aware graph-based eval for LLM and RAG
github.com/harnexa
discuss
a month ago
Sardhendu
3 points
324.
Show HN: FC-Eval – CLI to Benchmark Local or Cloud LLMs on Function Calling
github.com/gauravvij
discuss
3 months ago
gauravvij137
3 points
325.
Show HN: Rhesis AI - Multimodal test cases for agentic evals
discuss
3 months ago
nicolaib
3 points
326.
Show HN: Auditi – open-source LLM tracing and evaluation platform
github.com/deduu
discuss
4 months ago
ariansyah
3 points
327.
Harbor – a framework for evaluating and optimizing agents and language models
github.com/laude-institute
discuss
7 months ago
piebro
3 points
328.
OpenBench: Provider-agnostic, open-source evaluation infrastructure for LLMs
github.com/groq
discuss
8 months ago
ofou
3 points
329.
Show HN: Evaluate your website usability in seconds
desplega.ai
discuss
9 months ago
tarasyarema
3 points
330.
LLM Evaluation via Rap Battles
github.com/vadim0x60
discuss
10 months ago
vadimdotme
3 points