HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
211.
A Case for Safe Eval
github.com/robert-j-webb
42 comments
8 years ago
_pabj
58 points
212.
TensorFlow Model Analysis – A library for evaluating TensorFlow models
github.com/tensorflow
12 comments
8 years ago
wjarek
58 points
213.
Show HN: A MCP server to evaluate Python code in WASM VM using RustPython
github.com/tuananh
13 comments
a year ago
tuananh
41 points
214.
Show HN: Tonic Validate Metrics – an open-source RAG evaluation metrics package
github.com/TonicAI
17 comments
3 years ago
Ephil012
40 points
215.
Generic engine to evaluate logical circuits on homomorphic encryption
github.com/virtualsecureplatform
3 comments
5 years ago
EvgeniyZh
38 points
216.
Stop Evaluating LLMs on Vibes
github.com/truera
7 comments
3 years ago
shayaks
35 points
217.
Show HN: Create LLM graders and run evals in JavaScript with one file
github.com/bolt-foundry
2 comments
a year ago
randall
28 points
218.
Show HN: SumEval – Multi-language evaluation framework for text summarization
github.com/chakki-works
3 comments
9 years ago
icoxfog417
25 points
219.
λ-calculus evaluator
zaach.github.com
5 comments
16 years ago
alrex021
24 points
220.
Evaluate Scheme in Ruby's virtual machine
gist.github.com
2 comments
14 years ago
tenderlove
24 points
221.
Numexpr: Fast numerical array expression evaluator for Python, NumPy, Pandas
github.com/pydata
4 comments
a month ago
tosh
23 points
222.
Show HN: Phoenix OSS – Applying LLM Spans, Traces, and Evals for AI Insights
github.com/Arize-ai
3 comments
3 years ago
jlopes2
23 points
223.
Show HN: I implemented evals metrics for LLMs that runs locally on your machine
github.com/confident-ai
3 comments
3 years ago
3d27
22 points
224.
Utility to estimate tasks using PERT (Program evaluation and review technique)
github.com/arzzen
1 comment
10 years ago
arzzen
22 points
225.
Thorn in a HaizeStack test for evaluating long-context adversarial robustness
github.com/haizelabs
11 comments
2 years ago
leonardtang
19 points
226.
Math.mk - GNUmake eval gone wild
github.com/adam-f
4 comments
14 years ago
adam_freidin
19 points
227.
Show HN: DeepEval – Evaluation and Unit Testing for LLMs
github.com/confident-ai
8 comments
3 years ago
jacky2wong
18 points
228.
Python Search – eval(raw_input())
github.com
19 comments
12 years ago
Nurdok
17 points
229.
Show HN: Ragas – Open-source library for evals and testing RAG systems
github.com/explodinggradients
9 comments
2 years ago
shahules
15 points
230.
Show HN: An Empirical Evaluation of Linear Probing Algorithms
github.com/senderista
1 comment
7 years ago
senderista
14 points
231.
Show HN: Promptloop – create, run, and improve prompt evals from the terminal
github.com/Bella3202019
3 comments
24 days ago
velapod
13 points
232.
Show HN: Evaluate LLM-based RAG Applications with automated test set generation
github.com/Giskard-AI
discuss
2 years ago
RuiLyonesse
13 points
233.
Common Expression Language (CEL); lightweight expression evaluation
github.com/google
5 comments
5 years ago
Wxc2jjJmST9XWWL
12 points
234.
How Erlang evaluates funs (i.e. lambdas)
gist.github.com
3 comments
17 years ago
bascule
12 points
235.
Show HN: UpTrain (YC W23) – open-source tool to evaluate LLM response quality
demo.uptrain.ai
discuss
3 years ago
sourabh03agr
12 points
236.
Show HN: Open-source toolkit for ML model evaluation and active learning
github.com/encord-team
discuss
3 years ago
ulrikhansen54
11 points
237.
Fexl – Highly robust functional evaluation
github.com/chkoreff
3 comments
12 years ago
fexl
10 points
238.
Show HN: Kiln – AI Boilerplate with Evals, Fine-Tuning, Synthetic Data, and Git
github.com/Kiln-AI
1 comment
a year ago
scosman
10 points
239.
Pixar just open sourced their high-performance subdivision evaluator
github.com/PixarAnimationStudios
discuss
14 years ago
ColinWright
10 points
240.
Show HN: C++ Mathematical Expression Parser and Evaluation Benchmark
github.com/ArashPartow
discuss
8 years ago
ArashPartow
10 points