Search: github.com/eval | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

31.

Show HN: EvalView – pytest-style tests for AI agents (budgets, hallucinations)

github.com/hidai25

6 months ago

1 points

32.

Evaluatly is now open source and free

github.com/evaluatly

6 years ago

1 points

33.

Show HN: EvalLens – Open-source tool to evaluate structured LLM outputs

github.com/simonrendona

3 months ago

1 points

34.

Evalien – Node.js event loop agent harness

github.com/agentbellnorm

4 months ago

1 points

35.

Show HN: Evals skill for agents – no tooling, just Markdown and subagents

github.com/adriancooney

5 months ago

1 points

36.

Evalite: Evaluate your LLM-powered apps with TypeScript

github.com/mattpocock

6 months ago

1 points

37.

Triilman25/evaluation-machine-for-classification-models

github.com/triilman25

a year ago

1 points

38.

Eval Villain update released: find those dangerous JavaScript functions

github.com/swoops

2 years ago

1 points

39.

GitHub Action for Cluster API

github.com/evalsocket

6 years ago

1 points

40.

Eval – a bot that executes arbitrary JavaScript and posts the result on Pleroma

github.com/CosineP

6 years ago

1 points

41.

Evaldb: Use your favorite language as a database

github.com/turbio

7 years ago

1 points

42.

Show HN: Evalfilter: A simple Golang evaluation engine for filtering via scripts

7 years ago

1 points

43.

Evaluate JavaScript code blocks from within markdown

github.com/reggi

11 years ago

1 points

44.

Show HN: Open-source alternative to ChatGPT Agents for browsing

github.com/trymeka

a year ago

104 points

45.

Show HN: ColiVara – State of the Art RAG API with Vision Models

github.com/tjmlabs

2 years ago

10 points

46.

Show HN: Pixeebot – a GitHub App that fixes your Sonar findings (Java/Python)

2 years ago

10 points

47.

Show HN: Neuron – Cognitive Multi-Agent Architecture for Reasoning

10 months ago

8 points

48.

Show HN: Auto LLM Ranker – Describe a task in English and get ranked models

github.com/gauravvij

3 months ago

3 points

49.

Q Evaluation Harness: open-source evals for LLMs on q/kdb+

github.com/KxSystems

10 months ago

2 points

50.

Show HN: FizzBuzz purely in Rust's trait system

github.com/doctorn

6 years ago

120 points

51.

Show HN: Duktape-eval – an eval library built on Duktape and WebAssembly

github.com/maple3142

6 years ago

41 points

52.

Show HN: Pytest-evals – Simple LLM apps evaluation using pytest

github.com/AlmogBaku

a year ago

13 points

53.

Show HN: Agent-evals – Claude skill to build your own evals

github.com/fsilavong

2 months ago

9 points

54.

EvalAI: An open-source alternative to Kaggle

github.com/Cloud-CV

9 years ago

6 points

55.

Estonia's voting system: a Python program on GitHub

github.com/vvk-ehk

10 years ago

5 points

56.

github.com/garrytan

2 months ago

4 points

57.

I tested Haiku vs. Sonnet across 3 agent tasks – the cheap model won every time

github.com/aimvik07

a month ago

3 points

58.

GPT-4o Benchmark Results

github.com/openai

2 years ago

3 points

59.

OpenAI/Simple-Evals

github.com/openai

2 years ago

3 points

60.

Show HN: Retrieval Evaluations Framework

github.com/DeployQL

2 years ago

3 points