HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
451.
Not_notMNIST: Generate your own datasets
1 comment
9 years ago
RafazZ
1 points
452.
Clusterize.js: Tiny vanilla JavaScript plugin to display large data sets easily
github.com/NeXTs
discuss
11 years ago
IvoBorg
1 points
453.
Show HN: Automatic Validation, Correction and Generation of Dataset Metadata
github.com/ahmadassaf
discuss
11 years ago
ahmadassaf
1 points
454.
Small ground truth labeled dataset for Swedish parking signs
github.com/klintan
discuss
12 years ago
klintcho
1 points
455.
Dataset for 22 years of arXiv citation links
github.com/paperscape
discuss
12 years ago
robjk
1 points
456.
Rdfdiff -- Scalable Tool To Detect Changes in Billion Triple Data Sets
github.com/paulhoule
discuss
13 years ago
PaulHoule
1 points
457.
A full-stack Last.fm 1k dataset insights page using Go/ClickHouse/React
github.com/el10savio
discuss
21 days ago
ugabuga
1 points
458.
Show HN: Cohort Visualizer - A handy tool for browsing cohort datasets
bslatkin.github.com
discuss
14 years ago
bslatkin
1 points
459.
Swedish Construction FAQ: 503 bilingual Q&A dataset, CC BY 4.0
github.com/zaragoza-ab
discuss
2 months ago
DecDEPO
1 points
460.
Show HN: Fastdedup – Rust dataset deduplication (2:55 vs. 7:55 688MB vs. 22GB)
wapplewhite4.github.io
discuss
4 months ago
wapplewhite4
1 points
461.
GABRIEL – turn messy qualitative corpora into analysis-ready datasets
github.com/openai
discuss
5 months ago
michaelsbradley
1 points
462.
Show HN: Vietnam Elections (open, source-linked datasets and site)
bamboo-filing-cabinet.github.io
discuss
5 months ago
vietthan
1 points
463.
The Guardian Headline Entailment Training Dataset
github.com/daoudclarke
discuss
14 years ago
daoudc
1 points
464.
Fasttfidf: High-performance TF-IDF vectorization for large-scale text datasets
github.com/purijs
discuss
6 months ago
jspuri
1 points
465.
Show HN: AI tool that walks citation graph and extracts data to create datasets
github.com/eamag
discuss
6 months ago
eamag
1 points
466.
Training YOLO vision models on Kaggle datasets
github.com/mfranzon
discuss
8 months ago
walterbell
1 points
467.
Show HN: Gaggle – A DuckDB extension for working with Kaggle datasets
discuss
8 months ago
habedi0
1 points
468.
Show HN: I built a tool to sort a Northern Lights dataset for a CV model
picsort.coolapso.sh
discuss
8 months ago
coolapso
1 points
469.
Show HN: Django PostgreSQL Anonymizer – prod → safe dev datasets (beta)
github.com/CuriousLearner
discuss
8 months ago
sanyam-khurana
1 points
470.
A toolkit for improving the quality of your LeRobot datasets
github.com/RoboticsData
discuss
8 months ago
machinelearning
1 points
471.
A new RAG algorithm to self-heal damaged datasets and query them on a graph
github.com/iblameandrew
discuss
9 months ago
scraper02
1 points
472.
Show HN: Tensorpack, a CLI tool for semantic discovery across datasets
discuss
9 months ago
AyodeleFikayomi
1 points
473.
Procedural Reasoning Datasets
github.com/open-thought
discuss
a year ago
t55
1 points
474.
Reasoning Gym – Procedural RL reasoning datasets
github.com/open-thought
discuss
a year ago
t55
1 points
475.
Mochi Programming Language v0.6.0 – LINQ syntax for querying datasets
github.com/mochilang
discuss
a year ago
scapbi
1 points
476.
Reasoning Gym: Procedural Dataset Generation for Reinforcement Learning
github.com/open-thought
discuss
a year ago
starzmustdie
1 points
477.
Datasets Are All You Need (LLM Learns to Prompt from Data)
github.com/intellectronica
discuss
a year ago
intellectronica
1 points
478.
A multi-view video behavior monitoring dataset of wild mammals in the Swiss Alps
github.com/eceo-epfl
discuss
a year ago
moatmoat
1 points
479.
RKaggle: Bring Kaggle Datasets Straight into the R console
github.com/benyamindsmith
discuss
a year ago
SuperMint
1 points
480.
Logic R1: Reproduce DeepSeek R1 Zero on 2K Logic Puzzle Dataset
github.com/Unakar
discuss
a year ago
limoce
1 points