HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
241.
Show HN: DataBrewer – A CLI tool to search and discover datasets
github.com/rolando
discuss
9 years ago
darkrho
4 points
242.
Udacity adds 183 GB of data to its driving dataset
github.com/udacity
discuss
10 years ago
EvgeniyZh
4 points
243.
Show HN: Create simulated datasets in Python with Simulacrum
github.com/jbrambleDC
discuss
10 years ago
jbrambleDC
4 points
244.
A Python tool that automatically cleans data sets and readies them for analysis
github.com/rhiever
discuss
10 years ago
felix_thursday
4 points
245.
Show HN: Kiln - Interactive LLM fine-tuning, dataset collab & synthetic data gen
github.com/Kiln-AI
2 comments
2 years ago
scosman
3 points
246.
Large New Dataset 220k AI Art Text to Image Prompts
github.com/lee101
2 comments
3 years ago
wrdsmsh321
3 points
247.
hfsearch: a fast CLI tool to discover models and datasets on HuggingFace
github.com/HenokB
1 comment
7 months ago
henok_ademtew
3 points
248.
Show HN: Torque – A declarative, typesafe DSL for LLM training datasets (MIT)
github.com/qforge-dev
1 comment
8 months ago
michalwarda
3 points
249.
Hugging Face AI Sheets, open-source tool to vibe test models on your datasets
github.com/huggingface
1 comment
10 months ago
dvilasuero
3 points
250.
Promptwright: Generate large synthetic datasets using a local LLM
github.com/StacklokLabs
1 comment
2 years ago
trickleup
3 points
251.
Easily convert YouTube, Torrent and Enterprise videos into LLM datasets
github.com/qet-lab
1 comment
2 years ago
m_2018
3 points
252.
CodeCapybara: Code Writing LLaMA Finetuned on DeepMind Dataset
github.com/AI4Code-Research
1 comment
3 years ago
brucethemoose2
3 points
253.
UpliftML: An uplift modeling library that handles web scale datasets
github.com/bookingcom
1 comment
5 years ago
TaXxEr
3 points
254.
A tool for creating deep learning datasets
github.com/dicroce
1 comment
5 years ago
dicroce
3 points
255.
Show HN: A dataset of 40k professionally-written summaries of news articles
github.com/curationcorp
1 comment
6 years ago
CurationCorp
3 points
256.
Crossfader: Autoencoders to find structure in arbitrary datasets
github.com/bettermg
discuss
11 years ago
vierja
3 points
257.
ExCon is an R/JavaScript tool for exploring topographic-like data sets
github.com/bryanhanson
discuss
12 years ago
sebg
3 points
258.
Machine Learning: Access Tiny Images Dataset with Python
github.com/cioc
discuss
13 years ago
cioc
3 points
259.
Show HN: Synthetic corporate dataset generator for AI agent evaluation
github.com/aeriesec
discuss
12 days ago
jflynt76
3 points
260.
Open Data Hub Data Browser – Explore and Query Open Datasets
github.com/noi-techpark
discuss
4 months ago
KadambariSuresh
3 points
261.
jQuery dataset() Plugin
github.com/realchaseadams
discuss
14 years ago
nwienert
3 points
262.
WebZFS Modern Web Management for ZFS Pools/Datasets/Snapshots/Smart Monitoring
github.com/webzfs
discuss
6 months ago
vermaden
3 points
263.
Data-morph: Morph a dataset into select shapes, while preserving the statistics
github.com/stefmolin
discuss
9 months ago
ZeljkoS
3 points
264.
Show HN: Synthetic dataset generator for NLP and tabular data
github.com/VoxDroid
discuss
a year ago
voxdroid
3 points
265.
DataChain: Prepare and curate datasets for AI/ML
github.com/iterative
discuss
2 years ago
shcheklein
3 points
266.
Reladiff: High-performance diffing of large datasets across databases
github.com/erezsh
discuss
2 years ago
PaulHoule
3 points
267.
RNNoise 0.2 – now trained using only publicly available CC-licensed datasets
github.com/xiph
discuss
2 years ago
pabs3
3 points
268.
ClickHouse-Obfuscator – a tool for dataset anonymization
github.com/ClickHouse
discuss
3 years ago
aeontech
3 points
269.
CommaVQ: Dataset of 100k Driving Videos
github.com/commaai
discuss
3 years ago
kklisura
3 points
270.
Img2dataset: Turns large sets of image URLs to an image dataset
github.com/rom1504
discuss
3 years ago
wildpeaks
3 points