HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
241.
Show HN: Texthero, a Pandas-like API to work with text-dataset only
github.com/jbesomi
discuss
6 years ago
jonathanbesomi
4 points
242.
Russian Open Speech to Text (STT/ASR) Dataset
github.com/snakers4
discuss
7 years ago
isqad
4 points
243.
Awesome-Twitter-data: A list of Twitter datasets and related resources
github.com/shaypal5
discuss
8 years ago
shaypalachy
4 points
244.
PeerRead: A Dataset of Scientific Peer Reviews
github.com/allenai
discuss
8 years ago
indescions_2018
4 points
245.
Pypixgrid: generate vector tiles for the exploration of spatio-temporal datasets
translate.googleusercontent.com
discuss
9 years ago
based2
4 points
246.
Dat – Distributed Dataset Synchronization and Versioning [pdf]
github.com/datproject
discuss
9 years ago
potomak
4 points
247.
Show HN: DataBrewer – A CLI-tool to search and discover datasets
github.com/rolando
discuss
9 years ago
darkrho
4 points
248.
Udacity adds 183gb of data to its driving dataset
github.com/udacity
discuss
10 years ago
EvgeniyZh
4 points
249.
Show HN: Create simulated datasets in Python with Simulacrum
github.com/jbrambleDC
discuss
10 years ago
jbrambleDC
4 points
250.
Show HN: Kiln - Interactive LLM fine-tuning, dataset collab & synthetic data gen
github.com/Kiln-AI
2 comments
2 years ago
scosman
3 points
251.
Large New Dataset 220k AI Art Text to Image Prompts
github.com/lee101
2 comments
3 years ago
wrdsmsh321
3 points
252.
hfsearch: a fast cli tool to discover models and datasets on HuggingFace
github.com/HenokB
1 comment
7 months ago
henok_ademtew
3 points
253.
Show HN: Torque – A declarative, typesafe DSL for LLM training datasets (MIT)
github.com/qforge-dev
1 comment
8 months ago
michalwarda
3 points
254.
Hugging Face AI Sheets, open-source tool to vibe test models on your datasets
github.com/huggingface
1 comment
10 months ago
dvilasuero
3 points
255.
Promptwright: Generate large synthetic datasets using a local LLM
github.com/StacklokLabs
1 comment
2 years ago
trickleup
3 points
256.
Easily convert YouTube, Torrent and Enterprise videos into LLM datasets
github.com/qet-lab
1 comment
2 years ago
m_2018
3 points
257.
CodeCapybara: Code Writing LLaMa Finetuned on Deepmind Dataset
github.com/AI4Code-Research
1 comment
3 years ago
brucethemoose2
3 points
258.
UpliftML: An uplift modeling library that handles web scale datasets
github.com/bookingcom
1 comment
5 years ago
TaXxEr
3 points
259.
A tool for creating deep learning datasets
github.com/dicroce
1 comment
5 years ago
dicroce
3 points
260.
Show HN: A dataset of 40k professionally-written summaries of news articles
github.com/curationcorp
1 comment
6 years ago
CurationCorp
3 points
261.
Crossfader: Autoencoders to find structure in arbitrary datasets
github.com/bettermg
discuss
11 years ago
vierja
3 points
262.
Machine Learning: Access Tiny Images Dataset with Python
github.com/cioc
discuss
13 years ago
cioc
3 points
263.
Show HN: Synthetic corporate dataset generator for AI agent evaluation
github.com/aeriesec
discuss
12 days ago
jflynt76
3 points
264.
Open Data Hub Data Browser – Explore and Query Open Datasets
github.com/noi-techpark
discuss
4 months ago
KadambariSuresh
3 points
265.
JQuery dataset() Plugin
github.com/realchaseadams
discuss
14 years ago
nwienert
3 points
266.
WebZFS Modern Web Management for ZFS Pools/Datasets/Snapshots/Smart Monitoring
github.com/webzfs
discuss
6 months ago
vermaden
3 points
267.
Data-morph: Morph a dataset into select shapes, while preserving the statistics
github.com/stefmolin
discuss
9 months ago
ZeljkoS
3 points
268.
Show HN: Synthetic dataset generator for NLP and tabular data
github.com/VoxDroid
discuss
a year ago
voxdroid
3 points
269.
DataChain: Prepare and curate datasets for AI/ML
github.com/iterative
discuss
2 years ago
shcheklein
3 points
270.
Reladiff: High-performance diffing of large datasets across databases
github.com/erezsh
discuss
2 years ago
PaulHoule
3 points