HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
31.
Show HN: Transform Unstructured Data into Usable Datasets
github.com/wizenheimer
1 comment
2 years ago
wizenheimer
4 points
32.
Show HN: pqry – A fast, lightweight CLI tool to diagnose Parquet datasets
github.com/symblic
discuss
5 months ago
setzeno
4 points
33.
Show HN: Lance – Open lakehouse format for multimodal AI datasets
github.com/lance-format
discuss
5 months ago
criexe
4 points
34.
A curated list of global electrical grid maps, datasets and resources
github.com/open-energy-transition
discuss
8 months ago
protontypes
4 points
35.
The Well: A 15TB Collection of Physics Simulation Datasets
github.com/PolymathicAI
discuss
9 months ago
Anon84
4 points
36.
Show HN: Mount remote repositories and datasets managed by Git LFS locally
github.com/git-lfs-fuse
discuss
a year ago
rueian
4 points
37.
Pypixgrid: generate vector tiles for the exploration of spatio-temporal datasets
translate.googleusercontent.com
discuss
9 years ago
based2
4 points
38.
Show HN: Create simulated datasets in Python with Simulacrum
github.com/jbrambleDC
discuss
10 years ago
jbrambleDC
4 points
39.
Hugging Face AI Sheets, open-source tool to vibe test models on your datasets
github.com/huggingface
1 comment
10 months ago
dvilasuero
3 points
40.
RNNoise 0.2 – now trained using only publicly available CC-licensed datasets
github.com/xiph
discuss
2 years ago
pabs3
3 points
41.
Trustfall: A new, datasource-agnostic way to connect and query datasets
github.com/obi1kenobi
discuss
4 years ago
tosh
3 points
42.
Covid-19 datasets by Our World in Data updated daily
github.com/owid
discuss
4 years ago
escot
3 points
43.
xarray: N-Dimensional labeled arrays and datasets in Python
github.com/pydata
discuss
5 years ago
teleforce
3 points
44.
Show HN: Flexible data exploration for mid-size datasets
github.com/stefanhoelzl
discuss
7 years ago
stefanhoelzl
3 points
45.
Show HN: Masquerade: A Postgres proxy that masks sensitive datasets in real time
github.com/TonicAI
discuss
7 years ago
akamor
3 points
46.
Dendrite: Querying large datasets on a single host at near-interactive speeds
github.com/jwhitbeck
discuss
7 years ago
tosh
3 points
47.
Maptable: Converts datasets to heat map, filters and table [JS]
github.com/Packet-Clearing-House
discuss
7 years ago
pjf
3 points
48.
Analyze large healthcare datasets and build ML models using TensorFlow
github.com/AKSHAYUBHAT
discuss
9 years ago
happy-go-lucky
3 points
49.
The World Factbook: datasets for the country profiles
github.com
1 comment
5 months ago
1659447091
2 points
50.
Yahoo Knowledge Graph Covid-19 Datasets, API, and Dashboard
github.com/yahoo
1 comment
6 years ago
riemannzeta
2 points
51.
Show HN: Fast and simple access to self-driving datasets
github.com/snarkai
1 comment
6 years ago
davidbuniat
2 points
52.
Fast Filter: fast and efficient filtering of large datasets
github.com/mobmewireless
discuss
11 years ago
_justin
2 points
53.
Bigvis - tools for exploratory data analysis of large datasets
github.com/hadley
discuss
13 years ago
shrikant
2 points
54.
Show HN: DataFlow, Turn raw data into high-quality LLM training datasets
github.com/OpenDCAI
discuss
3 months ago
Junnn
2 points
55.
Show HN: Build ML training datasets from large-scale satellite/aerial imagery
github.com/noahgolmant
discuss
6 months ago
noahgolmant
2 points
56.
A curated list of global electrical grid maps, datasets and resources
github.com/open-energy-transition
discuss
7 months ago
protontypes
2 points
57.
DataChain: Prepare and curate datasets for AI/ML
github.com/iterative
discuss
2 years ago
shcheklein
2 points
58.
Reladiff: High-performance diffing of large datasets across databases
github.com/erezsh
discuss
2 years ago
todsacerdoti
2 points
59.
OpenForest – A catalogue of open access forest datasets
github.com/RolnickLab
discuss
2 years ago
Brajeshwar
2 points
60.
RedPajama-Data: Code for preparing large datasets
github.com/togethercomputer
discuss
3 years ago
harrisonpowers
2 points