HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
91.
▲
Show HN: Automate Variable Selection for Research on Big Datasets (Open-Source)
github.com/MalikHarrisAhm
discuss
2 years ago
mha23
8 points
92.
▲
Our classifier outperforms CatBoost, XGBoost, LightGBM on 5 benchmark datasets
github.com/LinearBoost
5 comments
2 years ago
hamid9
6 points
93.
▲
DatasetGPT – an open-source command line tool for generating datasets with LLMs
github.com/radi-cho
1 comment
3 years ago
radicho123
6 points
94.
▲
Show HN: FiftyOne – Explore, Analyze and Curate Visual Datasets
github.com/voxel51
1 comment
6 years ago
benjaminpkane
6 points
95.
▲
Show HN: Xray: N-D labeled arrays and datasets in Python
github.com/xray
discuss
12 years ago
shoyer
6 points
96.
▲
Show HN: SemHash – Fast Semantic Text Deduplication for Cleaner Datasets
github.com/MinishLab
discuss
a year ago
stephantul
6 points
97.
▲
Show HN: Interactively explore unstructured datasets from your dataframe
github.com/Renumics
discuss
3 years ago
sps44
6 points
98.
▲
Kangas: Pandas for Multimedia Datasets
github.com/comet-ml
discuss
3 years ago
synergy20
6 points
99.
▲
The fastest command-line tools for querying large JSON datasets
github.com/dcmoura
discuss
4 years ago
zX41ZdbW
6 points
100.
▲
Resampling Unbalanced Datasets
github.com/fmfn
discuss
12 years ago
hrb1979
5 points
101.
▲
Curated list of language modeling researches for code, plus related datasets
github.com/codefuse-ai
discuss
a year ago
Bluestein
5 points
102.
▲
Show HN: Byte-Pair Encoding tokenizer for training LLMs on large datasets
github.com/jmaczan
discuss
2 years ago
yu3zhou4
5 points
103.
▲
DataDM – Search and analyze datasets with LLMs
github.com/approximatelabs
discuss
3 years ago
cle
5 points
104.
▲
Show HN: Create APIs for static datasets without writing a single line of code
github.com/roapi
discuss
5 years ago
houqp
5 points
105.
▲
Show HN: Transform Unstructured Data into Usable Datasets
github.com/wizenheimer
1 comment
2 years ago
wizenheimer
4 points
106.
▲
Show HN: pqry – A fast, lightweight CLI tool to diagnose Parquet datasets
github.com/symblic
discuss
5 months ago
setzeno
4 points
107.
▲
Show HN: Lance – Open lakehouse format for multimodal AI datasets
github.com/lance-format
discuss
5 months ago
criexe
4 points
108.
▲
A curated list of global electrical grid maps, datasets and resources
github.com/open-energy-transition
discuss
8 months ago
protontypes
4 points
109.
▲
The Well: A 15TB Collection of Physics Simulation Datasets
github.com/PolymathicAI
discuss
9 months ago
Anon84
4 points
110.
▲
Show HN: Mount remote repositories and datasets managed by Git LFS locally
github.com/git-lfs-fuse
discuss
a year ago
rueian
4 points
111.
▲
Awesome-Twitter-data: A list of Twitter datasets and related resources
github.com/shaypal5
discuss
8 years ago
shaypalachy
4 points
112.
▲
Pypixgrid: generate vector tiles for the exploration of spatio-temporal datasets
translate.googleusercontent.com
discuss
9 years ago
based2
4 points
113.
▲
Show HN: DataBrewer – A CLI-tool to search and discover datasets
github.com/rolando
discuss
9 years ago
darkrho
4 points
114.
▲
Show HN: Create simulated datasets in Python with Simulacrum
github.com/jbrambleDC
discuss
10 years ago
jbrambleDC
4 points
115.
▲
hfsearch: a fast cli tool to discover models and datasets on HuggingFace
github.com/HenokB
1 comment
7 months ago
henok_ademtew
3 points
116.
▲
Show HN: Torque – A declarative, typesafe DSL for LLM training datasets (MIT)
github.com/qforge-dev
1 comment
8 months ago
michalwarda
3 points
117.
▲
Hugging Face AI Sheets, open-source tool to vibe test models on your datasets
github.com/huggingface
1 comment
10 months ago
dvilasuero
3 points
118.
▲
Promptwright: Generate large synthetic datasets using a local LLM
github.com/StacklokLabs
1 comment
2 years ago
trickleup
3 points
119.
▲
Easily convert YouTube, Torrent and Enterprise videos into LLM datasets
github.com/qet-lab
1 comment
2 years ago
m_2018
3 points
120.
▲
UpliftML: An uplift modeling library that handles web scale datasets
github.com/bookingcom
1 comment
5 years ago
TaXxEr
3 points