HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
121.
100K Fake US People Profiles Dataset
github.com/marko-simic
discuss
4 years ago
qa-guy
1 point
122.
An analysis of 7M NFT transactions on the Ethereum blockchain [pdf]
github.com/bugout-dev
discuss
5 years ago
mpaepper
1 point
123.
Launch HN: Activeloop (YC S18) – Data lake for deep learning
24 comments
4 years ago
davidbuniat
64 points
124.
Ask HN: How are you extracting the best performance out of your RAG pipeline?
4 comments
2 years ago
imaravind
5 points
125.
Show HN: I built an open-source financial research terminal (SEC data and SQL)
terminal.tesseractanalytics.ai
discuss
8 days ago
tessbi
5 points
126.
Lip2Wav: Synthesize Speech Only from the Lip Movements
discuss
6 years ago
prajwalkr
4 points
127.
Show HN: SJT- A lightweight structured JSON table format for APIs
1 comment
10 months ago
yukiakai
3 points
128.
InfoSeek: The First Open-Source Framework for Deep Research Data Synthesis
1 comment
9 months ago
BAAIBeijing
2 points
129.
Show HN: RandomForestGenerator – CSV to ML in the browser, but local
jonaraphael.github.io
discuss
5 months ago
jonaraphael
2 points
130.
Measuring Compositional Generalization in ML Architectures
discuss
6 years ago
esdee
1 point
131.
Free/Open Source Datasets
github.com/rasbt
discuss
11 years ago
rouma7
2 points
132.
Satellite Image Time Series Datasets
github.com/corentin-dfg
discuss
3 years ago
sebg
2 points
133.
Show HN: Simple Python script to split (DL)training data (CNNs mainly)
github.com/chinmayshah99
discuss
7 years ago
chinmays
2 points
134.
Chinese Language Corpora for Sentiment Analysis
github.com/Lab41
discuss
8 years ago
ghosthamlet
1 point
135.
Show HN: Open Prompts – dataset of 10M Stable Diffusion generations
github.com/krea-ai
71 comments
4 years ago
vipermu
279 points
136.
Tell HN: Full Hacker News dataset now available on BigQuery
43 comments
11 years ago
minimaxir
238 points
137.
Dat – Distributed Dataset Synchronization and Versioning
github.com/datproject
39 comments
9 years ago
ColinWright
229 points
138.
A multimodal dataset with one trillion tokens
github.com/mlfoundations
52 comments
2 years ago
kulikalov
224 points
139.
An MNIST-like fashion product dataset
github.com/zalandoresearch
21 comments
9 years ago
kashifr
220 points
140.
Qri: A global dataset version control system built on the distributed web
github.com/qri-io
42 comments
7 years ago
anewhnaccount2
204 points
141.
Visualizations for machine learning datasets
github.com/PAIR-code
7 comments
9 years ago
happy-go-lucky
178 points
142.
Finetuning of Falcon-7B LLM Using QLoRA on Mental Health Conversational Dataset
github.com/iamarunbrahma
108 comments
3 years ago
iamarunbrahma
160 points
143.
Hypersim, Photorealistic Synthetic Dataset for Indoor Scene Understanding
github.com/apple
20 comments
6 years ago
homarp
122 points
144.
Show HN: Dlt – Python library to automate the creation of datasets
colab.research.google.com
54 comments
3 years ago
MatthausK
114 points
145.
Driving dataset for car autopilot AI training
github.com/commaai
44 comments
10 years ago
EvgeniyZh
100 points
146.
Boston housing price dataset was removed from scikit-learn 1.2
github.com/scikit-learn
84 comments
3 years ago
ok123456
81 points
147.
RipTable – multi-threaded Python data analytics tools for numpy arrays/datasets
github.com/rtosholdings
14 comments
6 years ago
aldanor
79 points
148.
Show HN: Hyperparam: OSS tools for exploring datasets locally in the browser
hyperparam.app
21 comments
a year ago
platypii
77 points
149.
Comma2k19 – A dataset of over 33 hours of commute in California's 280 highway
github.com/commaai
35 comments
8 years ago
pd0wm
70 points
150.
How to query data.gov json datasets with SQL: a case study
github.com/axibase
1 comment
10 years ago
rodionos
68 points