HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
181.
▲
DataChain: Prepare and curate datasets for AI/ML
github.com/iterative
discuss
2 years ago
shcheklein
2 points
182.
▲
Roapi: Create APIs for slow moving datasets without writing code
github.com/roapi
discuss
2 years ago
sea-gold
2 points
183.
▲
Reladiff: High-performance diffing of large datasets across databases
github.com/erezsh
discuss
2 years ago
todsacerdoti
2 points
184.
▲
OpenForest – A catalogue of open access forest datasets
github.com/RolnickLab
discuss
2 years ago
Brajeshwar
2 points
185.
▲
Fabricator – OSS framework to generate datasets with LLMs
github.com/flairNLP
discuss
3 years ago
aantti
2 points
186.
▲
Show HN: A Python toolkit for working with parquet datasets on AWS
github.com/marwan116
discuss
3 years ago
ortamina
2 points
187.
▲
Processing large JSON datasets by streaming
github.com/kashifrazzaqui
discuss
3 years ago
kashif
2 points
188.
▲
RedPajama-Data: Code for preparing large datasets
github.com/togethercomputer
discuss
3 years ago
harrisonpowers
2 points
189.
▲
Show HN: DescribeML is a VSCode language plugin to describe ML datasets
github.com/SOM-Research
discuss
4 years ago
softmodeling
2 points
190.
▲
HuggingFace/evaluate: A library for easily evaluating ML models and datasets
github.com/huggingface
discuss
4 years ago
occamschainsaw
2 points
191.
▲
Open-source motion datasets collected by Bandai Namco Research
github.com/BandaiNamcoResearchInc
discuss
4 years ago
nikolay
2 points
192.
▲
Ivis: Dimensionality Reduction In Large Datasets Using Siamese Networks
github.com/beringresearch
discuss
5 years ago
optimalsolver
2 points
193.
▲
Gretel-synthetics: open-source library to create synthetic datasets
github.com/gretelai
discuss
5 years ago
meowterspace42
2 points
194.
▲
Witch-Trials: Datasets and Code for “Witch Trials” (Leeson and Russ 2018)
github.com/JakeRuss
discuss
6 years ago
DyslexicAtheist
2 points
195.
▲
Sweetviz: Visualize and compare datasets, target values and associations
github.com/fbdesignpro
discuss
6 years ago
polm23
2 points
196.
▲
Datasets and Evaluation Metrics for NLP (True Open Source GPT Alternative)
github.com/huggingface
discuss
6 years ago
dragonsh
2 points
197.
▲
Datasets and evaluation metrics for natural language processing (NLP)
github.com/huggingface
discuss
6 years ago
dragonsh
2 points
198.
▲
Datasets and Evaluation Metrics for Natural Language Processing (NLP)
github.com/huggingface
discuss
6 years ago
dragonsh
2 points
199.
▲
Show HN: A CLI tool for maintaining datasets in a centralized repository
github.com/ezhou7
discuss
7 years ago
nightrunner11
2 points
200.
▲
Library to scrape and clean web pages to create datasets
github.com/chiphuyen
discuss
7 years ago
khartig
2 points
201.
▲
Lazynlp: Library to scrape and clean web pages to create datasets
github.com/chiphuyen
discuss
7 years ago
Osiris30
2 points
202.
▲
Lazynlp: A library to scrape, clean, de-duplicate webpages to create datasets
github.com/chiphuyen
discuss
7 years ago
korym
2 points
203.
▲
Show HN: Python Script to Generate Fake Datasets for Testing ML/DL Workflows
github.com/minimaxir
discuss
7 years ago
minimaxir
2 points
204.
▲
Open source tool for merging datasets
github.com/funkeinteraktiv
discuss
8 years ago
chrtze
2 points
205.
▲
Tracking progress in NLP tasks and datasets
github.com/sebastianruder
discuss
8 years ago
neuhaus
2 points
206.
▲
Chatito – Generate training datasets for slot filling chatbots in a breeze
github.com/rodrigopivi
discuss
9 years ago
prodrod
2 points
207.
▲
Working with datasets in Clojure: select, where, aggregate, join, order, crosstab, etc.
github.com/emiruz
discuss
9 years ago
usgroup
2 points
208.
▲
OpenRefine – assess the quality of datasets
github.com/OpenRefine
discuss
9 years ago
chirau
2 points
209.
▲
Datasets on fee-based open access publishing
github.com/OpenAPC
discuss
10 years ago
Erikun
2 points
210.
▲
Show HN: Download UCI datasets with python
gist.github.com
discuss
10 years ago
thewhitetulip
2 points