HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
601.
Clojurellm-data: Clojure LLM – Dataset curation for fine tuning an LLM
github.com/ruped
a year ago
simonpure
2 points
602.
WASD: Wireless Anomaly Signal Dataset
github.com/BK3536
a year ago
teleforce
2 points
603.
Links: A dataset of a hundred million planar linkage mechanisms
github.com/ahnobari
a year ago
nill0
2 points
604.
Microsoft releases TRELLIS dataset: 500k 3D assets for model generation training
github.com/microsoft
a year ago
summarity
2 points
605.
Argilla: Build high quality datasets for your AI models
github.com/argilla-io
2 years ago
shcheklein
2 points
606.
Fast and scalable dataset preparation and curation tool from Nvidia
github.com/NVIDIA
2 years ago
shcheklein
2 points
607.
Show HN: Search in HuggingFace Dataset from the URL
github.com/lightonai
2 years ago
raphaelty
2 points
608.
Roapi: Create APIs for slow moving datasets without writing code
github.com/roapi
2 years ago
sea-gold
2 points
609.
Reladiff: High-performance diffing of large datasets across databases
github.com/erezsh
2 years ago
todsacerdoti
2 points
610.
The largest dataset of LLM jailbreak prompts
github.com/verazuo
2 years ago
titaniumrain
2 points
611.
Microsoft/MS-MARCO-Web-Search: A large-scale information-rich web dataset
github.com/microsoft
2 years ago
alexmolas
2 points
612.
OpenForest – A catalogue of open access forest datasets
github.com/RolnickLab
2 years ago
Brajeshwar
2 points
613.
Dataset to extract stock tickers from NL
github.com/rohanmahen
2 years ago
rohanmahen
2 points
614.
Show HN: Lightly Insights – open-source dataset analysis
github.com/lightly-ai
3 years ago
isusmelj
2 points
615.
Fabricator – OSS framework to generate datasets with LLMs
github.com/flairNLP
3 years ago
aantti
2 points
616.
Framework to easily create LLM powered bots over any dataset
github.com/embedchain
3 years ago
ensocode
2 points
617.
Show HN: A Python toolkit for working with parquet datasets on AWS
github.com/marwan116
3 years ago
ortamina
2 points
618.
Just in Time Datastructures
github.com/UBOdin
3 years ago
danny00
2 points
619.
Processing large JSON datasets by streaming
github.com/kashifrazzaqui
3 years ago
kashif
2 points
620.
RedPajama-Data: Code for preparing large datasets
github.com/togethercomputer
3 years ago
harrisonpowers
2 points
621.
OpenFEMA Samples – Code, dataset, and analysis samples that utilize OpenFEMA API
github.com/FEMA
3 years ago
mindcrime
2 points
622.
Benchmark of simple operations against common KV datastores with Python clients
github.com/alisaifee
3 years ago
indydevs
2 points
623.
Open Source AI Image Classifier with Automatic Dataset Creator
github.com/serpapi
3 years ago
thefoolofdaath
2 points
624.
Show HN: DescribeML is a VSCode language plugin to describe ML datasets
github.com/SOM-Research
4 years ago
softmodeling
2 points
625.
Darmok and Jalad at Tanagra: Dataset and Model for English-Tamarian Translation
github.com/cognitiveailab
4 years ago
darwinwhy
2 points
626.
SimilarVerbBank: Dataset of similar verbs formed with the Apriori algorithm
github.com/nlptechbook
4 years ago
jxireal
2 points
627.
HuggingFace/evaluate: A library for easily evaluating ML models and datasets
github.com/huggingface
4 years ago
occamschainsaw
2 points
628.
Open-source motion datasets collected by Bandai Namco Research
github.com/BandaiNamcoResearchInc
4 years ago
nikolay
2 points
629.
Show HN: Bollywood Lyrics Dataset
github.com/hbdeshmukh
4 years ago
hdesh
2 points
630.
Ivis: Dimensionality Reduction In Large Datasets Using Siamese Networks
github.com/beringresearch
5 years ago
optimalsolver
2 points