Show HN: Byte-Pair Encoding tokenizer for training LLMs on large datasetsgithub.com/jmaczan5 pointsyu3zhou42 years ago