Code for the Byte Pair Encoding algorithm, commonly used in LLM tokenization (github.com/karpathy)
81 points | magoghm | 2 years ago