RedPajama-Data: Code for preparing large datasets (github.com/togethercomputer)
2 points | harrisonpowers | 3 years ago