MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU (arxiv.org)
326 points | chrsw | 3 months ago