Scaling Language Model Training to a Trillion Parameters Using Megatron (developer.nvidia.com)
2 points | doener | 5 years ago