Simple guide to training Llama 2 with AWS Trainium on Amazon SageMaker
AWS Machine Learning - AI
MAY 1, 2024
Automatic checkpointing – Checkpoints written to a local path (/opt/ml/checkpoints by default) are automatically copied to a user-specified Amazon Simple Storage Service (Amazon S3) location. After the SageMaker estimator completes the training job, the new checkpoint, including the model weights, is available in that S3 checkpoint directory.
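As a rough illustration of the checkpointing flow described above, the following sketch shows the two sides involved: the keyword arguments that point a SageMaker estimator at the S3 and local checkpoint locations, and a training-script helper that writes a file into the local directory. The helper names (`checkpoint_kwargs`, `save_checkpoint`), the file naming scheme, and the bucket name in the usage note are assumptions made for illustration, not part of the original guide; `checkpoint_s3_uri` and `checkpoint_local_path` are the SageMaker Python SDK estimator parameters for these two locations.

```python
import os
import tempfile

# Default local checkpoint directory named in the text above.
DEFAULT_CHECKPOINT_DIR = "/opt/ml/checkpoints"


def checkpoint_kwargs(s3_uri, local_path=DEFAULT_CHECKPOINT_DIR):
    """Build checkpoint-related keyword arguments for a SageMaker estimator.

    Hypothetical helper: it only assembles a dict and does not call AWS.
    """
    if not s3_uri.startswith("s3://"):
        raise ValueError("checkpoint S3 location must start with s3://")
    return {"checkpoint_s3_uri": s3_uri, "checkpoint_local_path": local_path}


def save_checkpoint(payload, step, checkpoint_dir=DEFAULT_CHECKPOINT_DIR):
    """Write one checkpoint file into the local checkpoint directory.

    Hypothetical helper for a training script. The payload is written to a
    temporary file and then renamed, so a half-written checkpoint never
    appears under its final name.
    """
    os.makedirs(checkpoint_dir, exist_ok=True)
    final_path = os.path.join(checkpoint_dir, f"checkpoint-{step:06d}.bin")
    fd, tmp_path = tempfile.mkstemp(dir=checkpoint_dir, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        f.write(payload)
    os.replace(tmp_path, final_path)
    return final_path
```

In a launcher script, the dict could be unpacked into the estimator constructor, for example `Estimator(..., **checkpoint_kwargs("s3://my-bucket/llama2/checkpoints"))`, where the bucket and prefix are placeholders. Inside the training container, `save_checkpoint(serialized_weights, step)` would then write into the directory that SageMaker copies to S3.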