When submitting Amazon SageMaker training jobs using one of the built-in algorithms, the common parameters that must be specified are:
The training channel identifying the location of training data in an Amazon S3 bucket. This parameter tells SageMaker where to find the input data for the algorithm and how to deliver it. For example, TrainingInputMode: File means that SageMaker downloads the data from S3 to the training instance's local storage before training starts (as opposed to Pipe mode, which streams it).
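As an illustration, the training channel is passed as one entry of the InputDataConfig list in a CreateTrainingJob request. The sketch below uses a hypothetical bucket and prefix and assumes CSV input:

```python
# A minimal sketch of a training channel for the InputDataConfig list of a
# CreateTrainingJob request; bucket, prefix, and content type are hypothetical.
training_channel = {
    "ChannelName": "train",
    "DataSource": {
        "S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://my-bucket/train/",            # hypothetical location of the training data
            "S3DataDistributionType": "FullyReplicated",
        }
    },
    "ContentType": "text/csv",   # format expected by the chosen built-in algorithm
    "InputMode": "File",         # download the objects to local storage before training
}
```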
The IAM role that Amazon SageMaker can assume to perform tasks on your behalf. This parameter grants SageMaker the permissions it needs to access the S3 buckets, Amazon ECR repositories, and other AWS resources used by the training job. For example, RoleArn: arn:aws:iam::123456789012:role/service-role/AmazonSageMaker-ExecutionRole-20200303T150948 means that SageMaker will assume the specified role to run the training job.
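In a SageMaker notebook or Studio environment, the execution role attached to the environment can be retrieved with the SageMaker Python SDK; outside of SageMaker you would pass an explicit role ARN instead (the commented ARN below simply reuses the example above):

```python
import sagemaker

# Retrieve the execution role attached to the current SageMaker environment.
role = sagemaker.get_execution_role()
# Or supply an explicit ARN, e.g.:
# role = "arn:aws:iam::123456789012:role/service-role/AmazonSageMaker-ExecutionRole-20200303T150948"
```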
The output path specifying where in an Amazon S3 bucket the trained model will be stored. This parameter tells SageMaker where to save the model artifacts, such as the model weights and parameters, after the training job completes. For example, OutputDataConfig: {S3OutputPath: s3://my-bucket/my-training-job} means that SageMaker will store the model artifacts under the specified S3 location.
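The output location is likewise a small structure in the request. A minimal sketch, with a hypothetical bucket and prefix:

```python
# Hypothetical output location; SageMaker writes the model artifacts
# (model.tar.gz) under this prefix when the job completes.
output_data_config = {"S3OutputPath": "s3://my-bucket/my-training-job"}
```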
The validation channel identifying the location of validation data in an Amazon S3 bucket is an optional parameter that provides a separate dataset for evaluating model performance during training. It is not required for all algorithms and can be omitted if validation data is not available or not needed.
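When a validation set is used, it is simply a second channel appended to the same InputDataConfig list. The sketch below assumes the hypothetical training_channel defined earlier and a hypothetical validation prefix:

```python
# Optional second channel for validation data; names and locations are hypothetical.
validation_channel = {
    "ChannelName": "validation",
    "DataSource": {
        "S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://my-bucket/validation/",
            "S3DataDistributionType": "FullyReplicated",
        }
    },
    "ContentType": "text/csv",
    "InputMode": "File",
}

input_data_config = [training_channel, validation_channel]
```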
The hyperparameters, passed as string key-value pairs as documented for the algorithm used, are another optional parameter that customizes the behavior and performance of the algorithm. They are specific to each algorithm and can be used to tune model accuracy, speed, complexity, and other aspects. For example, HyperParameters: {num_round: "10", objective: "binary:logistic"} means that the XGBoost algorithm will use 10 boosting rounds and the logistic loss function for binary classification.
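In a request, the hyperparameters are a string-to-string map. The sketch below mirrors the XGBoost example above:

```python
# Hyperparameters are passed as a string-to-string map; these values mirror the
# XGBoost example (10 boosting rounds, binary logistic objective).
hyperparameters = {
    "num_round": "10",
    "objective": "binary:logistic",
}
```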
The Amazon EC2 instance class specifying whether training will be run using CPU or GPU is not one of these common parameters. Instead, it is set in the ResourceConfig of the training job request, which describes the ML compute instances and storage that SageMaker provisions to run the training container. For example, ResourceConfig: {InstanceType: ml.m5.xlarge, InstanceCount: 1, VolumeSizeInGB: 10} means that SageMaker will use one ml.m5.xlarge instance with 10 GB of storage for the training job.
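Putting the pieces together, a minimal end-to-end sketch of a CreateTrainingJob request for the built-in XGBoost algorithm might look like the following. The job name, bucket, region, algorithm version, and runtime limit are hypothetical; role, input_data_config, output_data_config, and hyperparameters are the values sketched above:

```python
import boto3
import sagemaker

region = "us-east-1"                      # hypothetical region
# Look up the built-in XGBoost container image for this region (version is an assumption).
image_uri = sagemaker.image_uris.retrieve("xgboost", region, version="1.5-1")

sm = boto3.client("sagemaker", region_name=region)
sm.create_training_job(
    TrainingJobName="my-training-job",    # hypothetical job name
    AlgorithmSpecification={
        "TrainingImage": image_uri,       # built-in algorithm container
        "TrainingInputMode": "File",
    },
    RoleArn=role,                         # IAM role SageMaker assumes
    InputDataConfig=input_data_config,    # train (and optional validation) channels
    OutputDataConfig=output_data_config,  # where the model artifacts are written
    ResourceConfig={                      # CPU vs GPU is chosen here
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 10,
    },
    StoppingCondition={"MaxRuntimeInSeconds": 3600},
    HyperParameters=hyperparameters,
)
```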
References:
Train a Model with Amazon SageMaker
Use Amazon SageMaker Built-in Algorithms or Pre-trained Models
CreateTrainingJob - Amazon SageMaker Service