ProcessingS3Input
Configuration for downloading input data from Amazon S3 into the processing container.
Types
Properties
Whether to GZIP-decompress the data in Amazon S3 as it is streamed into the processing container. Gzip can only be used when Pipe mode is specified as the S3InputMode. In Pipe mode, Amazon SageMaker streams input data from the source directly to your container without using the EBS volume.
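The Gzip/Pipe constraint above can be checked before a job is submitted. The following is a minimal sketch, not part of any SDK: `check_compression` is a hypothetical helper, the key name `S3CompressionType` and the fallback defaults (`"None"` and `"File"`) are assumptions, and the input is assumed to be a plain dictionary.

```python
def check_compression(s3_input: dict) -> None:
    """Raise ValueError if Gzip is requested without Pipe mode.

    `s3_input` is assumed to be a plain dict. The key name
    "S3CompressionType" and both fallback defaults are assumptions
    made for this sketch.
    """
    compression = s3_input.get("S3CompressionType", "None")
    mode = s3_input.get("S3InputMode", "File")
    if compression == "Gzip" and mode != "Pipe":
        raise ValueError(
            "S3CompressionType 'Gzip' requires S3InputMode 'Pipe', got %r" % mode
        )


# Valid: Gzip together with Pipe mode passes silently.
check_compression({"S3CompressionType": "Gzip", "S3InputMode": "Pipe"})
```

Running the check locally surfaces the misconfiguration before any request is sent.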
Whether to distribute the data from Amazon S3 to all processing instances with FullyReplicated, or whether the data from Amazon S3 is sharded by Amazon S3 key (ShardedByS3Key), downloading one shard of data to each processing instance.
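To make the difference between the two distribution options concrete, here is a purely illustrative sketch. The round-robin assignment is an assumption chosen for clarity; the text above does not say how Amazon SageMaker actually assigns keys to shards. Both function names are hypothetical.

```python
def fully_replicated(keys, instance_count):
    """Every instance receives the full set of keys."""
    return [sorted(keys) for _ in range(instance_count)]


def sharded_by_key(keys, instance_count):
    """Each instance receives a disjoint subset of keys.

    Round-robin over sorted keys is an assumption for illustration only;
    the real assignment used by the service may differ.
    """
    shards = [[] for _ in range(instance_count)]
    for i, key in enumerate(sorted(keys)):
        shards[i % instance_count].append(key)
    return shards


keys = ["data/part-0", "data/part-1", "data/part-2", "data/part-3"]
print(fully_replicated(keys, 2))  # each instance sees all four keys
print(sharded_by_key(keys, 2))    # each instance sees two of the four keys
```

With sharding, each instance downloads only its share of the data, so the per-instance download shrinks roughly in proportion to the instance count.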
Whether you use an S3Prefix or a ManifestFile for the data type. If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects with the specified key name prefix for the processing job. If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for the processing job.
Whether to use File or Pipe input mode. In File mode, Amazon SageMaker copies the data from the input source onto the local ML storage volume before starting your processing container. This is the most commonly used input mode. In Pipe mode, Amazon SageMaker streams input data from the source directly to your processing container into named pipes without using the ML storage volume.
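The properties described above can be assembled into a single configuration dictionary. The sketch below assumes the key names used by the SageMaker API (`S3Uri`, `LocalPath`, `S3DataType`, `S3InputMode`, `S3DataDistributionType`, `S3CompressionType`); `LocalPath` and the two `...Type` names for distribution and compression do not appear in the text above and are taken on that assumption. The helper `make_s3_input`, the bucket name, and the container path are hypothetical.

```python
ALLOWED = {
    "S3DataType": {"S3Prefix", "ManifestFile"},
    "S3InputMode": {"File", "Pipe"},
    "S3DataDistributionType": {"FullyReplicated", "ShardedByS3Key"},
    "S3CompressionType": {"None", "Gzip"},
}


def make_s3_input(s3_uri, local_path, data_type="S3Prefix", input_mode="File",
                  distribution="FullyReplicated", compression="None"):
    """Build a ProcessingS3Input-style dict (hypothetical helper)."""
    config = {
        "S3Uri": s3_uri,
        "LocalPath": local_path,  # assumed key: destination inside the container
        "S3DataType": data_type,
        "S3InputMode": input_mode,
        "S3DataDistributionType": distribution,
        "S3CompressionType": compression,
    }
    # Reject values outside the documented choices.
    for key, allowed in ALLOWED.items():
        if config[key] not in allowed:
            raise ValueError("%s must be one of %s" % (key, sorted(allowed)))
    # Gzip is only usable together with Pipe mode.
    if compression == "Gzip" and input_mode != "Pipe":
        raise ValueError("Gzip compression requires Pipe input mode")
    return config


# Hypothetical bucket and path, for illustration only.
s3_input = make_s3_input(
    "s3://example-bucket/processing/input/",
    "/opt/ml/processing/input",
    input_mode="Pipe",
    compression="Gzip",
)
print(s3_input["S3InputMode"])  # Pipe
```

File mode is the default here because the description above calls it the most commonly used input mode; Pipe mode is chosen explicitly when streaming or Gzip decompression is needed.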