S3DataSource (AWS SDK for Java

java.lang.Object
- software.amazon.awssdk.services.sagemaker.model.S3DataSource

All Implemented Interfaces: Serializable, SdkPojo, ToCopyableBuilder<S3DataSource.Builder,S3DataSource>

@Generated(value="software.amazon.awssdk:codegen")
public final class S3DataSource
extends Object
implements SdkPojo, Serializable, ToCopyableBuilder<S3DataSource.Builder,S3DataSource>

Describes the S3 data source.

See Also: Serialized Form

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static interface`	`S3DataSource.Builder`

Method Summary

Modifier and Type	Method and Description
`List<String>`	`attributeNames()` A list of one or more attribute names to use that are found in a specified augmented manifest file.
`static S3DataSource.Builder`	`builder()`
`boolean`	`equals(Object obj)`
`<T> Optional<T>`	`getValueForField(String fieldName, Class<T> clazz)`
`int`	`hashCode()`
`S3DataDistribution`	`s3DataDistributionType()` If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify `FullyReplicated`.
`String`	`s3DataDistributionTypeAsString()` If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify `FullyReplicated`.
`S3DataType`	`s3DataType()` If you choose `S3Prefix`, `S3Uri` identifies a key name prefix.
`String`	`s3DataTypeAsString()` If you choose `S3Prefix`, `S3Uri` identifies a key name prefix.
`String`	`s3Uri()` Depending on the value specified for the `S3DataType`, identifies either a key name prefix or a manifest.
`List<SdkField<?>>`	`sdkFields()`
`static Class<? extends S3DataSource.Builder>`	`serializableBuilderClass()`
`S3DataSource.Builder`	`toBuilder()` Take this object and create a builder that contains all of the current property values of this object.
`String`	`toString()`

Methods inherited from class java.lang.Object
getClass, notify, notifyAll, wait, wait, wait

Methods inherited from interface software.amazon.awssdk.utils.builder.ToCopyableBuilder
copy

- Method Detail
  - s3DataType
```
public S3DataType s3DataType()
```
    If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects that match the specified key name prefix for model training.
    
    If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for model training.
    
    If you choose AugmentedManifestFile, S3Uri identifies an object that is an augmented manifest file in JSON lines format. This file contains the data you want to use for model training. AugmentedManifestFile can only be used if the Channel's input mode is Pipe.
    
    If the service returns an enum value that is not available in the current SDK version, s3DataType will return S3DataType.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from s3DataTypeAsString().
    
    Returns:
    
    If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects that match the specified key name prefix for model training.
    
    If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for model training.
    
    If you choose AugmentedManifestFile, S3Uri identifies an object that is an augmented manifest file in JSON lines format. This file contains the data you want to use for model training. AugmentedManifestFile can only be used if the Channel's input mode is Pipe.
    
    See Also:
    
    S3DataType
  - s3DataTypeAsString
```
public String s3DataTypeAsString()
```
    If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects that match the specified key name prefix for model training.
    
    If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for model training.
    
    If you choose AugmentedManifestFile, S3Uri identifies an object that is an augmented manifest file in JSON lines format. This file contains the data you want to use for model training. AugmentedManifestFile can only be used if the Channel's input mode is Pipe.
    
    If the service returns an enum value that is not available in the current SDK version, s3DataType will return S3DataType.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from s3DataTypeAsString().
    
    Returns:
    
    If you choose S3Prefix, S3Uri identifies a key name prefix. Amazon SageMaker uses all objects that match the specified key name prefix for model training.
    
    If you choose ManifestFile, S3Uri identifies an object that is a manifest file containing a list of object keys that you want Amazon SageMaker to use for model training.
    
    If you choose AugmentedManifestFile, S3Uri identifies an object that is an augmented manifest file in JSON lines format. This file contains the data you want to use for model training. AugmentedManifestFile can only be used if the Channel's input mode is Pipe.
    
    See Also:
    
    S3DataType
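    
    The fallback to UNKNOWN_TO_SDK_VERSION described above can be illustrated without the SDK on the classpath. The following is a minimal sketch: DataTypeSketch and UnknownEnumDemo are hypothetical stand-ins written for this example, not the generated S3DataType enum, and the mapping logic is an assumption about how such a fallback might be written.
    
```java
import java.util.Arrays;

// Hypothetical stand-in for a generated enum; not the SDK's S3DataType.
enum DataTypeSketch {
    MANIFEST_FILE("ManifestFile"),
    S3_PREFIX("S3Prefix"),
    AUGMENTED_MANIFEST_FILE("AugmentedManifestFile"),
    UNKNOWN_TO_SDK_VERSION(null);

    private final String value;

    DataTypeSketch(String value) {
        this.value = value;
    }

    // Map a raw service string to a constant, falling back to the sentinel
    // when the string is not one this (sketch) version knows about.
    static DataTypeSketch fromValue(String raw) {
        if (raw == null) {
            return null;
        }
        return Arrays.stream(values())
                .filter(c -> raw.equals(c.value))
                .findFirst()
                .orElse(UNKNOWN_TO_SDK_VERSION);
    }
}

final class UnknownEnumDemo {
    // Branch on the typed value, but keep the raw string for diagnostics.
    static String describe(String raw) {
        DataTypeSketch type = DataTypeSketch.fromValue(raw);
        if (type == DataTypeSketch.UNKNOWN_TO_SDK_VERSION) {
            return "unrecognized data type: " + raw;
        }
        return "data type: " + type;
    }

    public static void main(String[] args) {
        System.out.println(describe("S3Prefix"));
        System.out.println(describe("SomeFutureType"));
    }
}
```
    
    With the real class, the same idea would mean checking s3DataType() for the sentinel and reading s3DataTypeAsString() when the raw value is needed.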
  - s3Uri
```
public String s3Uri()
```
    Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. For example:
    - A key name prefix might look like this: s3://bucketname/exampleprefix.
    - A manifest might look like this: s3://bucketname/example.manifest
      
      The manifest is an S3 object which is a JSON file with the following format:
      
      [
      
      {"prefix": "s3://customer_bucket/some/prefix/"},
      
      "relative/path/to/custdata-1",
      
      "relative/path/custdata-2",
      
      ...
      
      ]
      
      The preceding JSON matches the following S3 URIs:
      
      s3://customer_bucket/some/prefix/relative/path/to/custdata-1
      
      s3://customer_bucket/some/prefix/relative/path/custdata-2
      
      ...
      
      The complete set of S3 URIs in this manifest is the input data for the channel for this data source. The object that each S3 URI points to must be readable by the IAM role that Amazon SageMaker uses to perform tasks on your behalf.
    Returns:
    Depending on the value specified for the S3DataType, identifies either a key name prefix or a manifest. For example:
    
    A key name prefix might look like this: s3://bucketname/exampleprefix.
    
    A manifest might look like this: s3://bucketname/example.manifest
    
    The manifest is an S3 object which is a JSON file with the following format:
    
    [
    
    {"prefix": "s3://customer_bucket/some/prefix/"},
    
    "relative/path/to/custdata-1",
    
    "relative/path/custdata-2",
    
    ...
    
    ]
    
    The preceding JSON matches the following S3 URIs:
    
    s3://customer_bucket/some/prefix/relative/path/to/custdata-1
    
    s3://customer_bucket/some/prefix/relative/path/custdata-2
    
    ...
    
    The complete set of S3 URIs in this manifest is the input data for the channel for this data source. The object that each S3 URI points to must be readable by the IAM role that Amazon SageMaker uses to perform tasks on your behalf.
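    
    The way a manifest's prefix entry combines with its relative keys can be sketched in plain Java. This is an illustration only: ManifestSketch and expand are hypothetical names, the manifest is assumed to be already parsed into a prefix and a list of keys, and simple concatenation is assumed because that is what the example above shows.
    
```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the SDK.
final class ManifestSketch {
    // Join the manifest's prefix entry with each relative key, in order.
    static List<String> expand(String prefix, List<String> relativeKeys) {
        List<String> uris = new ArrayList<>();
        for (String key : relativeKeys) {
            uris.add(prefix + key);
        }
        return uris;
    }

    public static void main(String[] args) {
        List<String> uris = expand(
                "s3://customer_bucket/some/prefix/",
                List.of("relative/path/to/custdata-1", "relative/path/custdata-2"));
        // Prints one full S3 URI per manifest entry.
        uris.forEach(System.out::println);
    }
}
```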
  - s3DataDistributionType
```
public S3DataDistribution s3DataDistributionType()
```
    If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated.
    
    If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model training, specify ShardedByS3Key. If there are n ML compute instances launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on each machine uses only the subset of training data.
    
    Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
    
    In distributed training, where you use multiple ML compute EC2 instances, you might choose ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when TrainingInputMode is set to File), this copies 1/n of the number of objects.
    
    If the service returns an enum value that is not available in the current SDK version, s3DataDistributionType will return S3DataDistribution.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from s3DataDistributionTypeAsString().
    
    Returns:
    
    If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated.
    
    If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model training, specify ShardedByS3Key. If there are n ML compute instances launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on each machine uses only the subset of training data.
    
    Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
    
    In distributed training, where you use multiple ML compute EC2 instances, you might choose ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when TrainingInputMode is set to File), this copies 1/n of the number of objects.
    
    See Also:
    
    S3DataDistribution
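    
    The 1/n arithmetic behind ShardedByS3Key, and the warning about having more instances than objects, can be made concrete with a small calculation. This is a sketch under an assumption: it deals objects out as evenly as possible, whereas the actual assignment SageMaker performs is not specified here. ShardSketch and its methods are hypothetical names.
    
```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the SDK.
final class ShardSketch {
    // Number of objects each instance would receive under an even split.
    static int[] objectsPerInstance(int objectCount, int instanceCount) {
        if (objectCount < 0 || instanceCount <= 0) {
            throw new IllegalArgumentException("counts must be non-negative and instances positive");
        }
        int[] counts = new int[instanceCount];
        int base = objectCount / instanceCount;
        int remainder = objectCount % instanceCount;
        for (int i = 0; i < instanceCount; i++) {
            // The first `remainder` instances take one extra object.
            counts[i] = base + (i < remainder ? 1 : 0);
        }
        return counts;
    }

    // Instances that would receive no data at all (but would still be billed).
    static int idleInstances(int objectCount, int instanceCount) {
        int idle = 0;
        for (int count : objectsPerInstance(objectCount, instanceCount)) {
            if (count == 0) {
                idle++;
            }
        }
        return idle;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(objectsPerInstance(10, 4))); // [3, 3, 2, 2]
        System.out.println(idleInstances(3, 5)); // 2
    }
}
```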
  - s3DataDistributionTypeAsString
```
public String s3DataDistributionTypeAsString()
```
    If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated.
    
    If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model training, specify ShardedByS3Key. If there are n ML compute instances launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on each machine uses only the subset of training data.
    
    Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
    
    In distributed training, where you use multiple ML compute EC2 instances, you might choose ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when TrainingInputMode is set to File), this copies 1/n of the number of objects.
    
    If the service returns an enum value that is not available in the current SDK version, s3DataDistributionType will return S3DataDistribution.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from s3DataDistributionTypeAsString().
    
    Returns:
    
    If you want Amazon SageMaker to replicate the entire dataset on each ML compute instance that is launched for model training, specify FullyReplicated.
    
    If you want Amazon SageMaker to replicate a subset of data on each ML compute instance that is launched for model training, specify ShardedByS3Key. If there are n ML compute instances launched for a training job, each instance gets approximately 1/n of the number of S3 objects. In this case, model training on each machine uses only the subset of training data.
    
    Don't choose more ML compute instances for training than available S3 objects. If you do, some nodes won't get any data and you will pay for nodes that aren't getting any training data. This applies in both File and Pipe modes. Keep this in mind when developing algorithms.
    
    In distributed training, where you use multiple ML compute EC2 instances, you might choose ShardedByS3Key. If the algorithm requires copying training data to the ML storage volume (when TrainingInputMode is set to File), this copies 1/n of the number of objects.
    
    See Also:
    
    S3DataDistribution
  - attributeNames
```
public List<String> attributeNames()
```
    A list of one or more attribute names to use that are found in a specified augmented manifest file.
    
    Attempts to modify the collection returned by this method will result in an UnsupportedOperationException.
    
    Returns:
    
    A list of one or more attribute names to use that are found in a specified augmented manifest file.
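    
    The read-only behavior noted above is the same one the JDK's unmodifiable list views exhibit. A minimal sketch, using Collections.unmodifiableList as a stand-in for the list that attributeNames() returns (how the SDK wraps its list internally is an assumption); ReadOnlyListSketch and the attribute names are hypothetical:
    
```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of the SDK.
final class ReadOnlyListSketch {
    // Returns true if the list refuses an attempt to add an element.
    static boolean rejectsModification(List<String> names) {
        try {
            names.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Copy into a fresh ArrayList when a modifiable list is needed.
    static List<String> mutableCopy(List<String> names) {
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        List<String> readOnly =
                Collections.unmodifiableList(new ArrayList<>(List.of("source-ref", "label")));
        System.out.println(rejectsModification(readOnly));
        System.out.println(rejectsModification(mutableCopy(readOnly)));
    }
}
```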
  - toBuilder
```
public S3DataSource.Builder toBuilder()
```
    Description copied from interface: ToCopyableBuilder
    
    Take this object and create a builder that contains all of the current property values of this object.
    
    Specified by:
    
    toBuilder in interface ToCopyableBuilder<S3DataSource.Builder,S3DataSource>
    
    Returns:
    
    a builder for type T
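    
    The copy-and-modify pattern that toBuilder() enables can be sketched with a small immutable class. SourceSketch below is a hypothetical stand-in with two fields, written only to show the shape of the pattern; it is not the generated S3DataSource class, and its field names are assumptions.
    
```java
// Hypothetical immutable value with a builder; not the SDK's S3DataSource.
final class SourceSketch {
    private final String uri;
    private final String distribution;

    private SourceSketch(Builder b) {
        this.uri = b.uri;
        this.distribution = b.distribution;
    }

    String uri() {
        return uri;
    }

    String distribution() {
        return distribution;
    }

    static Builder builder() {
        return new Builder();
    }

    // A builder pre-populated with this object's current values.
    Builder toBuilder() {
        return new Builder().uri(uri).distribution(distribution);
    }

    static final class Builder {
        private String uri;
        private String distribution;

        Builder uri(String uri) {
            this.uri = uri;
            return this;
        }

        Builder distribution(String distribution) {
            this.distribution = distribution;
            return this;
        }

        SourceSketch build() {
            return new SourceSketch(this);
        }
    }

    public static void main(String[] args) {
        SourceSketch original = builder()
                .uri("s3://bucketname/exampleprefix")
                .distribution("FullyReplicated")
                .build();
        // Change one property; the original stays untouched.
        SourceSketch sharded = original.toBuilder().distribution("ShardedByS3Key").build();
        System.out.println(original.distribution() + " -> " + sharded.distribution());
    }
}
```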
  - builder
```
public static S3DataSource.Builder builder()
```
  - serializableBuilderClass
```
public static Class<? extends S3DataSource.Builder> serializableBuilderClass()
```
  - hashCode
```
public int hashCode()
```
    Overrides:
    
    hashCode in class Object
  - equals
```
public boolean equals(Object obj)
```
    Overrides:
    
    equals in class Object
  - toString
```
public String toString()
```
    Overrides:
    
    toString in class Object
  - getValueForField
```
public <T> Optional<T> getValueForField(String fieldName,
                                        Class<T> clazz)
```
  - sdkFields
```
public List<SdkField<?>> sdkFields()
```
    Specified by:
    
    sdkFields in interface SdkPojo
    
    Returns:
    
    List of SdkField in this POJO. May be empty list but should never be null.
