Class S3Settings
- All Implemented Interfaces:
Serializable, SdkPojo, ToCopyableBuilder<S3Settings.Builder,S3Settings>
Settings for exporting data to Amazon S3.
- See Also:
-
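For orientation, here is a minimal sketch of constructing an S3Settings value with its builder; the bucket, folder, and role ARN are hypothetical placeholders, not values from this reference.

import software.amazon.awssdk.services.databasemigration.model.CompressionTypeValue;
import software.amazon.awssdk.services.databasemigration.model.S3Settings;

public class S3SettingsSketch {
    public static void main(String[] args) {
        // Placeholder identifiers; substitute your own bucket and IAM role ARN.
        S3Settings settings = S3Settings.builder()
                .serviceAccessRoleArn("arn:aws:iam::123456789012:role/my-dms-s3-role")
                .bucketName("my-target-bucket")
                .bucketFolder("my-folder")                  // objects land under my-folder/schema_name/table_name/
                .compressionType(CompressionTypeValue.GZIP) // compress .csv or .parquet output
                .build();
        System.out.println(settings);
    }
}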
Nested Class Summary
Nested Classes
static interface S3Settings.Builder
Method Summary
Modifier and Type | Method | Description
final Boolean addColumnName() - An optional parameter that, when set to true or y, you can use to add column name information to the .csv output file.
final Boolean addTrailingPaddingCharacter() - Use the S3 target endpoint setting AddTrailingPaddingCharacter to add padding on string data.
final String bucketFolder() - An optional parameter to set a folder name in the S3 bucket.
final String bucketName() - The name of the S3 bucket.
static S3Settings.Builder builder()
final CannedAclForObjectsValue cannedAclForObjects() - A value that enables DMS to specify a predefined (canned) access control list for objects created in an Amazon S3 bucket as .csv or .parquet files.
final String cannedAclForObjectsAsString() - A value that enables DMS to specify a predefined (canned) access control list for objects created in an Amazon S3 bucket as .csv or .parquet files.
final Boolean cdcInsertsAndUpdates() - A value that enables a change data capture (CDC) load to write INSERT and UPDATE operations to .csv or .parquet (columnar storage) output files.
final Boolean cdcInsertsOnly() - A value that enables a change data capture (CDC) load to write only INSERT operations to .csv or columnar storage (.parquet) output files.
final Integer cdcMaxBatchInterval() - Maximum length of the interval, defined in seconds, after which to output a file to Amazon S3.
final Integer cdcMinFileSize() - Minimum file size, defined in kilobytes, to reach for a file output to Amazon S3.
final String cdcPath() - Specifies the folder path of CDC files.
final CompressionTypeValue compressionType() - An optional parameter to use GZIP to compress the target files.
final String compressionTypeAsString() - An optional parameter to use GZIP to compress the target files.
final String csvDelimiter() - The delimiter used to separate columns in the .csv file for both source and target.
final String csvNoSupValue() - This setting only applies if your Amazon S3 output files during a change data capture (CDC) load are written in .csv format.
final String csvNullValue() - An optional parameter that specifies how DMS treats null values.
final String csvRowDelimiter() - The delimiter used to separate rows in the .csv file for both source and target.
final DataFormatValue dataFormat() - The format of the data that you want to use for output.
final String dataFormatAsString() - The format of the data that you want to use for output.
final Integer dataPageSize() - The size of one data page in bytes.
final DatePartitionDelimiterValue datePartitionDelimiter() - Specifies a date separating delimiter to use during folder partitioning.
final String datePartitionDelimiterAsString() - Specifies a date separating delimiter to use during folder partitioning.
final Boolean datePartitionEnabled() - When set to true, this parameter partitions S3 bucket folders based on transaction commit dates.
final DatePartitionSequenceValue datePartitionSequence() - Identifies the sequence of the date format to use during folder partitioning.
final String datePartitionSequenceAsString() - Identifies the sequence of the date format to use during folder partitioning.
final String datePartitionTimezone() - When creating an S3 target endpoint, set DatePartitionTimezone to convert the current UTC time into a specified time zone.
final Integer dictPageSizeLimit() - The maximum size of an encoded dictionary page of a column.
final Boolean enableStatistics() - A value that enables statistics for Parquet pages and row groups.
final EncodingTypeValue encodingType() - The type of encoding you are using.
final String encodingTypeAsString() - The type of encoding you are using.
final EncryptionModeValue encryptionMode() - The type of server-side encryption that you want to use for your data.
final String encryptionModeAsString() - The type of server-side encryption that you want to use for your data.
final boolean equals(Object obj)
final boolean equalsBySdkFields(Object obj) - Indicates whether some other object is "equal to" this one by SDK fields.
final String expectedBucketOwner() - To specify a bucket owner and prevent sniping, you can use the ExpectedBucketOwner endpoint setting.
final String externalTableDefinition() - Specifies how tables are defined in the S3 source files only.
final <T> Optional<T> getValueForField(String fieldName, Class<T> clazz)
final Boolean glueCatalogGeneration() - When true, allows Glue to catalog your S3 bucket.
final int hashCode()
final Integer ignoreHeaderRows() - When this value is set to 1, DMS ignores the first row header in a .csv file.
final Boolean includeOpForFullLoad() - A value that enables a full load to write INSERT operations to the comma-separated value (.csv) or .parquet output files only to indicate how the rows were added to the source database.
final Integer maxFileSize() - A value that specifies the maximum size (in KB) of any .csv file to be created while migrating to an S3 target during full load.
final Boolean parquetTimestampInMillisecond() - A value that specifies the precision of any TIMESTAMP column values that are written to an Amazon S3 object file in .parquet format.
final ParquetVersionValue parquetVersion() - The version of the Apache Parquet format that you want to use: parquet_1_0 (the default) or parquet_2_0.
final String parquetVersionAsString() - The version of the Apache Parquet format that you want to use: parquet_1_0 (the default) or parquet_2_0.
final Boolean preserveTransactions() - If set to true, DMS saves the transaction order for a change data capture (CDC) load on the Amazon S3 target specified by CdcPath.
final Boolean rfc4180() - For an S3 source, when this value is set to true or y, each leading double quotation mark has to be followed by an ending double quotation mark.
final Integer rowGroupLength() - The number of rows in a row group.
final List<SdkField<?>> sdkFields()
static Class<? extends S3Settings.Builder> serializableBuilderClass()
final String serverSideEncryptionKmsKeyId() - If you are using SSE_KMS for the EncryptionMode, provide the KMS key ID.
final String serviceAccessRoleArn() - The Amazon Resource Name (ARN) used by the service to access the IAM role.
final String timestampColumnName() - A value that when nonblank causes DMS to add a column with timestamp information to the endpoint data for an Amazon S3 target.
S3Settings.Builder toBuilder() - Take this object and create a builder that contains all of the current property values of this object.
final String toString() - Returns a string representation of this object.
final Boolean useCsvNoSupValue() - This setting applies if the S3 output files during a change data capture (CDC) load are written in .csv format.
final Boolean useTaskStartTimeForFullLoadTimestamp() - When set to true, this parameter uses the task start time as the timestamp column value instead of the time data is written to target.
Methods inherited from interface software.amazon.awssdk.utils.builder.ToCopyableBuilder
copy
-
Method Details
-
serviceAccessRoleArn
The Amazon Resource Name (ARN) used by the service to access the IAM role. The role must allow the iam:PassRole action. It is a required parameter that enables DMS to write and read objects from an S3 bucket.
- Returns:
- The Amazon Resource Name (ARN) used by the service to access the IAM role. The role must allow the iam:PassRole action. It is a required parameter that enables DMS to write and read objects from an S3 bucket.
-
externalTableDefinition
Specifies how tables are defined in the S3 source files only.
- Returns:
- Specifies how tables are defined in the S3 source files only.
-
csvRowDelimiter
The delimiter used to separate rows in the .csv file for both source and target. The default is a carriage return (\n).
- Returns:
- The delimiter used to separate rows in the .csv file for both source and target. The default is a carriage return (\n).
-
csvDelimiter
The delimiter used to separate columns in the .csv file for both source and target. The default is a comma.
- Returns:
- The delimiter used to separate columns in the .csv file for both source and target. The default is a comma.
-
bucketFolder
An optional parameter to set a folder name in the S3 bucket. If provided, tables are created in the path bucketFolder/schema_name/table_name/. If this parameter isn't specified, then the path used is schema_name/table_name/.
- Returns:
- An optional parameter to set a folder name in the S3 bucket. If provided, tables are created in the path bucketFolder/schema_name/table_name/. If this parameter isn't specified, then the path used is schema_name/table_name/.
-
bucketName
The name of the S3 bucket.
- Returns:
- The name of the S3 bucket.
-
compressionType
An optional parameter to use GZIP to compress the target files. Set to GZIP to compress the target files. Either set this parameter to NONE (the default) or don't use it to leave the files uncompressed. This parameter applies to both .csv and .parquet file formats.
If the service returns an enum value that is not available in the current SDK version, compressionType will return CompressionTypeValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from compressionTypeAsString().
- Returns:
- An optional parameter to use GZIP to compress the target files. Set to GZIP to compress the target files. Either set this parameter to NONE (the default) or don't use it to leave the files uncompressed. This parameter applies to both .csv and .parquet file formats.
- See Also:
-
compressionTypeAsString
An optional parameter to use GZIP to compress the target files. Set to GZIP to compress the target files. Either set this parameter to NONE (the default) or don't use it to leave the files uncompressed. This parameter applies to both .csv and .parquet file formats.
If the service returns an enum value that is not available in the current SDK version, compressionType will return CompressionTypeValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from compressionTypeAsString().
- Returns:
- An optional parameter to use GZIP to compress the target files. Set to GZIP to compress the target files. Either set this parameter to NONE (the default) or don't use it to leave the files uncompressed. This parameter applies to both .csv and .parquet file formats.
- See Also:
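The enum-plus-AsString pairing above recurs for every enum-typed getter in this class. A minimal sketch of the intended consumption pattern, assuming you already hold an S3Settings instance from a describe call:

import software.amazon.awssdk.services.databasemigration.model.CompressionTypeValue;
import software.amazon.awssdk.services.databasemigration.model.S3Settings;

class EnumFallbackSketch {
    // Prints the compression type, falling back to the raw string when the
    // service returns a value this SDK version does not model.
    static void printCompression(S3Settings settings) {
        if (settings.compressionType() == CompressionTypeValue.UNKNOWN_TO_SDK_VERSION) {
            System.out.println("Unmodeled compression type: " + settings.compressionTypeAsString());
        } else {
            System.out.println("Compression type: " + settings.compressionType());
        }
    }
}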
-
encryptionMode
The type of server-side encryption that you want to use for your data. This encryption type is part of the endpoint settings or the extra connections attributes for Amazon S3. You can choose either SSE_S3 (the default) or SSE_KMS.

For the ModifyEndpoint operation, you can change the existing value of the EncryptionMode parameter from SSE_KMS to SSE_S3. But you can't change the existing value from SSE_S3 to SSE_KMS.

To use SSE_S3, you need an Identity and Access Management (IAM) role with permission to allow "arn:aws:s3:::dms-*" to use the following actions:
- s3:CreateBucket
- s3:ListBucket
- s3:DeleteBucket
- s3:GetBucketLocation
- s3:GetObject
- s3:PutObject
- s3:DeleteObject
- s3:GetObjectVersion
- s3:GetBucketPolicy
- s3:PutBucketPolicy
- s3:DeleteBucketPolicy

If the service returns an enum value that is not available in the current SDK version, encryptionMode will return EncryptionModeValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from encryptionModeAsString().
- Returns:
- The type of server-side encryption that you want to use for your data. This encryption type is part of the endpoint settings or the extra connections attributes for Amazon S3. You can choose either SSE_S3 (the default) or SSE_KMS.
For the ModifyEndpoint operation, you can change the existing value of the EncryptionMode parameter from SSE_KMS to SSE_S3. But you can't change the existing value from SSE_S3 to SSE_KMS.
To use SSE_S3, you need an Identity and Access Management (IAM) role with permission to allow "arn:aws:s3:::dms-*" to use the following actions:
- s3:CreateBucket
- s3:ListBucket
- s3:DeleteBucket
- s3:GetBucketLocation
- s3:GetObject
- s3:PutObject
- s3:DeleteObject
- s3:GetObjectVersion
- s3:GetBucketPolicy
- s3:PutBucketPolicy
- s3:DeleteBucketPolicy
- See Also:
-
-
encryptionModeAsString
The type of server-side encryption that you want to use for your data. This encryption type is part of the endpoint settings or the extra connections attributes for Amazon S3. You can choose either SSE_S3 (the default) or SSE_KMS.

For the ModifyEndpoint operation, you can change the existing value of the EncryptionMode parameter from SSE_KMS to SSE_S3. But you can't change the existing value from SSE_S3 to SSE_KMS.

To use SSE_S3, you need an Identity and Access Management (IAM) role with permission to allow "arn:aws:s3:::dms-*" to use the following actions:
- s3:CreateBucket
- s3:ListBucket
- s3:DeleteBucket
- s3:GetBucketLocation
- s3:GetObject
- s3:PutObject
- s3:DeleteObject
- s3:GetObjectVersion
- s3:GetBucketPolicy
- s3:PutBucketPolicy
- s3:DeleteBucketPolicy

If the service returns an enum value that is not available in the current SDK version, encryptionMode will return EncryptionModeValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from encryptionModeAsString().
- Returns:
- The type of server-side encryption that you want to use for your data. This encryption type is part of the endpoint settings or the extra connections attributes for Amazon S3. You can choose either SSE_S3 (the default) or SSE_KMS.
For the ModifyEndpoint operation, you can change the existing value of the EncryptionMode parameter from SSE_KMS to SSE_S3. But you can't change the existing value from SSE_S3 to SSE_KMS.
To use SSE_S3, you need an Identity and Access Management (IAM) role with permission to allow "arn:aws:s3:::dms-*" to use the following actions:
- s3:CreateBucket
- s3:ListBucket
- s3:DeleteBucket
- s3:GetBucketLocation
- s3:GetObject
- s3:PutObject
- s3:DeleteObject
- s3:GetObjectVersion
- s3:GetBucketPolicy
- s3:PutBucketPolicy
- s3:DeleteBucketPolicy
- See Also:
-
-
serverSideEncryptionKmsKeyId
If you are using SSE_KMS for the EncryptionMode, provide the KMS key ID. The key that you use needs an attached policy that enables Identity and Access Management (IAM) user permissions and allows use of the key.

Here is a CLI example:
aws dms create-endpoint --endpoint-identifier value --endpoint-type target --engine-name s3 --s3-settings ServiceAccessRoleArn=value,BucketFolder=value,BucketName=value,EncryptionMode=SSE_KMS,ServerSideEncryptionKmsKeyId=value
- Returns:
- If you are using SSE_KMS for the EncryptionMode, provide the KMS key ID. The key that you use needs an attached policy that enables Identity and Access Management (IAM) user permissions and allows use of the key.
Here is a CLI example:
aws dms create-endpoint --endpoint-identifier value --endpoint-type target --engine-name s3 --s3-settings ServiceAccessRoleArn=value,BucketFolder=value,BucketName=value,EncryptionMode=SSE_KMS,ServerSideEncryptionKmsKeyId=value
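A rough Java equivalent of the CLI example above, assuming the DMS client from the same SDK (software.amazon.awssdk.services.databasemigration); every identifier, ARN, and key ID below is a placeholder.

import software.amazon.awssdk.services.databasemigration.DatabaseMigrationClient;
import software.amazon.awssdk.services.databasemigration.model.CreateEndpointRequest;
import software.amazon.awssdk.services.databasemigration.model.EncryptionModeValue;
import software.amazon.awssdk.services.databasemigration.model.ReplicationEndpointTypeValue;
import software.amazon.awssdk.services.databasemigration.model.S3Settings;

class CreateS3TargetEndpointSketch {
    public static void main(String[] args) {
        try (DatabaseMigrationClient dms = DatabaseMigrationClient.create()) {
            S3Settings s3 = S3Settings.builder()
                    .serviceAccessRoleArn("arn:aws:iam::123456789012:role/my-dms-s3-role") // placeholder
                    .bucketFolder("my-folder")                                             // placeholder
                    .bucketName("my-target-bucket")                                        // placeholder
                    .encryptionMode(EncryptionModeValue.SSE_KMS)
                    .serverSideEncryptionKmsKeyId("my-kms-key-id")                         // placeholder
                    .build();
            dms.createEndpoint(CreateEndpointRequest.builder()
                    .endpointIdentifier("my-s3-endpoint")                                  // placeholder
                    .endpointType(ReplicationEndpointTypeValue.TARGET)
                    .engineName("s3")
                    .s3Settings(s3)
                    .build());
        }
    }
}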
-
dataFormat
The format of the data that you want to use for output. You can choose one of the following:
- csv: This is a row-based file format with comma-separated values (.csv).
- parquet: Apache Parquet (.parquet) is a columnar storage file format that features efficient compression and provides faster query response.

If the service returns an enum value that is not available in the current SDK version, dataFormat will return DataFormatValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from dataFormatAsString().
- Returns:
- The format of the data that you want to use for output. You can choose one of the following:
- csv: This is a row-based file format with comma-separated values (.csv).
- parquet: Apache Parquet (.parquet) is a columnar storage file format that features efficient compression and provides faster query response.
- See Also:
-
-
dataFormatAsString
The format of the data that you want to use for output. You can choose one of the following:
- csv: This is a row-based file format with comma-separated values (.csv).
- parquet: Apache Parquet (.parquet) is a columnar storage file format that features efficient compression and provides faster query response.

If the service returns an enum value that is not available in the current SDK version, dataFormat will return DataFormatValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from dataFormatAsString().
- Returns:
- The format of the data that you want to use for output. You can choose one of the following:
- csv: This is a row-based file format with comma-separated values (.csv).
- parquet: Apache Parquet (.parquet) is a columnar storage file format that features efficient compression and provides faster query response.
- See Also:
-
-
encodingType
The type of encoding you are using:
- RLE_DICTIONARY uses a combination of bit-packing and run-length encoding to store repeated values more efficiently. This is the default.
- PLAIN doesn't use encoding at all. Values are stored as they are.
- PLAIN_DICTIONARY builds a dictionary of the values encountered in a given column. The dictionary is stored in a dictionary page for each column chunk.

If the service returns an enum value that is not available in the current SDK version, encodingType will return EncodingTypeValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from encodingTypeAsString().
- Returns:
- The type of encoding you are using:
- RLE_DICTIONARY uses a combination of bit-packing and run-length encoding to store repeated values more efficiently. This is the default.
- PLAIN doesn't use encoding at all. Values are stored as they are.
- PLAIN_DICTIONARY builds a dictionary of the values encountered in a given column. The dictionary is stored in a dictionary page for each column chunk.
- See Also:
-
-
encodingTypeAsString
The type of encoding you are using:
- RLE_DICTIONARY uses a combination of bit-packing and run-length encoding to store repeated values more efficiently. This is the default.
- PLAIN doesn't use encoding at all. Values are stored as they are.
- PLAIN_DICTIONARY builds a dictionary of the values encountered in a given column. The dictionary is stored in a dictionary page for each column chunk.

If the service returns an enum value that is not available in the current SDK version, encodingType will return EncodingTypeValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from encodingTypeAsString().
- Returns:
- The type of encoding you are using:
- RLE_DICTIONARY uses a combination of bit-packing and run-length encoding to store repeated values more efficiently. This is the default.
- PLAIN doesn't use encoding at all. Values are stored as they are.
- PLAIN_DICTIONARY builds a dictionary of the values encountered in a given column. The dictionary is stored in a dictionary page for each column chunk.
- See Also:
-
-
dictPageSizeLimit
The maximum size of an encoded dictionary page of a column. If the dictionary page exceeds this, this column is stored using an encoding type of PLAIN. This parameter defaults to 1024 * 1024 bytes (1 MiB), the maximum size of a dictionary page before it reverts to PLAIN encoding. This size is used for .parquet file format only.
- Returns:
- The maximum size of an encoded dictionary page of a column. If the dictionary page exceeds this, this column is stored using an encoding type of PLAIN. This parameter defaults to 1024 * 1024 bytes (1 MiB), the maximum size of a dictionary page before it reverts to PLAIN encoding. This size is used for .parquet file format only.
-
rowGroupLength
The number of rows in a row group. A smaller row group size provides faster reads. But as the number of row groups grows, writes become slower. This parameter defaults to 10,000 rows. This number is used for .parquet file format only.

If you choose a value larger than the maximum, RowGroupLength is set to the max row group length in bytes (64 * 1024 * 1024).
- Returns:
- The number of rows in a row group. A smaller row group size provides faster reads. But as the number of row groups grows, writes become slower. This parameter defaults to 10,000 rows. This number is used for .parquet file format only.
If you choose a value larger than the maximum, RowGroupLength is set to the max row group length in bytes (64 * 1024 * 1024).
-
dataPageSize
The size of one data page in bytes. This parameter defaults to 1024 * 1024 bytes (1 MiB). This number is used for .parquet file format only.
- Returns:
- The size of one data page in bytes. This parameter defaults to 1024 * 1024 bytes (1 MiB). This number is used for .parquet file format only.
-
parquetVersion
The version of the Apache Parquet format that you want to use: parquet_1_0 (the default) or parquet_2_0.

If the service returns an enum value that is not available in the current SDK version, parquetVersion will return ParquetVersionValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from parquetVersionAsString().
- Returns:
- The version of the Apache Parquet format that you want to use: parquet_1_0 (the default) or parquet_2_0.
- See Also:
-
parquetVersionAsString
The version of the Apache Parquet format that you want to use: parquet_1_0 (the default) or parquet_2_0.

If the service returns an enum value that is not available in the current SDK version, parquetVersion will return ParquetVersionValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from parquetVersionAsString().
- Returns:
- The version of the Apache Parquet format that you want to use: parquet_1_0 (the default) or parquet_2_0.
- See Also:
-
enableStatistics
A value that enables statistics for Parquet pages and row groups. Choose true to enable statistics, false to disable. Statistics include NULL, DISTINCT, MAX, and MIN values. This parameter defaults to true. This value is used for .parquet file format only.
- Returns:
- A value that enables statistics for Parquet pages and row groups. Choose true to enable statistics, false to disable. Statistics include NULL, DISTINCT, MAX, and MIN values. This parameter defaults to true. This value is used for .parquet file format only.
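Pulling the Parquet-related settings together, here is a hedged sketch that simply restates the documented defaults as explicit builder calls; in practice you would set only the values you want to change.

import software.amazon.awssdk.services.databasemigration.model.DataFormatValue;
import software.amazon.awssdk.services.databasemigration.model.EncodingTypeValue;
import software.amazon.awssdk.services.databasemigration.model.ParquetVersionValue;
import software.amazon.awssdk.services.databasemigration.model.S3Settings;

class ParquetTuningSketch {
    static S3Settings parquetDefaults() {
        return S3Settings.builder()
                .dataFormat(DataFormatValue.PARQUET)
                .parquetVersion(ParquetVersionValue.PARQUET_1_0) // the default version
                .encodingType(EncodingTypeValue.RLE_DICTIONARY)  // the default encoding
                .dictPageSizeLimit(1024 * 1024)                  // 1 MiB dictionary page cap
                .dataPageSize(1024 * 1024)                       // 1 MiB data pages
                .rowGroupLength(10_000)                          // rows per row group
                .enableStatistics(true)                          // NULL, DISTINCT, MAX, MIN statistics
                .build();
    }
}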
-
includeOpForFullLoad
A value that enables a full load to write INSERT operations to the comma-separated value (.csv) or .parquet output files only to indicate how the rows were added to the source database.
DMS supports the IncludeOpForFullLoad parameter in versions 3.1.4 and later.

DMS supports the use of the .parquet files with the IncludeOpForFullLoad parameter in versions 3.4.7 and later.

For full load, records can only be inserted. By default (the false setting), no information is recorded in these output files for a full load to indicate that the rows were inserted at the source database. If IncludeOpForFullLoad is set to true or y, the INSERT is recorded as an I annotation in the first field of the .csv file. This allows the format of your target records from a full load to be consistent with the target records from a CDC load.

This setting works together with the CdcInsertsOnly and the CdcInsertsAndUpdates parameters for output to .csv files only. For more information about how these settings work together, see Indicating Source DB Operations in Migrated S3 Data in the Database Migration Service User Guide.
- Returns:
- A value that enables a full load to write INSERT operations to the comma-separated value (.csv) or .parquet output files only to indicate how the rows were added to the source database.
DMS supports the IncludeOpForFullLoad parameter in versions 3.1.4 and later.
DMS supports the use of the .parquet files with the IncludeOpForFullLoad parameter in versions 3.4.7 and later.
For full load, records can only be inserted. By default (the false setting), no information is recorded in these output files for a full load to indicate that the rows were inserted at the source database. If IncludeOpForFullLoad is set to true or y, the INSERT is recorded as an I annotation in the first field of the .csv file. This allows the format of your target records from a full load to be consistent with the target records from a CDC load.
This setting works together with the CdcInsertsOnly and the CdcInsertsAndUpdates parameters for output to .csv files only. For more information about how these settings work together, see Indicating Source DB Operations in Migrated S3 Data in the Database Migration Service User Guide.
-
cdcInsertsOnly
A value that enables a change data capture (CDC) load to write only INSERT operations to .csv or columnar storage (.parquet) output files. By default (the false setting), the first field in a .csv or .parquet record contains the letter I (INSERT), U (UPDATE), or D (DELETE). These values indicate whether the row was inserted, updated, or deleted at the source database for a CDC load to the target.

If CdcInsertsOnly is set to true or y, only INSERTs from the source database are migrated to the .csv or .parquet file. For .csv format only, how these INSERTs are recorded depends on the value of IncludeOpForFullLoad. If IncludeOpForFullLoad is set to true, the first field of every CDC record is set to I to indicate the INSERT operation at the source. If IncludeOpForFullLoad is set to false, every CDC record is written without a first field to indicate the INSERT operation at the source. For more information about how these settings work together, see Indicating Source DB Operations in Migrated S3 Data in the Database Migration Service User Guide.

DMS supports this interaction between the CdcInsertsOnly and IncludeOpForFullLoad parameters in versions 3.1.4 and later.

CdcInsertsOnly and CdcInsertsAndUpdates can't both be set to true for the same endpoint. Set either CdcInsertsOnly or CdcInsertsAndUpdates to true for the same endpoint, but not both.
- Returns:
- A value that enables a change data capture (CDC) load to write only INSERT operations to .csv or columnar storage (.parquet) output files. By default (the false setting), the first field in a .csv or .parquet record contains the letter I (INSERT), U (UPDATE), or D (DELETE). These values indicate whether the row was inserted, updated, or deleted at the source database for a CDC load to the target.
If CdcInsertsOnly is set to true or y, only INSERTs from the source database are migrated to the .csv or .parquet file. For .csv format only, how these INSERTs are recorded depends on the value of IncludeOpForFullLoad. If IncludeOpForFullLoad is set to true, the first field of every CDC record is set to I to indicate the INSERT operation at the source. If IncludeOpForFullLoad is set to false, every CDC record is written without a first field to indicate the INSERT operation at the source. For more information about how these settings work together, see Indicating Source DB Operations in Migrated S3 Data in the Database Migration Service User Guide.
DMS supports this interaction between the CdcInsertsOnly and IncludeOpForFullLoad parameters in versions 3.1.4 and later.
CdcInsertsOnly and CdcInsertsAndUpdates can't both be set to true for the same endpoint. Set either CdcInsertsOnly or CdcInsertsAndUpdates to true for the same endpoint, but not both.
-
timestampColumnName
A value that when nonblank causes DMS to add a column with timestamp information to the endpoint data for an Amazon S3 target.
DMS supports the TimestampColumnName parameter in versions 3.1.4 and later.

DMS includes an additional STRING column in the .csv or .parquet object files of your migrated data when you set TimestampColumnName to a nonblank value.

For a full load, each row of this timestamp column contains a timestamp for when the data was transferred from the source to the target by DMS.

For a change data capture (CDC) load, each row of the timestamp column contains the timestamp for the commit of that row in the source database.

The string format for this timestamp column value is yyyy-MM-dd HH:mm:ss.SSSSSS. By default, the precision of this value is in microseconds. For a CDC load, the rounding of the precision depends on the commit timestamp supported by DMS for the source database.

When the AddColumnName parameter is set to true, DMS also includes a name for the timestamp column that you set with TimestampColumnName.
- Returns:
- A value that when nonblank causes DMS to add a column with timestamp information to the endpoint data for an Amazon S3 target.
DMS supports the TimestampColumnName parameter in versions 3.1.4 and later.
DMS includes an additional STRING column in the .csv or .parquet object files of your migrated data when you set TimestampColumnName to a nonblank value.
For a full load, each row of this timestamp column contains a timestamp for when the data was transferred from the source to the target by DMS.
For a change data capture (CDC) load, each row of the timestamp column contains the timestamp for the commit of that row in the source database.
The string format for this timestamp column value is yyyy-MM-dd HH:mm:ss.SSSSSS. By default, the precision of this value is in microseconds. For a CDC load, the rounding of the precision depends on the commit timestamp supported by DMS for the source database.
When the AddColumnName parameter is set to true, DMS also includes a name for the timestamp column that you set with TimestampColumnName.
-
parquetTimestampInMillisecond
A value that specifies the precision of any TIMESTAMP column values that are written to an Amazon S3 object file in .parquet format.

DMS supports the ParquetTimestampInMillisecond parameter in versions 3.1.4 and later.

When ParquetTimestampInMillisecond is set to true or y, DMS writes all TIMESTAMP columns in a .parquet formatted file with millisecond precision. Otherwise, DMS writes them with microsecond precision.

Currently, Amazon Athena and Glue can handle only millisecond precision for TIMESTAMP values. Set this parameter to true for S3 endpoint object files that are .parquet formatted only if you plan to query or process the data with Athena or Glue.

DMS writes any TIMESTAMP column values written to an S3 file in .csv format with microsecond precision.

Setting ParquetTimestampInMillisecond has no effect on the string format of the timestamp column value that is inserted by setting the TimestampColumnName parameter.
- Returns:
- A value that specifies the precision of any TIMESTAMP column values that are written to an Amazon S3 object file in .parquet format.
DMS supports the ParquetTimestampInMillisecond parameter in versions 3.1.4 and later.
When ParquetTimestampInMillisecond is set to true or y, DMS writes all TIMESTAMP columns in a .parquet formatted file with millisecond precision. Otherwise, DMS writes them with microsecond precision.
Currently, Amazon Athena and Glue can handle only millisecond precision for TIMESTAMP values. Set this parameter to true for S3 endpoint object files that are .parquet formatted only if you plan to query or process the data with Athena or Glue.
DMS writes any TIMESTAMP column values written to an S3 file in .csv format with microsecond precision.
Setting ParquetTimestampInMillisecond has no effect on the string format of the timestamp column value that is inserted by setting the TimestampColumnName parameter.
-
cdcInsertsAndUpdates
A value that enables a change data capture (CDC) load to write INSERT and UPDATE operations to .csv or .parquet (columnar storage) output files. The default setting is false, but when CdcInsertsAndUpdates is set to true or y, only INSERTs and UPDATEs from the source database are migrated to the .csv or .parquet file.

DMS supports the use of the .parquet files in versions 3.4.7 and later.

How these INSERTs and UPDATEs are recorded depends on the value of the IncludeOpForFullLoad parameter. If IncludeOpForFullLoad is set to true, the first field of every CDC record is set to either I or U to indicate INSERT and UPDATE operations at the source. But if IncludeOpForFullLoad is set to false, CDC records are written without an indication of INSERT or UPDATE operations at the source. For more information about how these settings work together, see Indicating Source DB Operations in Migrated S3 Data in the Database Migration Service User Guide.

DMS supports the use of the CdcInsertsAndUpdates parameter in versions 3.3.1 and later.

CdcInsertsOnly and CdcInsertsAndUpdates can't both be set to true for the same endpoint. Set either CdcInsertsOnly or CdcInsertsAndUpdates to true for the same endpoint, but not both.
- Returns:
- A value that enables a change data capture (CDC) load to write INSERT and UPDATE operations to .csv or .parquet (columnar storage) output files. The default setting is false, but when CdcInsertsAndUpdates is set to true or y, only INSERTs and UPDATEs from the source database are migrated to the .csv or .parquet file.
DMS supports the use of the .parquet files in versions 3.4.7 and later.
How these INSERTs and UPDATEs are recorded depends on the value of the IncludeOpForFullLoad parameter. If IncludeOpForFullLoad is set to true, the first field of every CDC record is set to either I or U to indicate INSERT and UPDATE operations at the source. But if IncludeOpForFullLoad is set to false, CDC records are written without an indication of INSERT or UPDATE operations at the source. For more information about how these settings work together, see Indicating Source DB Operations in Migrated S3 Data in the Database Migration Service User Guide.
DMS supports the use of the CdcInsertsAndUpdates parameter in versions 3.3.1 and later.
CdcInsertsOnly and CdcInsertsAndUpdates can't both be set to true for the same endpoint. Set either CdcInsertsOnly or CdcInsertsAndUpdates to true for the same endpoint, but not both.
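As a sketch of how the annotation settings above combine, the following enables I/U annotations for a CDC load; note the documented constraint that only one of the two Cdc* flags may be true.

import software.amazon.awssdk.services.databasemigration.model.S3Settings;

class CdcAnnotationSketch {
    static S3Settings cdcInsertsAndUpdatesOnly() {
        return S3Settings.builder()
                .includeOpForFullLoad(true)  // full-load rows annotated with I in the first field
                .cdcInsertsAndUpdates(true)  // migrate only INSERTs (I) and UPDATEs (U) during CDC
                // .cdcInsertsOnly(true)     // must not be combined with cdcInsertsAndUpdates
                .build();
    }
}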
-
datePartitionEnabled
When set to true, this parameter partitions S3 bucket folders based on transaction commit dates. The default value is false. For more information about date-based folder partitioning, see Using date-based folder partitioning.
- Returns:
- When set to true, this parameter partitions S3 bucket folders based on transaction commit dates. The default value is false. For more information about date-based folder partitioning, see Using date-based folder partitioning.
-
datePartitionSequence
Identifies the sequence of the date format to use during folder partitioning. The default value is YYYYMMDD. Use this parameter when DatePartitionEnabled is set to true.

If the service returns an enum value that is not available in the current SDK version, datePartitionSequence will return DatePartitionSequenceValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from datePartitionSequenceAsString().
- Returns:
- Identifies the sequence of the date format to use during folder partitioning. The default value is YYYYMMDD. Use this parameter when DatePartitionEnabled is set to true.
- See Also:
-
datePartitionSequenceAsString
Identifies the sequence of the date format to use during folder partitioning. The default value is YYYYMMDD. Use this parameter when DatePartitionEnabled is set to true.

If the service returns an enum value that is not available in the current SDK version, datePartitionSequence will return DatePartitionSequenceValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from datePartitionSequenceAsString().
- Returns:
- Identifies the sequence of the date format to use during folder partitioning. The default value is YYYYMMDD. Use this parameter when DatePartitionEnabled is set to true.
- See Also:
-
datePartitionDelimiter
Specifies a date separating delimiter to use during folder partitioning. The default value is SLASH. Use this parameter when DatePartitionEnabled is set to true.

If the service returns an enum value that is not available in the current SDK version, datePartitionDelimiter will return DatePartitionDelimiterValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from datePartitionDelimiterAsString().
- Returns:
- Specifies a date separating delimiter to use during folder partitioning. The default value is SLASH. Use this parameter when DatePartitionEnabled is set to true.
- See Also:
-
datePartitionDelimiterAsString
Specifies a date separating delimiter to use during folder partitioning. The default value is SLASH. Use this parameter when DatePartitionEnabled is set to true.

If the service returns an enum value that is not available in the current SDK version, datePartitionDelimiter will return DatePartitionDelimiterValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from datePartitionDelimiterAsString().
- Returns:
- Specifies a date separating delimiter to use during folder partitioning. The default value is SLASH. Use this parameter when DatePartitionEnabled is set to true.
- See Also:
-
useCsvNoSupValue
This setting applies if the S3 output files during a change data capture (CDC) load are written in .csv format. If set to true for columns not included in the supplemental log, DMS uses the value specified by CsvNoSupValue. If not set or set to false, DMS uses the null value for these columns.

This setting is supported in DMS versions 3.4.1 and later.
- Returns:
- This setting applies if the S3 output files during a change data capture (CDC) load are written in .csv format. If set to true for columns not included in the supplemental log, DMS uses the value specified by CsvNoSupValue. If not set or set to false, DMS uses the null value for these columns.
This setting is supported in DMS versions 3.4.1 and later.
-
csvNoSupValue
This setting only applies if your Amazon S3 output files during a change data capture (CDC) load are written in .csv format. If UseCsvNoSupValue is set to true, specify a string value that you want DMS to use for all columns not included in the supplemental log. If you do not specify a string value, DMS uses the null value for these columns regardless of the UseCsvNoSupValue setting.

This setting is supported in DMS versions 3.4.1 and later.
- Returns:
- This setting only applies if your Amazon S3 output files during a change data capture (CDC) load are written in .csv format. If UseCsvNoSupValue is set to true, specify a string value that you want DMS to use for all columns not included in the supplemental log. If you do not specify a string value, DMS uses the null value for these columns regardless of the UseCsvNoSupValue setting.
This setting is supported in DMS versions 3.4.1 and later.
-
preserveTransactions
If set to true, DMS saves the transaction order for a change data capture (CDC) load on the Amazon S3 target specified by CdcPath. For more information, see Capturing data changes (CDC) including transaction order on the S3 target.

This setting is supported in DMS versions 3.4.2 and later.
- Returns:
- If set to true, DMS saves the transaction order for a change data capture (CDC) load on the Amazon S3 target specified by CdcPath. For more information, see Capturing data changes (CDC) including transaction order on the S3 target.
This setting is supported in DMS versions 3.4.2 and later.
-
cdcPath
Specifies the folder path of CDC files. For an S3 source, this setting is required if a task captures change data; otherwise, it's optional. If CdcPath is set, DMS reads CDC files from this path and replicates the data changes to the target endpoint. For an S3 target, if you set PreserveTransactions to true, DMS verifies that you have set this parameter to a folder path on your S3 target where DMS can save the transaction order for the CDC load. DMS creates this CDC folder path in either your S3 target working directory or the S3 target location specified by BucketFolder and BucketName.

For example, if you specify CdcPath as MyChangedData, and you specify BucketName as MyTargetBucket but do not specify BucketFolder, DMS creates the following CDC folder path: MyTargetBucket/MyChangedData.

If you specify the same CdcPath, and you specify BucketName as MyTargetBucket and BucketFolder as MyTargetData, DMS creates the following CDC folder path: MyTargetBucket/MyTargetData/MyChangedData.

For more information on CDC including transaction order on an S3 target, see Capturing data changes (CDC) including transaction order on the S3 target.

This setting is supported in DMS versions 3.4.2 and later.
- Returns:
- Specifies the folder path of CDC files. For an S3 source, this setting is required if a task captures change data; otherwise, it's optional. If CdcPath is set, DMS reads CDC files from this path and replicates the data changes to the target endpoint. For an S3 target, if you set PreserveTransactions to true, DMS verifies that you have set this parameter to a folder path on your S3 target where DMS can save the transaction order for the CDC load. DMS creates this CDC folder path in either your S3 target working directory or the S3 target location specified by BucketFolder and BucketName.
For example, if you specify CdcPath as MyChangedData, and you specify BucketName as MyTargetBucket but do not specify BucketFolder, DMS creates the following CDC folder path: MyTargetBucket/MyChangedData.
If you specify the same CdcPath, and you specify BucketName as MyTargetBucket and BucketFolder as MyTargetData, DMS creates the following CDC folder path: MyTargetBucket/MyTargetData/MyChangedData.
For more information on CDC including transaction order on an S3 target, see Capturing data changes (CDC) including transaction order on the S3 target.
This setting is supported in DMS versions 3.4.2 and later.
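A minimal sketch of the transaction-order setup described above, reusing the second folder-path example; DMS would write CDC files under MyTargetBucket/MyTargetData/MyChangedData.

import software.amazon.awssdk.services.databasemigration.model.S3Settings;

class TransactionOrderSketch {
    static S3Settings preserveCdcOrder() {
        return S3Settings.builder()
                .bucketName("MyTargetBucket")
                .bucketFolder("MyTargetData")
                .cdcPath("MyChangedData")   // CDC folder path on the S3 target
                .preserveTransactions(true) // requires CdcPath on an S3 target
                .build();
    }
}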
-
useTaskStartTimeForFullLoadTimestamp
When set to true, this parameter uses the task start time as the timestamp column value instead of the time data is written to target. For full load, when useTaskStartTimeForFullLoadTimestamp is set to true, each row of the timestamp column contains the task start time. For CDC loads, each row of the timestamp column contains the transaction commit time.

When useTaskStartTimeForFullLoadTimestamp is set to false, the full load timestamp in the timestamp column increments with the time data arrives at the target.
- Returns:
- When set to true, this parameter uses the task start time as the timestamp column value instead of the time data is written to target. For full load, when useTaskStartTimeForFullLoadTimestamp is set to true, each row of the timestamp column contains the task start time. For CDC loads, each row of the timestamp column contains the transaction commit time.
When useTaskStartTimeForFullLoadTimestamp is set to false, the full load timestamp in the timestamp column increments with the time data arrives at the target.
-
cannedAclForObjects
A value that enables DMS to specify a predefined (canned) access control list for objects created in an Amazon S3 bucket as .csv or .parquet files. For more information about Amazon S3 canned ACLs, see Canned ACL in the Amazon S3 Developer Guide.
The default value is NONE. Valid values include NONE, PRIVATE, PUBLIC_READ, PUBLIC_READ_WRITE, AUTHENTICATED_READ, AWS_EXEC_READ, BUCKET_OWNER_READ, and BUCKET_OWNER_FULL_CONTROL.
If the service returns an enum value that is not available in the current SDK version, cannedAclForObjects will return CannedAclForObjectsValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from cannedAclForObjectsAsString().
- Returns:
- A value that enables DMS to specify a predefined (canned) access control list for objects created in an Amazon S3 bucket as .csv or .parquet files. For more information about Amazon S3 canned ACLs, see Canned ACL in the Amazon S3 Developer Guide.
The default value is NONE. Valid values include NONE, PRIVATE, PUBLIC_READ, PUBLIC_READ_WRITE, AUTHENTICATED_READ, AWS_EXEC_READ, BUCKET_OWNER_READ, and BUCKET_OWNER_FULL_CONTROL.
- See Also:
-
cannedAclForObjectsAsString
A value that enables DMS to specify a predefined (canned) access control list for objects created in an Amazon S3 bucket as .csv or .parquet files. For more information about Amazon S3 canned ACLs, see Canned ACL in the Amazon S3 Developer Guide.
The default value is NONE. Valid values include NONE, PRIVATE, PUBLIC_READ, PUBLIC_READ_WRITE, AUTHENTICATED_READ, AWS_EXEC_READ, BUCKET_OWNER_READ, and BUCKET_OWNER_FULL_CONTROL.
If the service returns an enum value that is not available in the current SDK version, cannedAclForObjects will return CannedAclForObjectsValue.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from cannedAclForObjectsAsString().
- Returns:
- A value that enables DMS to specify a predefined (canned) access control list for objects created in an Amazon S3 bucket as .csv or .parquet files. For more information about Amazon S3 canned ACLs, see Canned ACL in the Amazon S3 Developer Guide.
The default value is NONE. Valid values include NONE, PRIVATE, PUBLIC_READ, PUBLIC_READ_WRITE, AUTHENTICATED_READ, AWS_EXEC_READ, BUCKET_OWNER_READ, and BUCKET_OWNER_FULL_CONTROL.
- See Also:
-
addColumnName
An optional parameter that, when set to true or y, you can use to add column name information to the .csv output file.

The default value is false. Valid values are true, false, y, and n.
- Returns:
- An optional parameter that, when set to true or y, you can use to add column name information to the .csv output file.
The default value is false. Valid values are true, false, y, and n.
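The .csv formatting settings documented above can be combined as follows; a hedged sketch in which the pipe delimiter is an arbitrary choice, not a default.

import software.amazon.awssdk.services.databasemigration.model.S3Settings;

class CsvFormattingSketch {
    static S3Settings pipeDelimitedCsv() {
        return S3Settings.builder()
                .csvDelimiter("|")     // column separator; the default is a comma
                .csvRowDelimiter("\n") // row separator; \n is the default
                .csvNullValue("NULL")  // string written for null values
                .addColumnName(true)   // add column name information to the output
                .rfc4180(true)         // RFC 4180 quoting; true is the default
                .build();
    }
}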
-
cdcMaxBatchInterval
Maximum length of the interval, defined in seconds, after which to output a file to Amazon S3.

When CdcMaxBatchInterval and CdcMinFileSize are both specified, the file write is triggered by whichever parameter condition is met first within a DMS CloudFormation template.

The default value is 60 seconds.
- Returns:
- Maximum length of the interval, defined in seconds, after which to output a file to Amazon S3.
When CdcMaxBatchInterval and CdcMinFileSize are both specified, the file write is triggered by whichever parameter condition is met first within a DMS CloudFormation template.
The default value is 60 seconds.
-
cdcMinFileSize
Minimum file size, defined in kilobytes, to reach for a file output to Amazon S3.

When CdcMinFileSize and CdcMaxBatchInterval are both specified, the file write is triggered by whichever parameter condition is met first within a DMS CloudFormation template.

The default value is 32 MB.
- Returns:
- Minimum file size, defined in kilobytes, to reach for a file output to Amazon S3.
When CdcMinFileSize and CdcMaxBatchInterval are both specified, the file write is triggered by whichever parameter condition is met first within a DMS CloudFormation template.
The default value is 32 MB.
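A brief sketch combining the two file-write triggers above; the values restate the documented defaults (60 seconds, and 32 MB expressed here as 32,000 KB, an assumption about the unit conversion).

import software.amazon.awssdk.services.databasemigration.model.S3Settings;

class CdcFileTriggerSketch {
    static S3Settings fileTriggers() {
        return S3Settings.builder()
                .cdcMaxBatchInterval(60) // seconds; whichever condition is met first wins
                .cdcMinFileSize(32000)   // kilobytes
                .build();
    }
}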
-
csvNullValue
An optional parameter that specifies how DMS treats null values. While handling the null value, you can use this parameter to pass a user-defined string as null when writing to the target. For example, when target columns are nullable, you can use this option to differentiate between the empty string value and the null value. So, if you set this parameter value to the empty string ("" or ''), DMS treats the empty string as the null value instead of NULL.

The default value is NULL. Valid values include any valid string.
- Returns:
- An optional parameter that specifies how DMS treats null values. While handling the null value, you can use this parameter to pass a user-defined string as null when writing to the target. For example, when target columns are nullable, you can use this option to differentiate between the empty string value and the null value. So, if you set this parameter value to the empty string ("" or ''), DMS treats the empty string as the null value instead of NULL.
The default value is NULL. Valid values include any valid string.
-
ignoreHeaderRows
When this value is set to 1, DMS ignores the first row header in a .csv file. A value of 1 turns on the feature; a value of 0 turns off the feature.
The default is 0.
- Returns:
- When this value is set to 1, DMS ignores the first row header in a .csv file. A value of 1 turns on the
feature; a value of 0 turns off the feature.
The default is 0.
-
maxFileSize
A value that specifies the maximum size (in KB) of any .csv file to be created while migrating to an S3 target during full load.
The default value is 1,048,576 KB (1 GB). Valid values include 1 to 1,048,576.
- Returns:
- A value that specifies the maximum size (in KB) of any .csv file to be created while migrating to an S3
target during full load.
The default value is 1,048,576 KB (1 GB). Valid values include 1 to 1,048,576.
-
rfc4180
For an S3 source, when this value is set to true or y, each leading double quotation mark has to be followed by an ending double quotation mark. This formatting complies with RFC 4180. When this value is set to false or n, string literals are copied to the target as is. In this case, a delimiter (row or column) signals the end of the field. Thus, you can't use a delimiter as part of the string, because it signals the end of the value.

For an S3 target, an optional parameter used to set behavior to comply with RFC 4180 for data migrated to Amazon S3 using .csv file format only. When this value is set to true or y using Amazon S3 as a target, if the data has quotation marks or newline characters in it, DMS encloses the entire column with an additional pair of double quotation marks ("). Every quotation mark within the data is repeated twice.

The default value is true. Valid values include true, false, y, and n.
- Returns:
- For an S3 source, when this value is set to true or y, each leading double quotation mark has to be followed by an ending double quotation mark. This formatting complies with RFC 4180. When this value is set to false or n, string literals are copied to the target as is. In this case, a delimiter (row or column) signals the end of the field. Thus, you can't use a delimiter as part of the string, because it signals the end of the value.
For an S3 target, an optional parameter used to set behavior to comply with RFC 4180 for data migrated to Amazon S3 using .csv file format only. When this value is set to true or y using Amazon S3 as a target, if the data has quotation marks or newline characters in it, DMS encloses the entire column with an additional pair of double quotation marks ("). Every quotation mark within the data is repeated twice.
The default value is true. Valid values include true, false, y, and n.
-
datePartitionTimezone
When creating an S3 target endpoint, set DatePartitionTimezone to convert the current UTC time into a specified time zone. The conversion occurs when a date partition folder is created and a CDC filename is generated. The time zone format is Area/Location. Use this parameter when DatePartitionEnabled is set to true, as shown in the following example.

s3-settings='{"DatePartitionEnabled": true, "DatePartitionSequence": "YYYYMMDDHH", "DatePartitionDelimiter": "SLASH", "DatePartitionTimezone":"Asia/Seoul", "BucketName": "dms-nattarat-test"}'
- Returns:
- When creating an S3 target endpoint, set DatePartitionTimezone to convert the current UTC time into a specified time zone. The conversion occurs when a date partition folder is created and a CDC filename is generated. The time zone format is Area/Location. Use this parameter when DatePartitionEnabled is set to true, as shown in the following example.
s3-settings='{"DatePartitionEnabled": true, "DatePartitionSequence": "YYYYMMDDHH", "DatePartitionDelimiter": "SLASH", "DatePartitionTimezone":"Asia/Seoul", "BucketName": "dms-nattarat-test"}'
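A rough Java rendering of the s3-settings example above, using the same sample values; the enum constants correspond to the documented YYYYMMDDHH and SLASH values.

import software.amazon.awssdk.services.databasemigration.model.DatePartitionDelimiterValue;
import software.amazon.awssdk.services.databasemigration.model.DatePartitionSequenceValue;
import software.amazon.awssdk.services.databasemigration.model.S3Settings;

class DatePartitionSketch {
    static S3Settings datePartitioned() {
        return S3Settings.builder()
                .datePartitionEnabled(true)
                .datePartitionSequence(DatePartitionSequenceValue.YYYYMMDDHH)
                .datePartitionDelimiter(DatePartitionDelimiterValue.SLASH)
                .datePartitionTimezone("Asia/Seoul") // Area/Location format
                .bucketName("dms-nattarat-test")
                .build();
    }
}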
-
addTrailingPaddingCharacter
Use the S3 target endpoint setting AddTrailingPaddingCharacter to add padding on string data. The default value is false.
- Returns:
- Use the S3 target endpoint setting AddTrailingPaddingCharacter to add padding on string data. The default value is false.
-
expectedBucketOwner
To specify a bucket owner and prevent sniping, you can use the ExpectedBucketOwner endpoint setting.

Example: --s3-settings='{"ExpectedBucketOwner": "AWS_Account_ID"}'

When you make a request to test a connection or perform a migration, S3 checks the account ID of the bucket owner against the specified parameter.
- Returns:
- To specify a bucket owner and prevent sniping, you can use the ExpectedBucketOwner endpoint setting.
Example: --s3-settings='{"ExpectedBucketOwner": "AWS_Account_ID"}'
When you make a request to test a connection or perform a migration, S3 checks the account ID of the bucket owner against the specified parameter.
-
glueCatalogGeneration
When true, allows Glue to catalog your S3 bucket. Creating a Glue catalog lets you use Athena to query your data.
- Returns:
- When true, allows Glue to catalog your S3 bucket. Creating a Glue catalog lets you use Athena to query your data.
-
toBuilder
Description copied from interface: ToCopyableBuilder
Take this object and create a builder that contains all of the current property values of this object.
- Specified by:
- toBuilder in interface ToCopyableBuilder<S3Settings.Builder,S3Settings>
- Returns:
- a builder for type T
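A short sketch of the copy-and-modify pattern that toBuilder() enables:

import software.amazon.awssdk.services.databasemigration.model.S3Settings;

class ToBuilderSketch {
    // Copies every property of the original and overrides only the bucket folder.
    static S3Settings withFolder(S3Settings original, String folder) {
        return original.toBuilder()
                .bucketFolder(folder)
                .build();
    }
}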
-
builder
-
serializableBuilderClass
-
hashCode
-
equals
-
equalsBySdkFields
Description copied from interface: SdkPojo
Indicates whether some other object is "equal to" this one by SDK fields. An SDK field is a modeled, non-inherited field in an SdkPojo class, and is generated based on a service model.

If an SdkPojo class does not have any inherited fields, equalsBySdkFields and equals are essentially the same.
- Specified by:
- equalsBySdkFields in interface SdkPojo
- Parameters:
- obj - the object to be compared with
- Returns:
- true if the other object equals this object by SDK fields, false otherwise.
-
toString
-
getValueForField
-
sdkFields
-