Class ParquetSerDe
- All Implemented Interfaces:
Serializable, SdkPojo, ToCopyableBuilder<ParquetSerDe.Builder,ParquetSerDe>
A serializer to use for converting data to the Parquet format before storing it in Amazon S3. For more information, see Apache Parquet.
-
Nested Class Summary
Nested Classes -
Method Summary
Modifier and Type | Method | Description
final Integer | blockSizeBytes() | The Hadoop Distributed File System (HDFS) block size.
static ParquetSerDe.Builder | builder() |
final ParquetCompression | compression() | The compression code to use over data blocks.
final String | compressionAsString() | The compression code to use over data blocks.
final Boolean | enableDictionaryCompression() | Indicates whether to enable dictionary compression.
final boolean | equals(Object obj) |
final boolean | equalsBySdkFields(Object obj) | Indicates whether some other object is "equal to" this one by SDK fields.
final <T> Optional<T> | getValueForField(String fieldName, Class<T> clazz) |
final int | hashCode() |
final Integer | maxPaddingBytes() | The maximum amount of padding to apply.
final Integer | pageSizeBytes() | The Parquet page size.
static Class<? extends ParquetSerDe.Builder> | serializableBuilderClass() |
ParquetSerDe.Builder | toBuilder() | Take this object and create a builder that contains all of the current property values of this object.
final String | toString() | Returns a string representation of this object.
final ParquetWriterVersion | writerVersion() | Indicates the version of row format to output.
final String | writerVersionAsString() | Indicates the version of row format to output.
Methods inherited from interface software.amazon.awssdk.utils.builder.ToCopyableBuilder
copy
-
Method Details
-
blockSizeBytes
The Hadoop Distributed File System (HDFS) block size. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 256 MiB and the minimum is 64 MiB. Firehose uses this value for padding calculations.
- Returns:
- The Hadoop Distributed File System (HDFS) block size. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 256 MiB and the minimum is 64 MiB. Firehose uses this value for padding calculations.
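The documented limits can be expressed in bytes as a small local pre-check. This is a minimal sketch, not part of the SDK: the class BlockSizeSketch and the helper effectiveBlockSize are hypothetical names, and it assumes the getter returns null when the value was never set (which the Integer return type permits). The service remains the authority on validation.

```java
final class BlockSizeSketch {
    static final int MIB = 1024 * 1024;
    // Documented default (256 MiB) and minimum (64 MiB), expressed in bytes.
    static final int DEFAULT_BLOCK_SIZE_BYTES = 256 * MIB;
    static final int MIN_BLOCK_SIZE_BYTES = 64 * MIB;

    /**
     * Hypothetical helper: returns the configured block size, or the
     * documented default when the value is unset (null).
     */
    static int effectiveBlockSize(Integer configured) {
        if (configured == null) {
            return DEFAULT_BLOCK_SIZE_BYTES;
        }
        if (configured < MIN_BLOCK_SIZE_BYTES) {
            throw new IllegalArgumentException(
                "blockSizeBytes must be at least " + MIN_BLOCK_SIZE_BYTES + " bytes");
        }
        return configured;
    }

    public static void main(String[] args) {
        System.out.println(effectiveBlockSize(null));      // falls back to the default
        System.out.println(effectiveBlockSize(128 * MIB)); // explicit 128 MiB
    }
}
```

Keeping the constants in bytes avoids unit mistakes, since the property itself is specified in bytes rather than MiB.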
-
pageSizeBytes
The Parquet page size. Column chunks are divided into pages. A page is conceptually an indivisible unit (in terms of compression and encoding). The minimum value is 64 KiB and the default is 1 MiB.
- Returns:
- The Parquet page size. Column chunks are divided into pages. A page is conceptually an indivisible unit (in terms of compression and encoding). The minimum value is 64 KiB and the default is 1 MiB.
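To get a feel for what the page size implies, the following sketch estimates how many pages a column chunk of a given size would be divided into. PageCountSketch and pagesFor are hypothetical names introduced for illustration. Real Parquet writers treat the page size as a target and cut pages on value boundaries, so treat the result as a rough estimate only.

```java
final class PageCountSketch {
    static final int KIB = 1024;
    // Documented default (1 MiB) and minimum (64 KiB), expressed in bytes.
    static final int DEFAULT_PAGE_SIZE_BYTES = 1024 * KIB;
    static final int MIN_PAGE_SIZE_BYTES = 64 * KIB;

    /**
     * Hypothetical helper: estimated number of pages needed to hold
     * columnChunkBytes of data, using ceiling division.
     */
    static long pagesFor(long columnChunkBytes, int pageSizeBytes) {
        if (pageSizeBytes < MIN_PAGE_SIZE_BYTES) {
            throw new IllegalArgumentException(
                "pageSizeBytes must be at least " + MIN_PAGE_SIZE_BYTES + " bytes");
        }
        if (columnChunkBytes < 0) {
            throw new IllegalArgumentException("columnChunkBytes must not be negative");
        }
        return (columnChunkBytes + pageSizeBytes - 1) / pageSizeBytes;
    }

    public static void main(String[] args) {
        // A 10 MiB column chunk with the default page size.
        System.out.println(pagesFor(10L * 1024 * 1024, DEFAULT_PAGE_SIZE_BYTES));
    }
}
```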
-
compression
The compression code to use over data blocks. The possible values are UNCOMPRESSED, SNAPPY, and GZIP, with the default being SNAPPY. Use SNAPPY for higher decompression speed. Use GZIP if the compression ratio is more important than speed.
If the service returns an enum value that is not available in the current SDK version, compression will return ParquetCompression.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from compressionAsString().
- Returns:
- The compression code to use over data blocks. The possible values are UNCOMPRESSED, SNAPPY, and GZIP, with the default being SNAPPY. Use SNAPPY for higher decompression speed. Use GZIP if the compression ratio is more important than speed.
- See Also:
- ParquetCompression
-
compressionAsString
The compression code to use over data blocks. The possible values are UNCOMPRESSED, SNAPPY, and GZIP, with the default being SNAPPY. Use SNAPPY for higher decompression speed. Use GZIP if the compression ratio is more important than speed.
If the service returns an enum value that is not available in the current SDK version, compression will return ParquetCompression.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from compressionAsString().
- Returns:
- The compression code to use over data blocks. The possible values are UNCOMPRESSED, SNAPPY, and GZIP, with the default being SNAPPY. Use SNAPPY for higher decompression speed. Use GZIP if the compression ratio is more important than speed.
- See Also:
- ParquetCompression
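The relationship between the enum getter and its AsString counterpart can be illustrated with a self-contained stand-in. The Compression enum below is a hypothetical model, not the SDK's ParquetCompression type, and "ZSTD" is only an invented raw value used to trigger the fallback. The sketch assumes an unset value maps to null.

```java
final class UnknownEnumSketch {
    /** Hypothetical stand-in for an SDK enum with an unknown-value sentinel. */
    enum Compression {
        UNCOMPRESSED("UNCOMPRESSED"),
        GZIP("GZIP"),
        SNAPPY("SNAPPY"),
        UNKNOWN_TO_SDK_VERSION(null);

        private final String value;

        Compression(String value) {
            this.value = value;
        }

        /** Maps a raw service string to a constant, or to the sentinel if unrecognized. */
        static Compression fromValue(String raw) {
            if (raw == null) {
                return null;
            }
            for (Compression c : values()) {
                if (raw.equals(c.value)) {
                    return c;
                }
            }
            return UNKNOWN_TO_SDK_VERSION;
        }
    }

    /** Describes a setting, falling back to the raw string when the enum does not know it. */
    static String describe(String rawFromService) {
        Compression c = Compression.fromValue(rawFromService);
        if (c == null) {
            return "not set (service default is SNAPPY)";
        }
        if (c == Compression.UNKNOWN_TO_SDK_VERSION) {
            return "unrecognized codec: " + rawFromService;
        }
        return "codec: " + c;
    }

    public static void main(String[] args) {
        System.out.println(describe("GZIP"));
        System.out.println(describe("ZSTD")); // invented value, handled via the raw string
    }
}
```

In calling code, the same idea means checking for UNKNOWN_TO_SDK_VERSION before switching on the enum and reading the AsString getter when the sentinel is returned.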
-
enableDictionaryCompression
Indicates whether to enable dictionary compression.
- Returns:
- Indicates whether to enable dictionary compression.
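For readers unfamiliar with the term, dictionary compression generally means storing each distinct value once and replacing occurrences with small integer indices. The sketch below shows that general idea only; it is not how Firehose or any Parquet writer implements it, and all names (DictionaryEncodingSketch, Encoded, encode, decode) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DictionaryEncodingSketch {
    /** Distinct values in first-seen order, plus one dictionary index per input value. */
    static final class Encoded {
        final List<String> dictionary;
        final int[] indices;

        Encoded(List<String> dictionary, int[] indices) {
            this.dictionary = dictionary;
            this.indices = indices;
        }
    }

    /** Replaces each value with the index of its first occurrence in the dictionary. */
    static Encoded encode(List<String> column) {
        Map<String, Integer> ids = new LinkedHashMap<>();
        int[] indices = new int[column.size()];
        for (int i = 0; i < column.size(); i++) {
            String value = column.get(i);
            Integer id = ids.get(value);
            if (id == null) {
                id = ids.size();
                ids.put(value, id);
            }
            indices[i] = id;
        }
        return new Encoded(new ArrayList<>(ids.keySet()), indices);
    }

    /** Rebuilds the original column from the dictionary and the indices. */
    static List<String> decode(Encoded encoded) {
        List<String> column = new ArrayList<>(encoded.indices.length);
        for (int index : encoded.indices) {
            column.add(encoded.dictionary.get(index));
        }
        return column;
    }
}
```

The technique pays off for columns with few distinct values, such as region codes or status flags, and gives little benefit when most values are unique.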
-
maxPaddingBytes
The maximum amount of padding to apply. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 0.
- Returns:
- The maximum amount of padding to apply. This is useful if you intend to copy the data from Amazon S3 to HDFS before querying. The default is 0.
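This page does not describe how Firehose combines the block size and the maximum padding. As an illustration of the general idea of aligning data to block boundaries, the sketch below pads up to the next block boundary only when the gap fits within the allowed maximum. The rule, the class PaddingSketch, and the helper paddingFor are all assumptions made for this example.

```java
final class PaddingSketch {
    /**
     * Hypothetical rule: if the bytes remaining in the current block fit within
     * maxPaddingBytes, pad to the block boundary; otherwise add no padding.
     */
    static long paddingFor(long position, long blockSizeBytes, long maxPaddingBytes) {
        if (position < 0 || blockSizeBytes <= 0 || maxPaddingBytes < 0) {
            throw new IllegalArgumentException("invalid position, block size, or max padding");
        }
        long usedInBlock = position % blockSizeBytes;
        if (usedInBlock == 0) {
            return 0; // already aligned to a block boundary
        }
        long remaining = blockSizeBytes - usedInBlock;
        return remaining <= maxPaddingBytes ? remaining : 0;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // 250 MiB written into 256 MiB blocks, with up to 8 MiB of padding allowed.
        System.out.println(paddingFor(250 * mib, 256 * mib, 8 * mib));
    }
}
```

Under this rule, the documented default of 0 disables padding entirely, which matches the case where the data is queried in place on Amazon S3 and never copied to HDFS.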
-
writerVersion
Indicates the version of row format to output. The possible values are V1 and V2. The default is V1.
If the service returns an enum value that is not available in the current SDK version, writerVersion will return ParquetWriterVersion.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from writerVersionAsString().
- Returns:
- Indicates the version of row format to output. The possible values are V1 and V2. The default is V1.
- See Also:
- ParquetWriterVersion
-
writerVersionAsString
Indicates the version of row format to output. The possible values are V1 and V2. The default is V1.
If the service returns an enum value that is not available in the current SDK version, writerVersion will return ParquetWriterVersion.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from writerVersionAsString().
- Returns:
- Indicates the version of row format to output. The possible values are V1 and V2. The default is V1.
- See Also:
- ParquetWriterVersion
-
toBuilder
Description copied from interface: ToCopyableBuilder
Take this object and create a builder that contains all of the current property values of this object.
- Specified by:
- toBuilder in interface ToCopyableBuilder<ParquetSerDe.Builder,ParquetSerDe>
- Returns:
- a builder for type T
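The copy-and-modify pattern that toBuilder enables can be shown with a self-contained stand-in. SerDeSettings below is a hypothetical immutable class with two properties, written only to illustrate the pattern; it is not the SDK's ParquetSerDe, and its compression property is a plain String for brevity.

```java
final class SerDeSettings {
    private final Integer blockSizeBytes;
    private final String compression;

    private SerDeSettings(Builder builder) {
        this.blockSizeBytes = builder.blockSizeBytes;
        this.compression = builder.compression;
    }

    Integer blockSizeBytes() {
        return blockSizeBytes;
    }

    String compression() {
        return compression;
    }

    static Builder builder() {
        return new Builder();
    }

    /** Returns a builder pre-populated with all current property values. */
    Builder toBuilder() {
        return new Builder().blockSizeBytes(blockSizeBytes).compression(compression);
    }

    static final class Builder {
        private Integer blockSizeBytes;
        private String compression;

        Builder blockSizeBytes(Integer value) {
            this.blockSizeBytes = value;
            return this;
        }

        Builder compression(String value) {
            this.compression = value;
            return this;
        }

        SerDeSettings build() {
            return new SerDeSettings(this);
        }
    }

    public static void main(String[] args) {
        SerDeSettings original = builder().blockSizeBytes(268435456).compression("SNAPPY").build();
        // Change one property; the original instance is left untouched.
        SerDeSettings changed = original.toBuilder().compression("GZIP").build();
        System.out.println(original.compression() + " -> " + changed.compression());
    }
}
```

Because the object is immutable, toBuilder is the way to derive a variant that differs in one property while carrying every other value over unchanged.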
-
builder
-
serializableBuilderClass
-
hashCode
-
equals
-
equalsBySdkFields
Description copied from interface: SdkPojo
Indicates whether some other object is "equal to" this one by SDK fields. An SDK field is a modeled, non-inherited field in an SdkPojo class, and is generated based on a service model.
If an SdkPojo class does not have any inherited fields, equalsBySdkFields and equals are essentially the same.
- Specified by:
- equalsBySdkFields in interface SdkPojo
- Parameters:
- obj - the object to be compared with
- Returns:
- true if the other object is equal to this object by SDK fields, false otherwise.
-
toString
-
getValueForField
-
sdkFields
-
sdkFieldNameToField
- Specified by:
- sdkFieldNameToField in interface SdkPojo
- Returns:
- The mapping between the field name and its corresponding field.
-