Class SemanticChunkingConfiguration

java.lang.Object
software.amazon.awssdk.services.bedrockagent.model.SemanticChunkingConfiguration
All Implemented Interfaces:
Serializable, SdkPojo, ToCopyableBuilder<SemanticChunkingConfiguration.Builder,SemanticChunkingConfiguration>

@Generated("software.amazon.awssdk:codegen") public final class SemanticChunkingConfiguration extends Object implements SdkPojo, Serializable, ToCopyableBuilder<SemanticChunkingConfiguration.Builder,SemanticChunkingConfiguration>

Settings for semantic document chunking for a data source. Semantic chunking splits a document into smaller documents based on groups of similar content derived from the text with natural language processing.

With semantic chunking, each sentence is compared to the next to determine how similar they are. You specify a threshold as a percentile: adjacent sentences whose similarity is lower than that of the given percentage of all sentence pairs are divided into separate chunks. For example, if you set the threshold to 90, then the 10 percent of sentence pairs that are least similar are split. So if you have 101 sentences, 100 sentence pairs are compared, and the 10 pairs with the least similarity are split, creating 11 chunks. These chunks are split further if they exceed the maximum token size.
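
The arithmetic in the example above can be sketched in plain Java. This is only an illustration of the percentile rule as described here, not the service's implementation. The class and method names (PercentileSplitSketch, splitCount, chunkCount) are hypothetical, the sketch does not use the SDK, and rounding down when the pair count is not a multiple of 100 is an assumption.

```java
public class PercentileSplitSketch {

    // Number of adjacent sentence pairs in a document with the given number of sentences.
    public static int pairCount(int sentences) {
        return Math.max(0, sentences - 1);
    }

    // Number of pairs that fall among the least similar (100 - threshold) percent
    // and are therefore split. Integer division rounds down (an assumption).
    public static int splitCount(int sentences, int thresholdPercentile) {
        return pairCount(sentences) * (100 - thresholdPercentile) / 100;
    }

    // Each split adds one chunk; an empty document yields no chunks.
    // This ignores any further splitting caused by the maximum token size.
    public static int chunkCount(int sentences, int thresholdPercentile) {
        if (sentences <= 0) {
            return 0;
        }
        return splitCount(sentences, thresholdPercentile) + 1;
    }

    public static void main(String[] args) {
        // The example from the description: 101 sentences, threshold 90.
        System.out.println("pairs  = " + pairCount(101));
        System.out.println("splits = " + splitCount(101, 90));
        System.out.println("chunks = " + chunkCount(101, 90));
    }
}
```

Under these assumptions, a higher threshold produces fewer, larger chunks, and a lower threshold produces more, smaller ones.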

You must also specify a buffer size, which determines whether sentences are compared in isolation, or within a moving context window that includes the previous and following sentence. For example, if you set the buffer size to 1, the embedding for sentence 10 is derived from sentences 9, 10, and 11 combined.
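
The context window in the example above can be sketched as follows. This is a minimal illustration, not the service's implementation. The names BufferWindowSketch and window are hypothetical. Treating a buffer size of 0 as "in isolation", extending the window symmetrically for larger buffer sizes, and clamping it at the start and end of the document are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

public class BufferWindowSketch {

    // Returns the 1-based sentence numbers that would be combined when
    // embedding the given sentence, clamped to the document bounds (assumed).
    public static List<Integer> window(int sentence, int bufferSize, int totalSentences) {
        List<Integer> result = new ArrayList<>();
        int first = Math.max(1, sentence - bufferSize);
        int last = Math.min(totalSentences, sentence + bufferSize);
        for (int i = first; i <= last; i++) {
            result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        // Buffer size 1: sentence 10 is combined with sentences 9 and 11.
        System.out.println(window(10, 1, 101));
        // Buffer size 0 (assumed to mean "in isolation"): only sentence 10.
        System.out.println(window(10, 0, 101));
    }
}
```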

  • Method Details

    • breakpointPercentileThreshold

      public final Integer breakpointPercentileThreshold()

      The dissimilarity threshold for splitting chunks.

      Returns:
      The dissimilarity threshold for splitting chunks.
    • bufferSize

      public final Integer bufferSize()

      The buffer size.

      Returns:
      The buffer size.
    • maxTokens

      public final Integer maxTokens()

      The maximum number of tokens that a chunk can contain.

      Returns:
      The maximum number of tokens that a chunk can contain.
    • toBuilder

      Description copied from interface: ToCopyableBuilder
      Take this object and create a builder that contains all of the current property values of this object.
      Specified by:
      toBuilder in interface ToCopyableBuilder<SemanticChunkingConfiguration.Builder,SemanticChunkingConfiguration>
      Returns:
      a builder for type T
    • builder

      public static SemanticChunkingConfiguration.Builder builder()
    • serializableBuilderClass

      public static Class<? extends SemanticChunkingConfiguration.Builder> serializableBuilderClass()
    • hashCode

      public final int hashCode()
      Overrides:
      hashCode in class Object
    • equals

      public final boolean equals(Object obj)
      Overrides:
      equals in class Object
    • equalsBySdkFields

      public final boolean equalsBySdkFields(Object obj)
      Description copied from interface: SdkPojo
      Indicates whether some other object is "equal to" this one by SDK fields. An SDK field is a modeled, non-inherited field in an SdkPojo class, and is generated based on a service model.

      If an SdkPojo class does not have any inherited fields, equalsBySdkFields and equals are essentially the same.

      Specified by:
      equalsBySdkFields in interface SdkPojo
      Parameters:
      obj - the object to be compared with
      Returns:
      true if the other object is equal to this object by SDK fields, false otherwise.
    • toString

      public final String toString()
      Returns a string representation of this object. This is useful for testing and debugging. Sensitive data will be redacted from this string using a placeholder value.
      Overrides:
      toString in class Object
    • getValueForField

      public final <T> Optional<T> getValueForField(String fieldName, Class<T> clazz)
    • sdkFields

      public final List<SdkField<?>> sdkFields()
      Specified by:
      sdkFields in interface SdkPojo
      Returns:
      List of SdkField in this POJO. May be empty list but should never be null.
    • sdkFieldNameToField

      public final Map<String,SdkField<?>> sdkFieldNameToField()
      Specified by:
      sdkFieldNameToField in interface SdkPojo
      Returns:
      The mapping between the field name and its corresponding field.