Interface ProductionVariantManagedInstanceScalingScaleInPolicy.Builder
- All Superinterfaces:
Buildable, CopyableBuilder<ProductionVariantManagedInstanceScalingScaleInPolicy.Builder,ProductionVariantManagedInstanceScalingScaleInPolicy>, SdkBuilder<ProductionVariantManagedInstanceScalingScaleInPolicy.Builder,ProductionVariantManagedInstanceScalingScaleInPolicy>, SdkPojo
- Enclosing class:
ProductionVariantManagedInstanceScalingScaleInPolicy
-
Method Summary
- cooldownInMinutes(Integer cooldownInMinutes)
The cooldown period, in minutes, after the last endpoint operation before the endpoint evaluates consolidation scale-in opportunities.
- maximumStepSize(Integer maximumStepSize)
The maximum number of instances that the endpoint can terminate at a time during a consolidation scale-in operation.
- strategy(String strategy)
The strategy for scaling in instances.
- strategy(ManagedInstanceScalingScaleInStrategy strategy)
The strategy for scaling in instances.
- Methods inherited from interface software.amazon.awssdk.utils.builder.CopyableBuilder:
copy
- Methods inherited from interface software.amazon.awssdk.utils.builder.SdkBuilder:
applyMutation, build
- Methods inherited from interface software.amazon.awssdk.core.SdkPojo:
equalsBySdkFields, sdkFieldNameToField, sdkFields
-
Method Details
-
strategy
ProductionVariantManagedInstanceScalingScaleInPolicy.Builder strategy(String strategy)
The strategy for scaling in instances.
- IDLE_RELEASE: Releases instances that have no hosted inference component copies.
- CONSOLIDATION: Consolidates inference component copies onto fewer instances to release more instances. Consolidation honors the scheduling configuration of each inference component. For example, if an inference component specifies Availability Zone balance, consolidation proceeds only when the resulting distribution does not increase the imbalance.
- Parameters:
strategy - The strategy for scaling in instances: IDLE_RELEASE or CONSOLIDATION, as described above.
- Returns:
A reference to this object so that method calls can be chained together.
-
strategy
ProductionVariantManagedInstanceScalingScaleInPolicy.Builder strategy(ManagedInstanceScalingScaleInStrategy strategy)
The strategy for scaling in instances.
- IDLE_RELEASE: Releases instances that have no hosted inference component copies.
- CONSOLIDATION: Consolidates inference component copies onto fewer instances to release more instances. Consolidation honors the scheduling configuration of each inference component. For example, if an inference component specifies Availability Zone balance, consolidation proceeds only when the resulting distribution does not increase the imbalance.
- Parameters:
strategy - The strategy for scaling in instances: IDLE_RELEASE or CONSOLIDATION, as described above.
- Returns:
A reference to this object so that method calls can be chained together.
-
maximumStepSize
ProductionVariantManagedInstanceScalingScaleInPolicy.Builder maximumStepSize(Integer maximumStepSize)
The maximum number of instances that the endpoint can terminate at a time during a consolidation scale-in operation.
Default value: 1.
- Parameters:
maximumStepSize - The maximum number of instances that the endpoint can terminate at a time during a consolidation scale-in operation. Default value: 1.
- Returns:
A reference to this object so that method calls can be chained together.
-
cooldownInMinutes
ProductionVariantManagedInstanceScalingScaleInPolicy.Builder cooldownInMinutes(Integer cooldownInMinutes)
The cooldown period, in minutes, after the last endpoint operation before the endpoint evaluates consolidation scale-in opportunities.
Default value: 20.
- Parameters:
cooldownInMinutes - The cooldown period, in minutes, after the last endpoint operation before the endpoint evaluates consolidation scale-in opportunities. Default value: 20.
- Returns:
A reference to this object so that method calls can be chained together.
-
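Example (illustrative sketch)
The setters documented above each return the builder, so a policy can be assembled in a single chain. The following is a minimal, self-contained sketch of that pattern and does not use the SDK classes themselves: ScaleInStrategy, ScaleInPolicySketch, and ScaleInPolicyDemo are hypothetical stand-ins that mirror the documented method names so that the example runs with only the JDK. The values 2 and 30 are arbitrary. The sketch leaves unset fields as null and does not model the documented defaults (1 and 20), on the assumption that those are applied by the service rather than by the builder.

```java
// Hypothetical demo class; not part of the SDK.
class ScaleInPolicyDemo {
    public static void main(String[] args) {
        // Each setter returns the builder, so the calls can be chained.
        ScaleInPolicySketch policy = ScaleInPolicySketch.builder()
                .strategy(ScaleInStrategy.CONSOLIDATION)
                .maximumStepSize(2)      // arbitrary example value
                .cooldownInMinutes(30)   // arbitrary example value
                .build();
        System.out.println(policy.strategy() + " step=" + policy.maximumStepSize()
                + " cooldown=" + policy.cooldownInMinutes());
    }
}

// Hypothetical stand-in for ManagedInstanceScalingScaleInStrategy;
// the constant names are taken from the documentation above.
enum ScaleInStrategy { IDLE_RELEASE, CONSOLIDATION }

// Hypothetical stand-in for the model class and its Builder.
final class ScaleInPolicySketch {
    private final ScaleInStrategy strategy;
    private final Integer maximumStepSize;
    private final Integer cooldownInMinutes;

    private ScaleInPolicySketch(Builder b) {
        this.strategy = b.strategy;
        this.maximumStepSize = b.maximumStepSize;
        this.cooldownInMinutes = b.cooldownInMinutes;
    }

    static Builder builder() { return new Builder(); }

    ScaleInStrategy strategy() { return strategy; }
    Integer maximumStepSize() { return maximumStepSize; }
    Integer cooldownInMinutes() { return cooldownInMinutes; }

    static final class Builder {
        private ScaleInStrategy strategy;
        private Integer maximumStepSize;
        private Integer cooldownInMinutes;

        // String overload. Simplification of this sketch: an unknown name
        // is rejected by Enum.valueOf with an IllegalArgumentException.
        Builder strategy(String strategy) {
            this.strategy = strategy == null ? null : ScaleInStrategy.valueOf(strategy);
            return this;
        }

        // Enum overload.
        Builder strategy(ScaleInStrategy strategy) {
            this.strategy = strategy;
            return this;
        }

        Builder maximumStepSize(Integer maximumStepSize) {
            this.maximumStepSize = maximumStepSize;
            return this;
        }

        Builder cooldownInMinutes(Integer cooldownInMinutes) {
            this.cooldownInMinutes = cooldownInMinutes;
            return this;
        }

        ScaleInPolicySketch build() { return new ScaleInPolicySketch(this); }
    }
}
```

With the SDK on the classpath, the same chain would presumably start from the generated class's own builder rather than from the stand-in. That entry point is an assumption here, since this page documents only the Builder interface.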