Interface ProductionVariantManagedInstanceScalingScaleInPolicy.Builder

  • Method Details

    • strategy

      The strategy for scaling in instances.

      IDLE_RELEASE

      Releases instances that have no hosted inference component copies.

      CONSOLIDATION

      Consolidates inference component copies onto fewer instances to release more instances. Consolidation honors the scheduling configuration of each inference component. For example, if an inference component specifies Availability Zone balance, consolidation only proceeds when the resulting distribution does not increase the imbalance.

      Parameters:
      strategy - The strategy for scaling in instances.

      Returns:
      Returns a reference to this object so that method calls can be chained together.
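The difference between the two strategies can be illustrated with a toy model. The sketch below is not the service's algorithm: the class name, the method names, and the uniform per-instance capacity are hypothetical, and it ignores the scheduling configuration (such as Availability Zone balance) that consolidation is documented to honor.

```java
// Toy model only: all names and the uniform capacity are assumptions for illustration.
class ScaleInStrategySketch {

    /** IDLE_RELEASE: only instances that host zero inference component copies can be released. */
    static int idleReleasable(int[] copiesPerInstance) {
        int releasable = 0;
        for (int copies : copiesPerInstance) {
            if (copies == 0) {
                releasable++;
            }
        }
        return releasable;
    }

    /**
     * CONSOLIDATION (simplified): if copies are repacked onto as few instances as possible,
     * every instance beyond that minimum could be released. Assumes each instance can host
     * up to capacityPerInstance copies, which is a hypothetical simplification.
     */
    static int consolidationReleasable(int[] copiesPerInstance, int capacityPerInstance) {
        if (capacityPerInstance < 1) {
            throw new IllegalArgumentException("capacityPerInstance must be at least 1");
        }
        int totalCopies = 0;
        for (int copies : copiesPerInstance) {
            totalCopies += copies;
        }
        // Ceiling division: the fewest instances that can hold all copies.
        int instancesNeeded = (totalCopies + capacityPerInstance - 1) / capacityPerInstance;
        return copiesPerInstance.length - instancesNeeded;
    }

    public static void main(String[] args) {
        int[] copies = {2, 1, 0, 1}; // four instances, one of them idle
        System.out.println(idleReleasable(copies));             // prints 1
        System.out.println(consolidationReleasable(copies, 4)); // prints 3
    }
}
```

In this model IDLE_RELEASE frees only the one idle instance, while CONSOLIDATION could free three by repacking the four copies onto a single instance.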
    • strategy

      The strategy for scaling in instances.

      IDLE_RELEASE

      Releases instances that have no hosted inference component copies.

      CONSOLIDATION

      Consolidates inference component copies onto fewer instances to release more instances. Consolidation honors the scheduling configuration of each inference component. For example, if an inference component specifies Availability Zone balance, consolidation only proceeds when the resulting distribution does not increase the imbalance.

      Parameters:
      strategy - The strategy for scaling in instances.

      Returns:
      Returns a reference to this object so that method calls can be chained together.
    • maximumStepSize

      The maximum number of instances that the endpoint can terminate at a time during a consolidation scale-in operation.

      Default value: 1.

      Parameters:
      maximumStepSize - The maximum number of instances that the endpoint can terminate at a time during a consolidation scale-in operation.

      Returns:
      Returns a reference to this object so that method calls can be chained together.
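As a rough illustration of what the step size implies, the sketch below computes the fewest scale-in operations needed to release a given number of instances. The class and method names are hypothetical, and it assumes every operation terminates the full maximumStepSize, which the service does not guarantee.

```java
// Hypothetical helper for illustration; not part of the SDK.
class StepSizeSketch {

    /** Fewest operations needed to release toRelease instances, at most maximumStepSize per operation. */
    static int minOperations(int toRelease, int maximumStepSize) {
        if (toRelease < 0) {
            throw new IllegalArgumentException("toRelease must not be negative");
        }
        if (maximumStepSize < 1) {
            throw new IllegalArgumentException("maximumStepSize must be at least 1");
        }
        // Ceiling division without floating point.
        return (toRelease + maximumStepSize - 1) / maximumStepSize;
    }

    public static void main(String[] args) {
        System.out.println(minOperations(5, 1)); // default step size: prints 5
        System.out.println(minOperations(5, 2)); // prints 3
    }
}
```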
    • cooldownInMinutes

      The cooldown period, in minutes, after the last endpoint operation before the endpoint evaluates consolidation scale-in opportunities.

      Default value: 20.

      Parameters:
      cooldownInMinutes - The cooldown period, in minutes, after the last endpoint operation before the endpoint evaluates consolidation scale-in opportunities.

      Returns:
      Returns a reference to this object so that method calls can be chained together.
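Because each setter returns the builder, the three settings can be chained. The sketch below uses a self-contained stand-in class (ScaleInPolicySketch, a hypothetical name) rather than the SDK type, so that it runs without the SDK on the classpath. The stand-in applies the documented defaults (1 and 20) locally. This section does not say where the real defaults are applied, so that part is an assumption.

```java
// Stand-in that mirrors the documented setters; it is not the SDK class.
class ScaleInPolicySketch {
    private final String strategy;
    private final int maximumStepSize;
    private final int cooldownInMinutes;

    private ScaleInPolicySketch(Builder builder) {
        this.strategy = builder.strategy;
        this.maximumStepSize = builder.maximumStepSize;
        this.cooldownInMinutes = builder.cooldownInMinutes;
    }

    static Builder builder() {
        return new Builder();
    }

    String strategy() { return strategy; }
    int maximumStepSize() { return maximumStepSize; }
    int cooldownInMinutes() { return cooldownInMinutes; }

    static final class Builder {
        private String strategy;            // no default is documented for the strategy
        private int maximumStepSize = 1;    // documented default value: 1
        private int cooldownInMinutes = 20; // documented default value: 20

        // Each setter returns this builder so that calls can be chained.
        Builder strategy(String strategy) { this.strategy = strategy; return this; }
        Builder maximumStepSize(int maximumStepSize) { this.maximumStepSize = maximumStepSize; return this; }
        Builder cooldownInMinutes(int cooldownInMinutes) { this.cooldownInMinutes = cooldownInMinutes; return this; }

        ScaleInPolicySketch build() { return new ScaleInPolicySketch(this); }
    }

    public static void main(String[] args) {
        ScaleInPolicySketch policy = ScaleInPolicySketch.builder()
                .strategy("CONSOLIDATION")
                .maximumStepSize(2)
                .cooldownInMinutes(30)
                .build();
        System.out.println(policy.strategy() + " " + policy.maximumStepSize() + " " + policy.cooldownInMinutes());
    }
}
```

With the SDK available, the same chain of strategy, maximumStepSize, and cooldownInMinutes calls would presumably be made on the real builder. How that builder is obtained and finalized is not covered in this section.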