HumanEvaluationConfig

Specifies the custom metrics, how tasks will be rated, the flow definition ARN, and your custom prompt datasets. Model evaluation jobs that use human workers support only custom prompt datasets. To learn more about custom prompt datasets and the required format, see Custom prompt datasets.

When you create custom metrics in HumanEvaluationCustomMetric, you must specify the metric's name. The list of names in the HumanEvaluationCustomMetric array must match the metricNames array of strings specified in EvaluationDatasetMetricConfig. For example, if the HumanEvaluationCustomMetric array specifies the names "accuracy", "toxicity", and "readability" as custom metrics, then the metricNames array in EvaluationDatasetMetricConfig would need to be ["accuracy", "toxicity", "readability"].
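
As a minimal sketch of this constraint, the following Kotlin snippet builds a HumanEvaluationConfig whose custom metric names line up with metricNames, using the builder DSL generated by the AWS SDK for Kotlin. The dataset name, S3 URI, flow definition ARN, and rating methods shown are placeholder/example values, not requirements of the API.

    import aws.sdk.kotlin.services.bedrock.model.EvaluationDataset
    import aws.sdk.kotlin.services.bedrock.model.EvaluationDatasetLocation
    import aws.sdk.kotlin.services.bedrock.model.EvaluationDatasetMetricConfig
    import aws.sdk.kotlin.services.bedrock.model.EvaluationTaskType
    import aws.sdk.kotlin.services.bedrock.model.HumanEvaluationConfig
    import aws.sdk.kotlin.services.bedrock.model.HumanEvaluationCustomMetric
    import aws.sdk.kotlin.services.bedrock.model.HumanWorkflowConfig

    val humanConfig = HumanEvaluationConfig {
        // Every custom metric must have a name; these names must match
        // the metricNames list in datasetMetricConfigs below.
        customMetrics = listOf(
            HumanEvaluationCustomMetric {
                name = "accuracy"
                ratingMethod = "ThumbsUpDown" // example rating method
                description = "Is the response factually correct?"
            },
            HumanEvaluationCustomMetric {
                name = "toxicity"
                ratingMethod = "ThumbsUpDown"
            },
            HumanEvaluationCustomMetric {
                name = "readability"
                ratingMethod = "IndividualLikertScale"
            },
        )
        // metricNames here mirrors the custom metric names declared above.
        datasetMetricConfigs = listOf(
            EvaluationDatasetMetricConfig {
                taskType = EvaluationTaskType.Summarization
                dataset = EvaluationDataset {
                    name = "my-prompt-dataset" // placeholder
                    datasetLocation = EvaluationDatasetLocation.S3Uri(
                        "s3://amzn-s3-demo-bucket/prompts.jsonl" // placeholder
                    )
                }
                metricNames = listOf("accuracy", "toxicity", "readability")
            }
        )
        // ARN of the SageMaker flow definition for the human review workflow.
        humanWorkflowConfig = HumanWorkflowConfig {
            flowDefinitionArn =
                "arn:aws:sagemaker:us-east-1:111122223333:flow-definition/my-flow" // placeholder
        }
    }

If the two lists diverge (for example, a custom metric named "accuracy" with no matching entry in metricNames), the service rejects the evaluation job request.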

Types

class Builder
object Companion

Properties

customMetrics

A HumanEvaluationCustomMetric object. It contains the names of the metrics, how the metrics are to be evaluated, and an optional description.

datasetMetricConfigs

Use to specify the metrics, task, and prompt dataset to be used in your model evaluation job.

humanWorkflowConfig

The parameters of the human workflow.

Functions

open operator override fun equals(other: Any?): Boolean
open override fun hashCode(): Int
open override fun toString(): String