The metrics of an existing endpoint that are compared against the benchmarked instances during an Inference Recommender job.
The expected maximum number of requests per minute for the instance.
The expected model latency when the instance is serving its maximum number of invocations per minute.
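A minimal sketch of how these per-endpoint values might be pulled out of an Inference Recommender job result. The response shape below mirrors what a `DescribeInferenceRecommendationsJob` call returns as I understand it (`EndpointPerformances` entries whose `Metrics` carry `MaxInvocations` and `ModelLatency`), but the helper function, the sample dictionary, and its numbers are all hypothetical illustrations, not output from a real job.

```python
def summarize_endpoint_metrics(response):
    """Collect the comparison metrics for each existing endpoint in the job.

    `response` is assumed to be shaped like a DescribeInferenceRecommendationsJob
    result; treat the exact key names as an assumption to verify against the API.
    """
    summaries = []
    for perf in response.get("EndpointPerformances", []):
        metrics = perf.get("Metrics", {})
        summaries.append({
            # Expected maximum number of requests per minute for the instance.
            "MaxInvocations": metrics.get("MaxInvocations"),
            # Expected model latency at that maximum invocation rate.
            "ModelLatency": metrics.get("ModelLatency"),
        })
    return summaries


# Hypothetical sample response for illustration only.
sample = {
    "EndpointPerformances": [
        {"Metrics": {"MaxInvocations": 360, "ModelLatency": 185}},
    ]
}

print(summarize_endpoint_metrics(sample))
```

In practice the response would come from the SageMaker API (for example via an AWS SDK call) rather than a hand-built dictionary.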