DataProcessing
The data structure used to specify the data to be used for inference in a batch transform job and to associate the data that is relevant to the prediction results in the output. The input filter provided allows you to exclude input data that is not needed for inference in a batch transform job. The output filter provided allows you to include input data relevant to interpreting the predictions in the output from the job. For more information, see Associate Prediction Results with their Corresponding Input Records.
Types
Properties
InputFilter
A JSONPath expression used to select a portion of the input data to pass to the algorithm. Use the InputFilter parameter to exclude fields, such as an ID column, from the input. If you want SageMaker to pass the entire input dataset to the algorithm, accept the default value $.
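As a rough local illustration of the input filter described above, the sketch below mimics what a filter such as $[1:] does to a single CSV record. The record contents are hypothetical, and Python list slicing stands in for the JSONPath evaluation; this is not how SageMaker implements the filter.

```python
# Local illustration only: mimics the effect of an InputFilter of "$[1:]"
# on one CSV record. The record below is a made-up example.
record = "id-001,5.1,3.5,1.4"  # first column is an ID the model does not need

columns = record.split(",")
features = columns[1:]  # "$[1:]" keeps every column except the first

filtered_record = ",".join(features)
print(filtered_record)
```

With the default value $, the whole record would be passed to the algorithm unchanged.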
JoinSource
Specifies the source of the data to join with the transformed data. The valid values are None and Input. The default value is None, which specifies not to join the input with the transformed data. If you want the batch transform job to join the original input data with the transformed data, set JoinSource to Input. You can specify OutputFilter as an additional filter to select a portion of the joined dataset and store it in the output file.
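The difference between the two JoinSource values can be sketched locally for CSV data. The helper join_with_input is hypothetical, and the sketch assumes that a CSV join appends the prediction after the input columns; treat it as an approximation of the behavior rather than the service's implementation.

```python
def join_with_input(input_record: str, prediction: str, join_source: str = "None") -> str:
    """Hypothetical helper that mimics JoinSource for a single CSV record."""
    if join_source == "Input":
        # Assumption: the prediction is appended after the original input columns.
        return f"{input_record},{prediction}"
    if join_source == "None":
        # Default: the output holds only the transformed data.
        return prediction
    raise ValueError("JoinSource must be 'None' or 'Input'")


joined = join_with_input("id-001,5.1,3.5,1.4", "0.87", join_source="Input")
not_joined = join_with_input("id-001,5.1,3.5,1.4", "0.87")
print(joined)
print(not_joined)
```

Note that the join uses the original input record, including any fields that InputFilter excluded from inference, which is what lets an ID column reappear next to its prediction.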
OutputFilter
A JSONPath expression used to select a portion of the joined dataset to save in the output file for a batch transform job. If you want SageMaker to store the entire input dataset in the output file, leave the default value, $. If you specify indexes that aren't within the dimension size of the joined dataset, you get an error.
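Putting the three properties together, a DataProcessing structure might look like the dictionary below, in the shape accepted by the DataProcessing parameter of the boto3 create_transform_job call. The filter values and the joined record are illustrative assumptions, and the output filter is imitated with plain list indexing rather than a JSONPath engine.

```python
# Sketch of a DataProcessing structure; the filter values are example choices.
data_processing = {
    "InputFilter": "$[1:]",     # drop the first column (an ID) before inference
    "JoinSource": "Input",      # join predictions with the original input
    "OutputFilter": "$[0,-1]",  # keep only the first and last joined columns
}

# Hypothetical joined record: original input columns followed by the prediction.
joined_columns = "id-001,5.1,3.5,1.4,0.87".split(",")

# Local stand-in for "$[0,-1]": select the ID and the prediction.
selected = [joined_columns[0], joined_columns[-1]]
output_record = ",".join(selected)
print(output_record)
```

An index outside the joined record, for example position 10 in this five-column record, is the kind of out-of-range selection that the description above says results in an error.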