Block

A Block represents items that are recognized in a document within a group of pixels close to each other. The information returned in a Block object depends on the type of operation. In text detection for documents (for example DetectDocumentText), you get information about the detected words and lines of text. In text analysis (for example AnalyzeDocument), you can also get information about the fields, tables, and selection elements that are detected in the document.

An array of Block objects is returned by both synchronous and asynchronous operations. In synchronous operations, such as DetectDocumentText, the array of Block objects is the entire set of results. In asynchronous operations, such as GetDocumentAnalysis, the array is returned over one or more responses.
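
Because an asynchronous operation may spread its Block array over several responses, callers usually accumulate the pieces before processing them. The following is a minimal sketch of that loop, not the SDK's API: `ResultPage`, `collectAll`, and the `fetch` callback are hypothetical names, and the use of a `nextToken` value to request the next response is an assumption that this page does not describe. Real SDK calls are typically suspending functions; the sketch is kept synchronous for brevity.

```kotlin
// Hypothetical stand-in for one response: a slice of the results plus an
// optional token that identifies the next slice (assumption).
data class ResultPage<T>(val blocks: List<T>, val nextToken: String?)

// Call fetch repeatedly, starting with a null token, until a response
// carries no further token, and return all blocks in arrival order.
fun <T> collectAll(fetch: (String?) -> ResultPage<T>): List<T> {
    val all = mutableListOf<T>()
    var token: String? = null
    do {
        val page = fetch(token)
        all += page.blocks
        token = page.nextToken
    } while (token != null)
    return all
}
```

In a synchronous operation the same function degenerates to a single call, since the first response already holds the entire set of results.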

For more information, see How Amazon Textract Works.

Types

Builder

class Builder

Companion

object Companion

Properties

blockType

val blockType: BlockType?

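The type of text item that's recognized. The types returned depend on the operation: text detection returns blocks for the detected lines and words, while text analysis can also return blocks for fields, tables, and selection elements.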

columnIndex

val columnIndex: Int?

The column in which a table cell appears. The first column position is 1. ColumnIndex isn't returned by DetectDocumentText and GetDocumentTextDetection.

columnSpan

val columnSpan: Int?

The number of columns that a table cell spans. ColumnSpan isn't returned by DetectDocumentText and GetDocumentTextDetection.

confidence

val confidence: Float?

The confidence score that Amazon Textract has in the accuracy of the recognized text and the accuracy of the geometry points around the recognized text.
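
One common use of this score is to discard low-confidence blocks before further processing. The sketch below assumes a simplified stand-in type, `ScoredBlock`, that mirrors only the `text` and `confidence` properties listed on this page; it is not the SDK class, and the threshold is a value the caller chooses.

```kotlin
// Hypothetical stand-in that mirrors two nullable properties of Block.
data class ScoredBlock(val text: String?, val confidence: Float?)

// Keep blocks whose confidence is present and at or above minConfidence.
// Blocks with a null confidence are dropped rather than assumed reliable.
fun keepConfident(blocks: List<ScoredBlock>, minConfidence: Float): List<ScoredBlock> =
    blocks.filter { block ->
        val score = block.confidence
        score != null && score >= minConfidence
    }
```

Treating a missing score as a failure is a design choice; a caller that prefers to keep unscored blocks could invert that null check.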

entityTypes

val entityTypes: List<EntityType>?

The type of entity.

geometry

val geometry: Geometry?

The location of the recognized text on the image. It includes an axis-aligned, coarse bounding box that surrounds the text, and a finer-grain polygon for more accurate spatial information.

id

val id: String?

The identifier for the recognized text. The identifier is only unique for a single operation.

page

val page: Int?

The page on which a block was detected. Page is returned by synchronous and asynchronous operations. Page values greater than 1 are only returned for multipage documents that are in PDF or TIFF format. A scanned image (JPEG/PNG) provided to an asynchronous operation, even if it contains multiple document pages, is considered a single-page document. This means that for scanned images the value of Page is always 1.
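
For multipage documents it is often convenient to bucket blocks by their page value. The following sketch uses a hypothetical stand-in, `PagedBlock`, carrying only the `id` and `page` properties; skipping blocks that have no page value is an assumption of the sketch, not documented behavior.

```kotlin
// Hypothetical stand-in that mirrors the id and page properties of Block.
data class PagedBlock(val id: String?, val page: Int?)

// Group blocks by page number, with pages in ascending order.
// Blocks whose page is null are skipped in this sketch.
fun blocksByPage(blocks: List<PagedBlock>): Map<Int, List<PagedBlock>> =
    blocks.mapNotNull { block -> block.page?.let { it to block } }
        .groupBy({ it.first }, { it.second })
        .toSortedMap()
```

For a scanned image the resulting map would hold a single entry, because the value of Page is always 1 in that case.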

query

val query: Query?

relationships

val relationships: List<Relationship>?

A list of relationship objects that describe how blocks are related to each other. For example, a LINE block object contains a CHILD relationship type with the WORD blocks that make up the line of text. The list contains no Relationship object for a relationship that doesn't exist, such as when the current block has no child blocks.
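
The LINE-to-WORD example can be sketched as a lookup from a block to its children. This is an illustration with hypothetical stand-in types (`RelStub`, `BlockStub`), not the SDK classes: block and relationship types are modeled as plain strings, and the `type` and `ids` fields on the relationship stand-in are assumptions about that type's shape, which this page does not describe.

```kotlin
// Hypothetical stand-ins; only the properties needed for the lookup are modeled.
data class RelStub(val type: String?, val ids: List<String>?)
data class BlockStub(
    val id: String?,
    val blockType: String?,
    val text: String?,
    val relationships: List<RelStub>?
)

// Index blocks by id so that relationship ids can be resolved quickly.
// Blocks without an id cannot be referenced and are left out of the index.
fun indexById(blocks: List<BlockStub>): Map<String, BlockStub> =
    blocks.mapNotNull { block -> block.id?.let { it to block } }.toMap()

// Resolve the CHILD relationships of a block to the blocks they point at.
// A null or empty relationship list yields an empty result.
fun childrenOf(block: BlockStub, byId: Map<String, BlockStub>): List<BlockStub> =
    block.relationships.orEmpty()
        .filter { it.type == "CHILD" }
        .flatMap { it.ids.orEmpty() }
        .mapNotNull { byId[it] }
```

Since identifiers are only unique within a single operation, the index should be built from the blocks of one operation's results and not reused across operations.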

rowIndex

val rowIndex: Int?

The row in which a table cell is located. The first row position is 1. RowIndex isn't returned by DetectDocumentText and GetDocumentTextDetection.

rowSpan

val rowSpan: Int?

The number of rows that a table cell spans. RowSpan isn't returned by DetectDocumentText and GetDocumentTextDetection.
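
Taken together, the row and column indices and spans are enough to lay table cells out on a grid. The sketch below shows one way to do that under stated assumptions: `CellStub` is a hypothetical stand-in rather than the SDK class, `content` is a hypothetical field holding whatever text the caller has already resolved for the cell, a missing span is treated as 1, and indices are assumed to be valid 1-based positions.

```kotlin
// Hypothetical stand-in mirroring the four table-position properties of Block.
data class CellStub(
    val rowIndex: Int?,
    val columnIndex: Int?,
    val rowSpan: Int?,
    val columnSpan: Int?,
    val content: String?
)

// A cell with zero-based origin and concrete spans.
private data class Placed(val row: Int, val col: Int, val rows: Int, val cols: Int, val content: String?)

// Build a row-major grid. A spanning cell repeats its content in every
// position it covers; positions that no cell covers stay null.
fun toGrid(cells: List<CellStub>): List<List<String?>> {
    val placed = cells.mapNotNull { cell ->
        val row = cell.rowIndex ?: return@mapNotNull null
        val col = cell.columnIndex ?: return@mapNotNull null
        // Convert 1-based indices to zero-based; default a missing span to 1.
        Placed(row - 1, col - 1, cell.rowSpan ?: 1, cell.columnSpan ?: 1, cell.content)
    }
    val rowCount = placed.maxOfOrNull { it.row + it.rows } ?: 0
    val colCount = placed.maxOfOrNull { it.col + it.cols } ?: 0
    val grid = MutableList(rowCount) { MutableList<String?>(colCount) { null } }
    for (cell in placed) {
        for (r in cell.row until cell.row + cell.rows) {
            for (c in cell.col until cell.col + cell.cols) {
                grid[r][c] = cell.content
            }
        }
    }
    return grid
}
```

Repeating the content across a span keeps the grid rectangular; a caller that needs to distinguish the origin cell from its continuation could store a marker value instead.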

selectionStatus

val selectionStatus: SelectionStatus?

The selection status of a selection element, such as an option button or check box.

text

val text: String?

The word or line of text that's recognized by Amazon Textract.

textType

val textType: TextType?

The kind of text that Amazon Textract has detected: handwritten text or printed text.