blockType
The type of text item that's recognized. In operations for text detection, the following types are returned:
PAGE - Contains a list of the LINE Block objects that are detected on a document page.
WORD - A word detected on a document page. A word is one or more ISO basic Latin script characters that aren't separated by spaces.
LINE - A string of tab-delimited, contiguous words that are detected on a document page.
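To illustrate how the three text detection types fit together, the following sketch walks a hand-written response and rebuilds each line from its WORD blocks. The dictionary layout (a Blocks list whose items carry BlockType, Id, Text, and a Relationships list with CHILD entries) is an assumption about the shape of the JSON response. The sample data and the helper names child_ids and lines_from_blocks are hypothetical.

```python
from collections import Counter

# Hand-written sample that mimics the assumed response shape; not real service output.
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE", "Id": "p1",
         "Relationships": [{"Type": "CHILD", "Ids": ["l1"]}]},
        {"BlockType": "LINE", "Id": "l1", "Text": "Hello world",
         "Relationships": [{"Type": "CHILD", "Ids": ["w1", "w2"]}]},
        {"BlockType": "WORD", "Id": "w1", "Text": "Hello"},
        {"BlockType": "WORD", "Id": "w2", "Text": "world"},
    ]
}


def child_ids(block):
    """Return the Ids listed under a block's CHILD relationships."""
    ids = []
    for rel in block.get("Relationships", []):
        if rel["Type"] == "CHILD":
            ids.extend(rel["Ids"])
    return ids


def lines_from_blocks(blocks):
    """Rebuild each LINE as a list of the WORD texts it points to."""
    by_id = {b["Id"]: b for b in blocks}
    lines = []
    for block in blocks:
        if block["BlockType"] != "LINE":
            continue
        words = [by_id[i]["Text"] for i in child_ids(block)
                 if by_id[i]["BlockType"] == "WORD"]
        lines.append(words)
    return lines


# Count how many blocks of each type the sample contains.
type_counts = Counter(b["BlockType"] for b in sample_response["Blocks"])
detected_lines = lines_from_blocks(sample_response["Blocks"])
```

Indexing the blocks by Id first keeps each relationship lookup cheap, which matters because a page can hold many WORD blocks.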
In text analysis operations, the following types are returned:
PAGE - Contains a list of child Block objects that are detected on a document page.
KEY_VALUE_SET - Stores the KEY and VALUE Block objects for linked text that's detected on a document page. Use the EntityType field to determine if a KEY_VALUE_SET object is a KEY Block object or a VALUE Block object.
WORD - A word that's detected on a document page. A word is one or more ISO basic Latin script characters that aren't separated by spaces.
LINE - A string of tab-delimited, contiguous words that are detected on a document page.
TABLE - A table that's detected on a document page. A table is grid-based information with two or more rows or columns, with a cell span of one row and one column each.
TABLE_TITLE - The title of a table. A title is typically a line of text above or below a table, or embedded as the first row of a table.
TABLE_FOOTER - The footer associated with a table. A footer is typically a line or lines of text below a table or embedded as the last row of a table.
CELL - A cell within a detected table. The cell is the parent of the block that contains the text in the cell.
MERGED_CELL - A cell in a table whose content spans more than one row or column. The Relationships array for this cell contains data from individual cells.
SELECTION_ELEMENT - A selection element such as an option button (radio button) or a check box that's detected on a document page. Use the value of SelectionStatus to determine the status of the selection element.
SIGNATURE - The location and confidence score of a signature detected on a document page. Can be returned as part of a key-value pair or a detected cell.
QUERY - A question asked during the call of AnalyzeDocument. Contains an alias and an ID that attaches it to its answer.
QUERY_RESULT - A response to a question asked during the call of AnalyzeDocument. Comes with an alias and ID for ease of locating in a response. Also contains location and confidence score.
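As a rough sketch of how KEY_VALUE_SET and SELECTION_ELEMENT blocks might be consumed together, the example below pairs each KEY block with the text of its VALUE block. It rests on several assumptions that are not stated above: that the entity type appears in the response as a list named EntityTypes, that a KEY block points to its VALUE block through a relationship of type VALUE, and that both point to their WORD or SELECTION_ELEMENT blocks through CHILD relationships. The sample data and the helper names are hypothetical.

```python
# Hand-written sample that mimics the assumed response shape; not real service output.
sample_blocks = [
    {"BlockType": "KEY_VALUE_SET", "Id": "k1", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                       {"Type": "CHILD", "Ids": ["w1"]}]},
    {"BlockType": "KEY_VALUE_SET", "Id": "v1", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["w2"]}]},
    {"BlockType": "KEY_VALUE_SET", "Id": "k2", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v2"]},
                       {"Type": "CHILD", "Ids": ["w3"]}]},
    {"BlockType": "KEY_VALUE_SET", "Id": "v2", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["s1"]}]},
    {"BlockType": "WORD", "Id": "w1", "Text": "Name:"},
    {"BlockType": "WORD", "Id": "w2", "Text": "Ana"},
    {"BlockType": "WORD", "Id": "w3", "Text": "Subscribed:"},
    {"BlockType": "SELECTION_ELEMENT", "Id": "s1", "SelectionStatus": "SELECTED"},
]


def related_ids(block, rel_type):
    """Return the Ids listed under relationships of the given type."""
    return [i for rel in block.get("Relationships", [])
            if rel["Type"] == rel_type for i in rel["Ids"]]


def block_text(block, by_id):
    """Join the text of a block's WORD children; use SelectionStatus for selection elements."""
    parts = []
    for child_id in related_ids(block, "CHILD"):
        child = by_id[child_id]
        if child["BlockType"] == "WORD":
            parts.append(child["Text"])
        elif child["BlockType"] == "SELECTION_ELEMENT":
            parts.append(child["SelectionStatus"])
    return " ".join(parts)


def key_value_pairs(blocks):
    """Map the text of each KEY block to the text of its linked VALUE block(s)."""
    by_id = {b["Id"]: b for b in blocks}
    pairs = {}
    for block in blocks:
        if block["BlockType"] != "KEY_VALUE_SET":
            continue
        if "KEY" not in block.get("EntityTypes", []):
            continue  # VALUE blocks are reached through their KEY block
        values = [block_text(by_id[i], by_id) for i in related_ids(block, "VALUE")]
        pairs[block_text(block, by_id)] = " ".join(values)
    return pairs


pairs = key_value_pairs(sample_blocks)
```

Starting from KEY blocks and following the VALUE relationship avoids visiting each pair twice, since VALUE blocks share the same KEY_VALUE_SET type.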
The following BlockTypes are only returned for Amazon Textract Layout.
LAYOUT_TITLE - The main title of the document.
LAYOUT_HEADER - Text located in the top margin of the document.
LAYOUT_FOOTER - Text located in the bottom margin of the document.
LAYOUT_SECTION_HEADER - The titles of sections within a document.
LAYOUT_PAGE_NUMBER - The page number of the document.
LAYOUT_LIST - Any information grouped together in list form.
LAYOUT_FIGURE - Indicates the location of an image in a document.
LAYOUT_TABLE - Indicates the location of a table in the document.
LAYOUT_KEY_VALUE - Indicates the location of form key-values in a document.
LAYOUT_TEXT - Text that is present typically as a part of paragraphs in documents.
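One possible use of the layout types is to separate body content from margin text. The sketch below filters a hand-written block list by the LAYOUT_ prefix and then drops headers, footers, and page numbers. The BlockType and Id key names are an assumption about the response shape. The sample data, the MARGIN_TYPES grouping, and the helper names are hypothetical.

```python
# Layout types that describe margin text rather than body content (a grouping chosen for this sketch).
MARGIN_TYPES = {"LAYOUT_HEADER", "LAYOUT_FOOTER", "LAYOUT_PAGE_NUMBER"}

# Hand-written sample that mimics the assumed response shape; not real service output.
sample_blocks = [
    {"BlockType": "PAGE", "Id": "p1"},
    {"BlockType": "LAYOUT_HEADER", "Id": "h1"},
    {"BlockType": "LAYOUT_TITLE", "Id": "t1"},
    {"BlockType": "LAYOUT_TEXT", "Id": "x1"},
    {"BlockType": "LINE", "Id": "l1"},
    {"BlockType": "LAYOUT_TABLE", "Id": "tb1"},
    {"BlockType": "LAYOUT_PAGE_NUMBER", "Id": "n1"},
]


def layout_blocks(blocks):
    """Keep only blocks whose type starts with the LAYOUT_ prefix."""
    return [b for b in blocks if b["BlockType"].startswith("LAYOUT_")]


def body_layout_types(blocks):
    """List layout types in their original order, skipping margin text."""
    return [b["BlockType"] for b in layout_blocks(blocks)
            if b["BlockType"] not in MARGIN_TYPES]


all_layout_ids = [b["Id"] for b in layout_blocks(sample_blocks)]
body_types = body_layout_types(sample_blocks)
```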