docpart
Purpose
Finds text records that contain the given arguments within the document section specified by the first parameter of the function. If the arguments are omitted, the function matches all sections of the specified type.
Arguments
The list of section_type parameter values and optional named arguments below is exhaustive.
The function also takes the following optional parameters:
- ocr finds documents containing words that the PolyAnalyst OCR module recognized with a high recognition confidence score.
- confidence sets the confidence range of OCR recognition.
- rotated/unrotated searches for rotated or unrotated text (set to "unrotated" by default).
- degree sets the degree of rotation (e.g., 15, 16, 30.5, 45, 90).
- type (horizontal/vertical) sets the type of rotation: horizontal or vertical.
- scope (token/sentence/paragraph/text) specifies whether to output results by tokens, sentences, paragraphs, or the entire text (scope:=text by default). The parameter works if there are no nested arguments.
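The rotation conventions used by the degree parameter (values within [-180; 180], positive for clockwise, negative for counterclockwise, as stated in the notes of this section) can be illustrated with a short Python sketch. It is not part of PolyAnalyst: the function name rotation_direction is hypothetical, and treating 0 degrees as unrotated text is an assumption made only for this example.

```python
def rotation_direction(degree: float) -> str:
    """Classify a rotation value using the conventions stated in this section."""
    # Rotation values are limited to the range [-180, 180] degrees.
    if not -180 <= degree <= 180:
        raise ValueError(f"rotation must be within [-180, 180], got {degree}")
    if degree > 0:
        return "clockwise"
    if degree < 0:
        return "counterclockwise"
    # Assumption: 0 degrees is treated here as unrotated text.
    return "unrotated"


for value in (30.5, -90, 0):
    print(value, rotation_direction(value))
```

A value such as 270 raises ValueError in this sketch; how the product itself reacts to an out-of-range value is not described here.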
Note
- The page parameter supports the docx and pdf formats.
- To search within several sections, list them separated by the "|" symbol.
- If the attributes are omitted, the function matches all sections of the specified type.
- The relational operators ">", "<", ">=", "<=", "!=" can be used to constrain numerical parameters, e.g. docpart(table, col:>1, col:<3, row:>1).
- The docpart function matches the intersection of the query with the table sections or pages set by the number argument. Therefore, the matched text may lie only partially within the specified table sections or on the specified pages.
- The optional named attribute number of the page parameter can take a negative value. In this case, pages are counted from the last page of the document. For example, number:"-1" limits the query to the last page, and number:>="-2" limits the query to the last two pages.
- The hyperlink parameter finds hyperlinks only in HTML pages. To use the parameter, connect the node to an already executed parent node Internet source.
- Searching for rotated text is possible in .docx documents and in documents recognized by OCR. Document rotation is considered in two directions: clockwise (positive value) and counterclockwise (negative value). The rotation value is set in the range [-180; 180] degrees.
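The way a negative page number combines with the relational operators can be illustrated with a short Python sketch. It only mirrors the behavior described in the notes above and is not PolyAnalyst code: the names resolve_page and select_pages are hypothetical, and "==" stands in for a plain number value without an operator.

```python
import operator
from typing import List

# Relational operators from the notes; "==" is an assumption that stands in
# for a plain number value given without an operator.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def resolve_page(number: int, total_pages: int) -> int:
    """Map a possibly negative page number to a 1-based page index."""
    # -1 is the last page, -2 the page before it, and so on.
    return total_pages + number + 1 if number < 0 else number


def select_pages(op: str, number: int, total_pages: int) -> List[int]:
    """Return the 1-based pages that satisfy the given comparison."""
    target = resolve_page(number, total_pages)
    compare = OPS[op]
    return [page for page in range(1, total_pages + 1) if compare(page, target)]


print(select_pages("==", -1, 10))  # [10]
print(select_pages(">=", -2, 10))  # [9, 10]
```

For a ten-page document, the sketch resolves -1 to page 10 and -2 to page 9, so the ">=" comparison keeps the last two pages, which matches the number:>="-2" example above.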