By default, PDL queries are run on the column selected in the Index node. To search in a different text column, indicate the column name explicitly after the query using the @ColumnName syntax.

This syntax applies to text columns only. For information on how to search in non-text columns, please see the "Using SRL language in PDL queries" section.

Column names are case-sensitive and must be enclosed in square brackets if they contain spaces; otherwise, the brackets are optional.
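
The rule for column names can be sketched as a small string-building helper. The Python snippet below is purely illustrative: the function name qualify_query is hypothetical and is not part of PDL or of any product API. It only assembles query text in the @ColumnName notation, adding square brackets when the column name contains a space, as in the examples in this section.

```python
def qualify_query(query: str, column: str) -> str:
    """Append an @ColumnName qualifier to a PDL query string.

    Hypothetical helper for illustration only; it builds text and
    does not validate the query or check that the column exists.
    """
    if not column:
        raise ValueError("column name must not be empty")
    # Brackets are required only when the name contains a space;
    # for names without spaces they are optional, so we omit them.
    target = f"[{column}]" if " " in column else column
    return f"{query}@{target}"


print(qualify_query('"check engine"', "Description"))
# "check engine"@Description
print(qualify_query('"check engine"', "Customer comments"))
# "check engine"@[Customer comments]
```

Because column names are case-sensitive, the helper passes the name through unchanged rather than normalizing its case.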

Example

"check engine"@Description and "check engine"@[Description] are equivalent, correct queries. Each matches the documents from the Description column if they contain the phrase "check engine". The brackets are optional in this case because the column name does not contain spaces.

"check engine"@[Customer comments] is a correct query and matches the documents from the Customer comments column if they contain the phrase "check engine".

"check engine"@Customer comments is an incorrect query because the column name contains spaces and thus must be enclosed in brackets.

Task example: Car complaint analysis

Consider the task of analyzing car complaints. Suppose there is a dataset with two columns, "Customer comments" and "Technician comments", and the "Customer comments" column is selected as the indexing column:

[Image: pdl multicol 1]

By default, PDL queries are run on "Customer comments" because it was selected as the indexing column. To search in the "Technician comments" column, identify the column name explicitly using the @ColumnName syntax.

For example, in order to match documents where technicians mentioned road tests, one can use the following query:

[Image: pdl multicol 2]

If a query consists of several operands joined by the OR, AND, NOT, AND NOT, or XOR operators, each operand can be set to search in a specific column. Such multicolumn queries specify search conditions for different columns in a single query.

Task example: Car complaint analysis

Let us consider again the dataset from the previous example. To search for door or window problems caused by a failed actuator, use a multicolumn query:

[Image: pdl multicol 4]

The query consists of two operands:

  1. "door or window" is run on the default search column, i.e., the "Customer comments" column.

  2. "near(3, actuator, fail or stuck or problem or "not work")" is run on the "Technician comments" column.

The query matches the documents that contain the word door or window in the "Customer comments" column and phrases like "failed actuator" or "actuator problem" in the "Technician comments" column.
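
How each operand is resolved to its own column can be modeled with a toy Python sketch. This is not how PDL is implemented, and all names in it (Operand, operand_matches, query_matches) are hypothetical. It makes two simplifying assumptions: the operands are joined with AND, and the near(...) operand is reduced to a plain check for the word "actuator". Only the column resolution is the point: an operand without an explicit column falls back to the default search column.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence

# Assumption: the column selected in the Index node, as in the example.
DEFAULT_COLUMN = "Customer comments"


@dataclass
class Operand:
    """One query operand: alternative words plus an optional target column."""
    words: Sequence[str]          # the operand matches if ANY word occurs
    column: Optional[str] = None  # None means "use the default search column"


def operand_matches(operand: Operand, record: Dict[str, str]) -> bool:
    # Use the explicitly named column if given, the default one otherwise.
    column = operand.column or DEFAULT_COLUMN
    tokens = record[column].lower().split()
    return any(word in tokens for word in operand.words)


def query_matches(operands: Sequence[Operand], record: Dict[str, str]) -> bool:
    # Operands are joined with AND in this sketch: every one must match.
    return all(operand_matches(op, record) for op in operands)


operands = [
    Operand(words=["door", "window"]),                          # default column
    Operand(words=["actuator"], column="Technician comments"),  # explicit column
]

record = {
    "Customer comments": "rear door will not open",
    "Technician comments": "replaced failed actuator",
}
print(query_matches(operands, record))  # True
```

A record with the same words in the opposite columns would not match, because each operand is checked only against its own column.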