intersect

Purpose

Finds documents that contain occurrences of the first argument which intersect with occurrences of the second argument.

Syntax

intersect(term_1, term_2)

Arguments

The function takes two required arguments, term_1 and term_2.

The function also supports the optional named parameter match, which determines what part of the text is matched and takes one of the following values:

Value               Explanation
first               the first argument entirely (default value)
second              the second argument entirely
intersection        arguments’ intersection
union               arguments’ union
difference          arguments’ symmetric difference, i.e. items which are in either of the sets, but not in their intersection
difference_first    items which are in the left, but not in the right set
difference_second   items which are in the right, but not in the left set
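The match values can be read as set operations over word positions. The following Python sketch is a toy model only, not the search engine's implementation: the helper names occurrences and intersect_model are hypothetical, text is split on whitespace, and all occurrences of the second argument that overlap an occurrence of the first are assumed to be merged before the set operation is applied.

```python
def occurrences(tokens, phrase):
    """Return the set of token positions covered by each occurrence of phrase."""
    words = phrase.split()
    n = len(words)
    return [set(range(i, i + n))
            for i in range(len(tokens) - n + 1)
            if tokens[i:i + n] == words]


def intersect_model(text, first, second, match="first"):
    """Toy model of the match modes, working on token positions (assumption)."""
    tokens = text.split()
    results = []
    for a in occurrences(tokens, first):
        # Collect every occurrence of the second argument that overlaps this one.
        overlapping = [b for b in occurrences(tokens, second) if a & b]
        if not overlapping:
            continue
        b = set().union(*overlapping)
        selected = {
            "first": a,
            "second": b,
            "intersection": a & b,
            "union": a | b,
            "difference": a ^ b,          # symmetric difference
            "difference_first": a - b,    # left but not right
            "difference_second": b - a,   # right but not left
        }[match]
        if selected:
            results.append(" ".join(tokens[i] for i in sorted(selected)))
    return results


print(intersect_model("a b b", "a b b", "b", match="second"))
print(intersect_model("a b b c", "a b b", "b c", match="difference_second"))
```

Under these assumptions the model reproduces the documented behavior for short inputs such as "a b b", but it says nothing about how the engine indexes or ranks documents.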

Moreover, the function supports the optional named parameter diff, which limits how many words the arguments may differ by. That is, diff is the number of words by which the matched occurrences of the two arguments differ. Two diff parameters can be given to set a lower and an upper bound on the allowed difference.

To limit a search, use one of the operators :=, :>, :>=, :<, :<= with a named parameter, as in diff:<=1.
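As a rough illustration of how diff bounds could be evaluated, the Python sketch below assumes that diff counts the word positions covered by exactly one of the two arguments. This reading is an assumption and is not confirmed by the description above. The names word_diff and satisfies are hypothetical.

```python
import operator

# Map the query operators to Python comparisons.
OPS = {
    ":=": operator.eq,
    ":>": operator.gt,
    ":>=": operator.ge,
    ":<": operator.lt,
    ":<=": operator.le,
}


def word_diff(first_positions, second_positions):
    """Number of word positions covered by exactly one argument (assumed meaning of diff)."""
    return len(set(first_positions) ^ set(second_positions))


def satisfies(diff, constraints):
    """Check a diff value against (operator, bound) pairs, e.g. [(":>=", 1), (":<=", 2)]."""
    return all(OPS[op](diff, bound) for op, bound in constraints)


# "Fairfax County Police Department" covers positions 0..3,
# "police department" covers positions 2..3 inside it.
d = word_diff(range(0, 4), range(2, 4))
print(d, satisfies(d, [(":>=", 1), (":<=", 2)]))
```

Passing two constraints corresponds to giving two diff parameters, one for the lower and one for the upper bound.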

Returned Value

Documents matching the query.

Examples

intersect("a b b", "b", match:=first) matches "a b b" in "a b b".

intersect("a b b", "b", match:=second) matches "b b" in "a b b".

intersect("a b b", "b", match:=difference) matches "a" in "a b b", i.e. all occurrences of the second argument are subtracted from the first one.

intersect("b", "a b b", match:=intersection) matches "b" and "b" as two separate matches (the first argument intersects the second twice).

intersect("a b b", "b", match:=union) matches "a b b" in "a b b".

intersect("a b b", "b", match:=difference_first) matches "a" in "a b b".

intersect("a b b", "b c", match:=difference_second) matches "c" in "a b b c".

intersect(entity(Organizations), "police department", diff:=0) matches organizations that are equal (including punctuation) to the phrase "police department".

intersect(entity(Organizations), "police department", diff:<=1) matches organizations that differ from "police department" by no more than one word, e.g. "Metropolitan Police Department".

intersect(entity(Organizations), "police department", diff:>=1, diff:<=2) matches organizations that differ from "police department" by one or two words, e.g. "Fairfax County Police Department".

Note
  • If positions of the second argument intersect the first argument, they will be matched entirely. For example, intersect("a b b", "b", match:=intersection) matches "b b" in the phrase "a b b".

  • The function is an alias for "term_1&term_2". However, this notation works in two modes. When complex arguments such as functions and variables are passed, it operates on the arguments' positions. When simple words and phrases in quotation marks are passed, the operator & operates on sets, so only an exact match counts as an intersection. The set mode is considerably faster and is convenient for intersecting two dictionaries or two wordclasses.

Examples

"a b"&"a b" matches "a b".

"a b"&"a" does not match anything because the sets' items do not coincide.

intersect("a b", "a") matches "a b", because the position of "a" intersects the position of "a b".

intersect(term(class_1), term(class_2)) is equivalent to term(class_1)&term(class_2) and matches words and phrases contained in both "class_1" and "class_2".
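The contrast between the two modes can be sketched in Python. This is only an analogy under stated assumptions: set_mode and spans_overlap are hypothetical names, set mode is modeled as exact-match intersection of phrase collections, and position mode is modeled as overlap of half-open token spans.

```python
def set_mode(first_items, second_items):
    """Set mode: only items present verbatim in both collections survive."""
    return set(first_items) & set(second_items)


def spans_overlap(span_a, span_b):
    """Position mode: two half-open (start, end) token spans share at least one position."""
    return span_a[0] < span_b[1] and span_b[0] < span_a[1]


# "a b"&"a b" keeps "a b"; "a b"&"a" keeps nothing.
print(set_mode(["a b"], ["a b"]), set_mode(["a b"], ["a"]))

# In the text "a b", the phrase "a b" spans positions 0..1 and "a" spans position 0.
print(spans_overlap((0, 2), (0, 1)))
```

In this model the set check needs no position information at all, which is consistent with the set mode being described as the faster one.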