Using dictionaries
Automated text analysis tools that process text symbolically can fall short of providing valuable results. You can use dictionaries to bias how the algorithms interpret word symbols. Dictionaries help users improve the performance of natural language processing tools. For example, you can specify a synonym relationship, so that two symbols are interpreted as the same word. Or, you can specify a word that, despite metrics suggesting the word is useful, should be ignored.
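To make the two cases above concrete, here is a minimal Python sketch that is independent of PolyAnalyst. The names `SYNONYMS`, `IGNORED`, and `count_terms` are hypothetical and serve only to illustrate the idea; PolyAnalyst dictionaries are configured through the product itself, not through code like this.

```python
from collections import Counter

# Hypothetical synonym dictionary: each variant maps to one canonical word.
SYNONYMS = {"automobile": "car", "auto": "car"}

# Hypothetical list of words to ignore even though they occur often.
IGNORED = {"customer"}


def count_terms(tokens):
    """Count tokens after applying the synonym map and the ignore list."""
    counts = Counter()
    for token in tokens:
        word = token.lower()
        word = SYNONYMS.get(word, word)  # treat synonyms as the same word
        if word in IGNORED:
            continue  # skip words the dictionary marks as uninformative
        counts[word] += 1
    return counts


tokens = ["Car", "automobile", "customer", "auto", "engine", "customer"]
print(count_terms(tokens))
```

Without the two dictionaries, "Car", "automobile", and "auto" would be counted as three unrelated symbols, and the frequent word "customer" would dominate the counts.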
Bear in mind that spending an extensive amount of time configuring a dictionary will not always improve the fruitfulness of your results. Large dictionaries are difficult to manage, and may only provide marginal gains in insight.
What is a dictionary?
A PolyAnalyst dictionary consists of a list of words together with some properties about each word. A dictionary may specify relations between words, such as synonyms, or combine words with some common features.
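As a rough mental model only, a dictionary of this kind can be pictured as a list of entries, each holding a word and its properties. The sketch below is an assumption made for illustration: the `Entry` class, its field names, and the `words_with_feature` helper are hypothetical and do not describe PolyAnalyst's actual storage format.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One dictionary word with its properties (all field names are hypothetical)."""
    word: str
    part_of_speech: str = ""
    synonyms: set = field(default_factory=set)
    features: set = field(default_factory=set)


# A tiny dictionary: a list of words plus some properties for each word.
dictionary = [
    Entry("car", "noun", synonyms={"automobile"}, features={"vehicle"}),
    Entry("truck", "noun", features={"vehicle"}),
    Entry("apple", "noun", features={"fruit"}),
]


def words_with_feature(entries, feature):
    """Return the words that share a common feature, in dictionary order."""
    return [e.word for e in entries if feature in e.features]


print(words_with_feature(dictionary, "vehicle"))  # ['car', 'truck']
```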
Which PolyAnalyst features make use of dictionaries?
Some PolyAnalyst text analysis nodes, such as the Spell Check node, require dictionaries in order to function effectively. For many other nodes, using a dictionary is optional. You may also find it helpful to compare the results of a node with and without a dictionary.
Dictionary languages
Dictionaries are language-specific. Each dictionary you create is assigned to a language, which by default is English. You can use dictionaries assigned to different languages in the same analysis.
Additional dictionary facts
Most dictionaries are editable using PolyAnalyst’s dictionary editor. There is also a window for browsing the available dictionaries, where you can perform actions such as deleting or renaming a dictionary.
You can freely modify, delete, and copy the default dictionaries. Nevertheless, we recommend that instead of modifying a default dictionary, you create a copy of it and modify only the copy. This makes it easy to return to the original state of the PolyAnalyst 6.5 dictionaries.
Dictionaries are shareable. You can share a dictionary you have created with other users. You can use the same dictionary in different analytical projects.
Default dictionaries
PolyAnalyst 6.5 comes with several default dictionaries. In addition to the default dictionaries, you can create your own custom dictionaries. You can also obtain default dictionaries for other languages from Megaputer support upon request. Because of their file sizes, and because they are not relevant to all users, these dictionaries are not shipped with the basic installation.
In general, nodes that use dictionaries are configured to work with the appropriate dictionaries by default. To customize the dictionaries used by a node, open the Dictionaries tab in the node’s properties. There you can view which dictionaries the node uses and check or uncheck the dictionaries to use.
For example, this is what the Index node settings look like:

As the screenshot above shows, the Index node uses only one dictionary, the Morphology dictionary. Other nodes use a different number of dictionaries. Dictionary configuration and dictionary types are described in later sections.
Default dictionaries path
The path to the default PolyAnalyst dictionaries is specified in the Administrative Tool settings.

If the dictionaries you work with are stored in a different folder, change the value of the Dictionaries folder parameter.
You can also specify an additional path for dictionary changes, so that all changes to a dictionary are recorded in a separate folder.
To do this, set the Dictionaries extension folder parameter.

Keep in mind that changing these parameters requires restarting the PolyAnalyst server.
The folder specified in the Dictionaries extension folder parameter must be created beforehand.
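One way to create the folder ahead of time is a short script such as the sketch below. It uses only the Python standard library and is not part of PolyAnalyst. The helper name `ensure_folder` and the path `dictionaries_extension` are placeholders chosen for illustration; substitute the location you intend to enter in the Dictionaries extension folder parameter.

```python
from pathlib import Path


def ensure_folder(path):
    """Create the folder (and any missing parents) if it does not exist yet."""
    folder = Path(path)
    folder.mkdir(parents=True, exist_ok=True)
    return folder


# Placeholder location; replace it with the path you plan to enter
# in the Dictionaries extension folder parameter.
extension_folder = ensure_folder("dictionaries_extension")
print(extension_folder.is_dir())
```

Running the script a second time is harmless, because an existing folder is left as it is.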