There are three types of dictionary supported by XMLmind Spell Checker:
Compiled dictionaries (.cdi), which are the main topic of this documentation, are a fast and memory-efficient representation of large word lists.
A Composite Dictionary presents a group of dictionaries as a single dictionary (more precisely, the resulting dictionary is the union of all its components). For example, a country variant like en-US aggregates a large common English base with a smaller US-specific dictionary. This mechanism reduces memory and disk usage.
A Composite Dictionary is a simple text file beginning with the magic line "@multilink:", followed by lines containing the URLs of its sub-dictionaries. URLs are generally relative to the composite dictionary, but can also be absolute. Referenced dictionaries can in turn be composite.
For example, this could be the 'default' dictionary for en-US (it refers to the common English dictionary en/base.cdi):
@multilink:
../en/base.cdi
spec.cdi
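As an illustration only (this is not XMLmind's implementation), a composite dictionary file in the format described above could be read along these lines. The helper name `parse_composite` and the example.org base URL are hypothetical, and the sketch assumes that entries are separated by whitespace and contain no spaces themselves.

```python
from urllib.parse import urljoin

MAGIC = "@multilink:"

def parse_composite(text, base_url):
    """Return the sub-dictionary URLs of a composite dictionary, made absolute.

    Hypothetical helper: entries are assumed to be whitespace-separated.
    """
    tokens = text.split()
    if not tokens or tokens[0] != MAGIC:
        raise ValueError("missing '@multilink:' magic line")
    # Relative entries are resolved against the composite dictionary's own URL;
    # absolute entries are returned unchanged by urljoin.
    return [urljoin(base_url, entry) for entry in tokens[1:]]

urls = parse_composite(
    "@multilink:\n../en/base.cdi\nspec.cdi\n",
    "http://example.org/dicts/en-US/default",  # assumed location of the file
)
```

Resolving entries against the URL of the composite file itself is what lets the same file work unchanged wherever the dictionary tree is installed.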
There are also plain text dictionaries, used for example for personal dictionaries. Their structure is very simple: one word per line. The default text encoding is UTF-8, but XMLmind dictionaries use ISO-8859-1.
Text dictionaries use much more memory than compiled dictionaries, and are reloaded each time they are selected (whereas compiled dictionaries are kept in a cache). Therefore text dictionaries should be limited to small word lists (a few hundred words at most).
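The one-word-per-line structure is simple enough to sketch in a few lines. The function below is a hypothetical illustration, not part of XMLmind Spell Checker; it takes the encoding as a parameter because a file written in ISO-8859-1 will generally not decode correctly as UTF-8 once it contains accented letters.

```python
def load_text_dictionary(data, encoding="utf-8"):
    """Decode raw bytes and return the set of words, one per non-empty line."""
    text = data.decode(encoding)
    return {line.strip() for line in text.splitlines() if line.strip()}

# Two accented words stored as ISO-8859-1 bytes (illustrative data).
raw = "caf\u00e9\nna\u00efve\n".encode("iso-8859-1")
words = load_text_dictionary(raw, encoding="iso-8859-1")
```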
The Composite Dictionary mechanism explained above makes it easy to extend a dictionary. For example, one can define a second dictionary named 'extended' beside the 'default' dictionary, with the following contents:
@multilink:
default
?http://www.dictionaries.com/myextensions.cdi
Here we have an absolute (and imaginary) HTTP URL pointing to a compiled dictionary.
Notice the (optional) question mark preceding the URL. It protects against access failures: if the read operation fails, no exception is raised and dictionary loading proceeds as if the URL were absent.
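The behaviour of the question mark can be sketched as follows. This is a minimal model under stated assumptions, not the actual loader: `load_composite` and `fake_fetch` are hypothetical names, a read failure is represented by `OSError`, and each sub-dictionary is modelled as a plain set of words so that the composite is their union.

```python
def load_composite(entries, fetch):
    """Return the union of the words of all entries.

    An entry prefixed with '?' is optional: if reading it fails,
    it is skipped as if it were absent from the list.
    """
    words = set()
    for entry in entries:
        optional = entry.startswith("?")
        url = entry[1:] if optional else entry
        try:
            words |= fetch(url)
        except OSError:
            if not optional:
                raise  # a mandatory sub-dictionary must be readable
    return words

# Stand-in for real I/O: only 'default' can be read (illustrative data).
available = {"default": {"color", "center"}}

def fake_fetch(url):
    if url not in available:
        raise OSError("cannot read " + url)
    return available[url]

words = load_composite(
    ["default", "?http://www.dictionaries.com/myextensions.cdi"],
    fake_fetch,
)
```

Here the imaginary HTTP URL cannot be read, so the result contains only the words of 'default'. Without the question mark, the same failure would propagate to the caller.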