40.2. Filtering the text pasted in the document

The element template may be preceded by one or more filters separated by whitespace. A filter replaces or discards characters in the value of variable {$line}, {$lineGroup}, {$lines} or {$field}. For example, a filter can be used to discard the leading bullet of a list item.

The syntax of a filter is:

separator regex_pattern separator replacement separator g?i?m?s?

Example with an empty replacement, which discards the matched characters: "/^\d\.//".

Example using the g and i flags: "^XXE^XMLmind XML Editor^gi".

The same separator character must occur three times within a filter. This character may be any character, though it's customary to use "/".
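As an illustration of the syntax above, the following minimal Java sketch splits a filter into its pattern, replacement and flags. The class name FilterSyntaxSketch and the method parse are hypothetical. The sketch assumes that the separator never occurs inside the pattern or the replacement; how the actual implementation treats such a case is not stated here.

```java
final class FilterSyntaxSketch {
    private FilterSyntaxSketch() {}

    /**
     * Splits "sep pattern sep replacement sep flags" into
     * { pattern, replacement, flags }.
     * Assumption: the separator does not occur inside the pattern
     * or the replacement.
     */
    static String[] parse(String filter) {
        if (filter.isEmpty()) {
            throw new IllegalArgumentException("empty filter");
        }
        // The first character of the filter is the separator.
        char sep = filter.charAt(0);
        int second = filter.indexOf(sep, 1);
        int third = (second < 0) ? -1 : filter.indexOf(sep, second + 1);
        if (third < 0) {
            throw new IllegalArgumentException(
                "separator must occur three times: " + filter);
        }
        String pattern = filter.substring(1, second);
        String replacement = filter.substring(second + 1, third);
        String flags = filter.substring(third + 1); // may be empty
        return new String[] { pattern, replacement, flags };
    }
}
```

With this sketch, "^XXE^XMLmind XML Editor^gi" yields the pattern "XXE", the replacement "XMLmind XML Editor" and the flags "gi", which shows why a separator other than "/" can be convenient.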

The syntax supported for the regular expression pattern is documented in http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html.

Moreover, the extension character class \p{listItemBullet}, which is equivalent to:

\u2022|\u2023|\u00B7|o|\*|#|-|[\u2012-\u2015]
|\p{InBoxDrawing}|\p{InDingbats}|\p{InPrivateUseArea}
|(\p{Alnum}(\.|\)))
|(\(\p{Alnum}\))
|([ivxlcdmIVXLCDM]+\.)

may be used to match leading bullets and numeric labels in list items. Example: "/^\p{listItemBullet}\s//".
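Since \p{listItemBullet} is an extension, plain java.util.regex does not know it. The following sketch reproduces the example filter "/^\p{listItemBullet}\s//" in plain Java by substituting the equivalent alternation given above. The class and method names are hypothetical, and wrapping the alternation in a non-capturing group is a choice made for this sketch, not a documented detail.

```java
import java.util.regex.Pattern;

final class ListItemBulletSketch {
    private ListItemBulletSketch() {}

    // The alternation from the text, wrapped in a non-capturing group
    // so that it behaves as a single unit inside a larger pattern.
    static final String LIST_ITEM_BULLET =
        "(?:\\u2022|\\u2023|\\u00B7|o|\\*|#|-|[\\u2012-\\u2015]"
        + "|\\p{InBoxDrawing}|\\p{InDingbats}|\\p{InPrivateUseArea}"
        + "|(\\p{Alnum}(\\.|\\)))"
        + "|(\\(\\p{Alnum}\\))"
        + "|([ivxlcdmIVXLCDM]+\\.))";

    // Equivalent of the pattern in "/^\p{listItemBullet}\s//".
    private static final Pattern LEADING_BULLET =
        Pattern.compile("^" + LIST_ITEM_BULLET + "\\s");

    /** Discards a leading bullet or label and the whitespace after it. */
    static String stripBullet(String line) {
        // No "g" flag in the example filter: first occurrence only.
        return LEADING_BULLET.matcher(line).replaceFirst("");
    }
}
```

For instance, stripBullet("(a) Alpha") returns "Alpha", while a line without a leading bullet such as "Plain text" is returned unchanged. Note that a lone letter "o" followed by whitespace also counts as a bullet in this alternation.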

The replacement text may be empty or it may contain $0, $1, ..., $9 variables.
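The sketch below illustrates $0 and $1 with plain java.util.regex, assuming that the replacement text follows the conventions of java.util.regex.Matcher, where $0 is the whole match and $1 to $9 are the capturing groups. The class and method names are hypothetical. The first method corresponds to a filter such as "/^(\d+)\)\s*/$1. /".

```java
import java.util.regex.Pattern;

final class ReplacementSketch {
    private ReplacementSketch() {}

    /** Rewrites a leading label like "12) " as "12. ", reusing the digits via $1. */
    static String normalizeLabel(String line) {
        return Pattern.compile("^(\\d+)\\)\\s*")
                      .matcher(line)
                      .replaceFirst("$1. ");
    }

    /** Wraps the first occurrence of "XXE" in brackets; $0 is the whole match. */
    static String bracketFirst(String line) {
        return Pattern.compile("XXE")
                      .matcher(line)
                      .replaceFirst("[$0]");
    }
}
```

Here normalizeLabel("12) Install") returns "12. Install" and bracketFirst("Use XXE now") returns "Use [XXE] now".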

The final separator character may be immediately followed by one or more “flags”:

Flag  Description

g     Replace all occurrences of the matched text. By default, only the first occurrence is replaced.

i     Enable case-insensitive matching. By default, matching is case-sensitive.

m     Enable multiline mode. In multiline mode, expressions "^" and "$" match just after or just before, respectively, an end of line character or the end of the input sequence. By default, these expressions only match at the beginning and the end of the entire input sequence.

s     Enable dotall mode. In dotall mode, expression "." matches any character, including an end of line character. By default, this expression does not match end of line characters.
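The effect of each flag can be tried out with the following minimal Java sketch. It assumes that i, m and s correspond to Pattern.CASE_INSENSITIVE, Pattern.MULTILINE and Pattern.DOTALL, and that g selects between replacing the first match and replacing all matches. The class FlagSketch and its method replace are hypothetical names, not part of any documented API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FlagSketch {
    private FlagSketch() {}

    /** Replaces matches of regex in text, honoring the g, i, m and s flags. */
    static String replace(String regex, String replacement,
                          String flags, String text) {
        boolean global = false;
        int options = 0;
        for (char flag : flags.toCharArray()) {
            switch (flag) {
                case 'g': global = true; break;
                case 'i': options |= Pattern.CASE_INSENSITIVE; break;
                case 'm': options |= Pattern.MULTILINE; break;
                case 's': options |= Pattern.DOTALL; break;
                default:
                    throw new IllegalArgumentException("unknown flag: " + flag);
            }
        }
        Matcher matcher = Pattern.compile(regex, options).matcher(text);
        // Without "g", only the first occurrence is replaced.
        return global ? matcher.replaceAll(replacement)
                      : matcher.replaceFirst(replacement);
    }
}
```

For example, replace("^- ", "", "g", "- a\n- b") only strips the first "- " because "^" matches solely at the beginning of the input, whereas the flags "gm" strip the "- " at the beginning of both lines.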