How does pattern matching work?

The Apertium transfer engine searches your source text for patterns. The patterns it searches for are the ones defined in your rules. Some patterns are a single word; others span multiple words. The transfer engine tries to match the longest patterns first and then progressively tries shorter and shorter patterns. This means a five-word pattern would be used before a three-word pattern (assuming a set of words would match both patterns). When there are multiple patterns of the same length, the first one listed gets precedence. This means that in some cases the order of your rules is important.

The transfer engine searches for matches in your words in sequential order starting with the first word. It will never find a match at some point in a text and then go back earlier in the text.

Another important concept is that the transfer engine processes each word just once, as part of whatever pattern is matched. After words are processed, they are not examined again for any other matches. In other words, patterns cannot apply in an overlapping manner. For example, suppose you have one rule that matches determiner-noun and another that matches noun-adjective. When a sentence of the form determiner-noun-adjective is processed, matching starts at the determiner, so the engine applies the determiner-noun rule, is finished with those two words, and continues on to the remaining words. The noun-adjective rule does not get applied, because the noun has already been processed. Another way to describe it is to say that the engine processes words in distinct chunks.

If the Apertium transfer engine finds a match, it runs the action part of your rule. If the engine finds no match for a word, it translates that word by default according to what is in the bilingual dictionary. Cf. “How Rules are Applied”.
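
The matching behavior described above can be illustrated with a small Python sketch. This is not Apertium's actual implementation. It assumes that each word has already been reduced to a single category label (such as "det", "n" or "adj") and that each rule is just a name plus a list of such labels. The function name match_chunks and the rule names are hypothetical and exist only for this illustration.

```python
from typing import List, Optional, Tuple

Rule = Tuple[str, List[str]]             # (rule name, pattern of category labels)
Chunk = Tuple[Optional[str], int, int]   # (rule name or None, start index, length)


def match_chunks(categories: List[str], rules: List[Rule]) -> List[Chunk]:
    """Split a sequence of word categories into non-overlapping chunks.

    Simplified model: scan left to right, prefer the longest pattern
    that matches at the current word, and break ties by rule order.
    """
    chunks: List[Chunk] = []
    pos = 0
    while pos < len(categories):
        best: Optional[Tuple[str, int]] = None
        for name, pattern in rules:
            n = len(pattern)
            if n == 0:
                continue  # ignore empty patterns in this sketch
            if categories[pos:pos + n] == pattern:
                # Only a strictly longer pattern replaces the current best,
                # so among equal lengths the rule listed first is kept.
                if best is None or n > best[1]:
                    best = (name, n)
        if best is None:
            # No rule matches here: the word gets the default translation.
            chunks.append((None, pos, 1))
            pos += 1
        else:
            # The matched words are consumed and never examined again.
            chunks.append((best[0], pos, best[1]))
            pos += best[1]
    return chunks


rules = [
    ("det-noun", ["det", "n"]),
    ("noun-adj", ["n", "adj"]),
]
print(match_chunks(["det", "n", "adj"], rules))
# [('det-noun', 0, 2), (None, 2, 1)]
```

In the output, the determiner and noun form one chunk handled by the determiner-noun rule, and the adjective is left as a single word with no rule (None), which corresponds to default translation. The noun-adjective rule never applies because the noun was already consumed. If you add a three-word determiner-noun-adjective rule anywhere in the list, the sketch picks it instead, since longer patterns win regardless of order.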