I’m dealing with a language that has glottal stops (U+02C8) within words (e.g. naˈa) and I want to stop SAB from breaking words at that point. Is there a regex that will work, essentially inserting a non-breaking space without the space?
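For what it's worth, the closest thing Unicode has to "a non-breaking space without the space" is U+2060 WORD JOINER, a zero-width character that asks the renderer not to break at that position. Below is a minimal Python sketch of the kind of regex being asked about. The function name `protect_glottals` is made up for illustration, and whether SAB's own text-changes feature accepts this exact pattern syntax, or whether the app's renderer honours WORD JOINER, is an assumption that would need testing.

```python
import re

WORD_JOINER = "\u2060"   # zero-width, prohibits a line break at its position
GLOTTAL = "\u02c8"       # MODIFIER LETTER VERTICAL LINE, as used in the question

# Match the glottal only when it sits between two word characters,
# i.e. word-internally, as in "naˈa".
_WORD_INTERNAL_GLOTTAL = re.compile(r"(?<=\w)" + GLOTTAL + r"(?=\w)")

def protect_glottals(text: str) -> str:
    """Wrap each word-internal glottal in WORD JOINERs (hypothetical helper)."""
    return _WORD_INTERNAL_GLOTTAL.sub(WORD_JOINER + GLOTTAL + WORD_JOINER, text)

if __name__ == "__main__":
    fixed = protect_glottals("na\u02c8a")
    # Show the code points so the invisible joiners are visible.
    print([f"U+{ord(c):04X}" for c in fixed])
```

Running it twice on the same text adds nothing new, because the joiners are not word characters and so the pattern no longer matches. One thing to watch: the extra invisible characters end up in the text itself, so they may affect search or word selection.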
What do you mean by “breaking words”? In what context? I have built an app with glottals (U+02BC) and “tap and hold” selects the entire word as you would expect:
Furthermore, if I increase the font size until “gaaʼidiin” can no longer fit on the line, the whole word wraps to the following line. So I don’t see any breaking.
You said you are using ‘MODIFIER LETTER VERTICAL LINE’ (U+02C8) whereas I’m using ‘MODIFIER LETTER APOSTROPHE’ (U+02BC), but they have (for the purposes of word breaking) the same characteristics, so there shouldn’t be any breaking with that character either.
If you see breaking (like in the two situations I mention above), then there is a bug in the App Builders. All modifier letters (those with the Unicode general category Letter, Modifier [Lm]) should be non-breaking.
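If anyone wants to check the category claim for themselves, here is a small Python sketch using the standard `unicodedata` module. The helper name `describe` is just for illustration. Note that this only confirms the general category; line breaking is governed by a separate Unicode property (Line_Break, UAX #14) that `unicodedata` does not expose, so this does not by itself show how a renderer will wrap either character.

```python
import unicodedata

def describe(ch: str) -> tuple[str, str, str]:
    """Return (code point, Unicode name, general category) for one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch)

if __name__ == "__main__":
    # The two characters discussed in this thread.
    for ch in ("\u02bc", "\u02c8"):
        print(*describe(ch))
```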
[To the developers: This is related to the “ignoring Arabic script vowel in search” issue. IMHO, pretty much all of the character processing that apps do should be based on the Unicode character properties rather than on a manually created list of characters, since such lists are doomed to miss some character that someone will eventually use, like MODIFIER LETTER VERTICAL LINE in this case.]
Sorry I wasn’t clear. Words containing the specific glottal char (U+02C8 in this case) are breaking at the glottal char when the word would wrap (when resizing the screen), instead of the whole word moving together, so “gaaʼidiin” in your case becomes