Not all accents recognized in Search

Question: I found that Search finds words such as pɛ́ma, tɔ́kɔ́, lawɔ́, dɔ̀rɨ̀, rʉ̀gɔ́, dhɨ̀rɨ̀ but not words such as Yésù, bhwà, rúgo. It does not make sense to me why some words are not found.

In SAB 10.2, I have all 6 boxes checked in Features > Search. I suspect something needs to be done with the “replacements and diacritics to remove if not matching accents and tones” field, but I do not see any information on how to do this.

Thanks eh
John

I see that you found this somewhat related question:

But if you’re searching and finding some words with accents, then that’s probably not your problem.

I think the “replacements and diacritics to remove if not matching accents and tones” field defines which accents and tones are removed from which characters when the user has chosen NOT to match accents and tones, i.e. to ignore them. For example, you would want \u0301 in that field to remove the accent on pɛ́ma, which would allow you to find that word with a search for pɛma.
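To illustrate the idea (this is only a sketch of the general technique, not how SAB actually implements the field), here is what "remove these diacritics before comparing" could look like in Python. The names `MARKS_TO_REMOVE`, `strip_marks`, and `loose_match` are made up for the example, and the list of marks is an assumption.

```python
# Hypothetical list of diacritics to ignore: combining acute and combining grave.
MARKS_TO_REMOVE = {"\u0301", "\u0300"}

def strip_marks(text, marks=MARKS_TO_REMOVE):
    """Return text with every listed combining mark dropped."""
    return "".join(ch for ch in text if ch not in marks)

def loose_match(term, word):
    """Compare a search term and a word while ignoring the listed marks."""
    return strip_marks(term) == strip_marks(word)

# "pɛ́ma" typed as p + ɛ (U+025B) + combining acute (U+0301) + m + a
word = "p\u025b\u0301ma"
print(strip_marks(word))                 # the word without its acute accent
print(loose_match("p\u025bma", word))    # an unaccented search term now matches
```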

But I think I might know what is happening in your case… note that the words that you can find are all special character vowels with accents, but the ones you can’t find are all “normal” vowels with accents, like é and à. The “normal” vowels with accents can all be created in two different ways. For example, é can be created as a composed character (\u00e9) or as a decomposed character (\u0065 + \u0301). The special character vowels with accents can only be created as decomposed characters. So what I think is happening is that your text has one kind of character, and your search term has the other kind of character.
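If you want to see the difference for yourself, a small Python check like the one below shows it. It is only an illustration using Python's standard `unicodedata` module; the variable names are my own, and I am not claiming that SAB normalizes text this way.

```python
import unicodedata

composed = "Y\u00e9s\u00f9"        # Yésù typed with composed é (U+00E9) and ù (U+00F9)
decomposed = "Ye\u0301su\u0300"    # the same word as base letters plus combining marks

# The two strings look identical on screen but are different code point sequences.
same_raw = composed == decomposed

# After normalizing both to the same form (NFC here), they compare equal.
same_normalized = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

# ɛ (U+025B) with an acute accent has no composed form,
# so even after NFC it stays as two code points.
open_e_acute = unicodedata.normalize("NFC", "\u025b\u0301")

print(same_raw, same_normalized, len(open_e_acute))
```

This is why a word like pɛ́ma is always stored the same way, while a word like Yésù can be stored in two ways that do not match each other in a plain comparison.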

Do you know which is used in your Paratext project? Probably the easiest way to tell is to go into your Paratext project, find an é character, put the cursor right after it and type Alt + X. This will show the code of the character before the cursor. If the character changes to “00E9” then it is composed. If it changes to “e0301” then it is decomposed (i.e. an “e” followed by a combining acute accent).

It’s your keyboard that will normally decide whether you are producing composed or decomposed forms. It’s best if you use the same keyboard for typing your text in Paratext and for typing in your app (Keyman is cross-platform). Then you will normally get the same kind of forms.

You could also try searching for special characters in the app with different keyboards: the Cameroon keyboard usually produces decomposed forms, and the Chad keyboard usually produces composed forms (where that’s possible). So you could install those two keyboards and do a little testing to see if searches with one of them tend to work.
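If Alt + X is not convenient, another way to check is to copy a word out of the project and list its code points. Here is a rough Python sketch of that; `describe` is just a helper name I made up for this example.

```python
import unicodedata

def describe(text):
    """List each code point in text with its Unicode name."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}"
        for ch in text
    ]

# Paste a character copied from your text between the quotes.
# These two samples show what each form looks like.
for sample in ("\u00e9", "e\u0301"):
    print(describe(sample))
```

One entry (U+00E9) means the é is composed. Two entries (U+0065 followed by U+0301) mean it is decomposed.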

Why don’t you do a little research and see what you come up with, and let us know what you find.

Thanks - this is a bit outside my expertise, so I will consult some colleagues about the way forward. I had a feeling it was something like this, and your explanation helped. I will let you know what we find.