Word-level Fine-tuning not working as expected

Hi, I am trying to follow the instructions as detailed in the “Using Audacity” guide, for doing word-level timings. According to the documentation I can put an underscore and number at the end of the label to indicate which word should be highlighted:

"For this, the labels in the timing files should contain an underscore and then the index of the word, e.g. 1_4 starts at the 4th word of verse 1."

So my timing file looks like this:

4.90 6.602 1
6.602 7.064 1_3
7.064 8.208 1_5
8.208 9.975 1_6
9.975 10.859 1_7
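
To make the intended reading of each line explicit, here is a minimal Python sketch that parses the file the way the quoted documentation describes it. It assumes whitespace-separated columns (start time, end time, label) and a label of the form verse_wordindex; the names TimingLabel and parse_timing_line are made up for illustration and are not part of any app or Aeneas API. It only shows how I expect the labels to be read, not how the app actually interprets them.

```python
from typing import NamedTuple, Optional


class TimingLabel(NamedTuple):
    start: float                # start time in seconds
    end: float                  # end time in seconds
    verse: str                  # verse part of the label, e.g. "1"
    word_index: Optional[int]   # index after the underscore, or None if absent


def parse_timing_line(line: str) -> TimingLabel:
    """Parse one 'start end label' line (assumed whitespace-separated)."""
    start, end, label = line.split()
    # Split "1_3" into verse "1" and word index 3; a plain "1" has no index.
    verse, sep, word = label.partition("_")
    return TimingLabel(float(start), float(end), verse, int(word) if sep else None)


sample = """\
4.90 6.602 1
6.602 7.064 1_3
7.064 8.208 1_5
8.208 9.975 1_6
9.975 10.859 1_7
"""

labels = [parse_timing_line(line) for line in sample.splitlines() if line.strip()]
for item in labels:
    print(item)
```

Read this way, the second line should start highlighting at the 3rd word of verse 1, the third line at the 5th word, and so on.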

But when I view it in fine-tuning, the highlighting does not align with the indicated words.

Is there a particular reason that you use Audacity for the audio/text synchronisation? If you do the synchronisation with Aeneas and then do the fine-tuning, it should be much faster, and the fine-tuning may also work better.

I am using Aeneas. I was just looking for how the timing files work, which is in the Audacity documentation. I like the phrase-level highlighting of Aeneas, and it works well, but I would like to split some long phrases. According to the documentation, this should be possible by adding word indices.

Okay, I see, that makes sense.
You could use a workaround to split phrases and still use Aeneas. For example, you can insert zero-width space characters (Unicode U+200B) into the texts. They are invisible. Then you can add this character in the Aeneas wizard as a phrase boundary character. It should find the character when you sync and split the phrases there. Let me know if you need a better explanation.
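
If you would rather script the insertion than type the character by hand, here is a rough Python sketch. The helper name insert_zwsp_before_word is hypothetical, and it assumes the verse text is a plain string with words separated by single spaces; real .sfm files contain markers that would need separate handling. Placing the character directly after the preceding word is also an assumption on my part.

```python
ZWSP = "\u200b"  # zero-width space, U+200B


def insert_zwsp_before_word(text: str, word_number: int) -> str:
    """Mark a phrase break before the given word (1-based) with a zero-width space.

    Assumes words are separated by single spaces.
    """
    words = text.split(" ")
    if not 1 < word_number <= len(words):
        raise ValueError("word_number must point at a word after the first one")
    head = " ".join(words[: word_number - 1])
    tail = " ".join(words[word_number - 1:])
    # The invisible character sits right after the last word of the first phrase.
    return head + ZWSP + " " + tail


marked = insert_zwsp_before_word("In the beginning God created", 4)
print(repr(marked))
```

Removing the character again gives back the original text, so the visible wording does not change.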

Thank you, that’s a good idea, however it doesn’t really work with my process. I’m using the output of Aeneas and using a custom script to break up phrases and add labels. I’d rather not alter the .sfm files directly. So I’m just trying to understand how the label file works, but the documentation appears to be unclear or incorrect.

@Andrew_Shafe would you know how to help @jmainz ?

Hmmm, I’m unfortunately not at all familiar with using Audacity. I’ve never tried word-level timings. Did you generate the timing files using Aeneas, splitting verse by verse (rather than phrase by phrase), and then try to add more labels using Audacity to split the verses into words? Or the other way around?

I am not using Audacity. What I’ve been doing is using Aeneas to generate timing files, and then adding additional labels to the timing files to get more fine-grained highlighting. I’ve tried with both phrase and verse splitting, but the end result is always the same: adding the word index to the labels doesn’t highlight the correct words.