Stripping out \li markers before typesetting

John_Nystrom · August 17, 2022, 2:13pm

Lists of vices and lists of qualities can be tedious and difficult as we try to find a unique rendering for each item in the list. Then in a long list it can be easy to lose track of which item you are looking at in the translation. It can help to turn on Biblical Terms Renderings highlighting, but in some cases we haven’t really decided on the correct rendering yet.

So we have resorted to doing this:

\li \add sexual immorality/πορνεία\add*pasin iyn toro temiyn ke wuruaw tana,

\li \add impurity/ἀκαθαρσία \add*pasin koro teiynjieiyn ektek ono,

\li \add debauchery/ἀσέλγεια\add* ke pasin iyn etere takai ektekna iyn toro re pasin kokelek ektek ono,

The different formatting of \add…\add* makes it easy to see which part of the verse is rendering which Greek term.

Eventually we’ll strip all that extra stuff out before sending the text to the DBL. But in the meantime, we want to keep it in the text. We include the \li paragraphs to make it even easier to see where one item starts and ends.

I can easily mark \add…\add* as nonpublishable in custom.sty and tell PTXPrint not to print it.

But I don’t know how to tell PTXPrint to strip out the \li markers in PrintDraftChanges.txt.

Thanks for any help you can give me on this.

John Nystrom

mjames · August 20, 2022, 9:17am

I think this should do it (and wouldn’t necessitate you setting the \add marker as nonpublishable)
'\\li\s.*?\\add\*'>''

John_Nystrom · August 20, 2022, 12:12pm

Thanks, mjames. This worked to strip out the \li marker and everything to the \add* marker in paragraphs that contain the \li marker and the \add* marker.

What would the command look like to strip out all the \li markers in the text regardless of whether they have \add…\add* in them?

We have some places where we have put in the \li markers to make a list, but we have not put in the \add…\add* markers with the terms we are translating. I would like this process to take those \li markers also. Eliminating the \add…\add* sequence is already easy in PTXPrint so I can handle that as a separate issue.

Thanks again for your help.

And for anybody else reading this: if you have a better way to accomplish what we’re trying to do (making it easy for translators and consultants to distinguish the items in long lists from each other when looking at them in Paratext, while printing the paragraphs normally in PTXPrint), we’re always open to suggestions. So far we have not needed \li for anything else, but the day will come when we do.

JN

mjames · August 21, 2022, 1:58pm

If all you’re trying to do is remove the \li marker, but leave all the text after it alone, then you can simply use '\\li'>''

These sorts of change rules are applied in order, so if you put them in this order
'\\li\s.*?\\add\*'>''
'\\li'>''
the first rule strips everything from \li through the closing \add* in each list item, and the second rule then removes any lone \li markers that are left over.

If you were to put them in the opposite order
'\\li'>''
'\\li\s.*?\\add\*'>''
then it would remove all the \li markers and then the second rule would fail to find anything. This just serves as a word of warning that sometimes rules will interfere with each other and you have to keep track of the order they’re in.
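
The ordering effect described above can be reproduced outside PTXPrint. The following is only a rough sketch using Python's standard `re` module, on the assumption that change rules behave like successive search-and-replace passes over the text. The `apply_rules` helper and the sample text are made up for illustration and are not part of PTXPrint.

```python
import re

# Each pair is (pattern, replacement), mirroring the two rules quoted above.
RULE_LI_WITH_ADD = (r"\\li\s.*?\\add\*", "")
RULE_LI_ONLY = (r"\\li", "")


def apply_rules(text, rules):
    """Hypothetical helper: apply each (pattern, replacement) pair in order."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


# Made-up sample: one list item with an \add...\add* gloss, one without.
sample = (
    "\\li \\add impurity/ἀκαθαρσία \\add*pasin koro teiynjieiyn ektek ono,\n"
    "\\li pasin iyn toro temiyn ke wuruaw tana,\n"
)

# Specific rule first: the gloss and both \li markers are removed.
good_order = apply_rules(sample, [RULE_LI_WITH_ADD, RULE_LI_ONLY])

# General rule first: every \li is already gone, so the second rule
# never matches and the \add...\add* gloss stays in the text.
bad_order = apply_rules(sample, [RULE_LI_ONLY, RULE_LI_WITH_ADD])

print(good_order)
print(bad_order)
```

One thing to keep in mind with the bare `\\li` pattern: it also matches the start of longer USFM markers such as \lim or \lik, so a project that uses those would want a tighter pattern.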

Phillip_Leckrone · August 22, 2022, 7:00pm

John,

You can do this more easily in RegEx Pal. To remove everything from \li to the end of the line, your find would be:

\\li .*?\r\n

The replace would be blank

This doesn’t work in the Find/Replace dialog because of the way Paratext strips out the return and end of line. However, it does work in RegEx Pal.

John_Nystrom · August 22, 2022, 7:35pm

Thanks, mjames. That was simple. I obviously need to refresh my knowledge of regular expressions.

John_Nystrom · August 22, 2022, 7:36pm

Thanks, Phil. I need these to work in PrintDraftChanges.txt so we can use it when we need a clean PDF of the passage, and leave the markers in the text while we’re still editing it. Eventually we’ll take them all out permanently.

JN

John_Nystrom · August 25, 2022, 5:07pm

Trying a different approach and wondering if it matters anyway

A reader on this forum pointed out to me that I am misusing \add…\add* by doing our lists this way. So now I’m trying to use \rem instead. I am experimenting with lines like this in the text:

\rem πορνεία/sexual immorality
\li pasin iyn toro temiyn ke wuruaw tana,
\rem ἀκαθαρσία/impurity
\li re pasin koro teiynjieiyn ektek ono,
\rem πάθος/lust
\li pasin iyn toroꞌa re niy etere iyn awro pasin kokelek,

I tried to make a regex that would take out the \rem lines the way mjames gave me a way to take out the \li lines containing \add…\add* earlier in this thread. I have tried and failed using these combinations:

'\\rem .*?\\li'>'\\li'
'\\li'>''

and

'\\rem .*?\r\n'>''
'\\li'>''

(The backslashes before the markers are all escaped in my actual file; this platform strips them out when I post.)

When I used what mjames gave me and I used \add…\add*, I was able to print the lists as lists, which is helpful in checking, and I could also print the lists collapsed into normal paragraphs, which is our goal for publishing trial editions. All I can do now is print them as lists. Apparently, I’m not really getting the \rem lines out, so when I remove the \li markers, everything becomes one \rem paragraph and therefore the content of those lines doesn’t make it to the PDF. So maybe all I need is a better regex to do that in PrintDraftChanges and all will be well.
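
One possible reason the two attempts above fail can be shown with Python's standard `re` module. This is only a sketch under assumptions: that the rules are evaluated with Python-style regular expressions, that the backslashes are escaped as the post says they were, and that the text reaches the rules with plain `\n` line endings. None of that is confirmed in this thread, and the sample text is made up for illustration.

```python
import re

# Made-up sample with plain "\n" line endings (an assumption).
sample = (
    "\\rem πορνεία/sexual immorality\n"
    "\\li pasin iyn toro temiyn ke wuruaw tana,\n"
    "\\rem ἀκαθαρσία/impurity\n"
    "\\li re pasin koro teiynjieiyn ektek ono,\n"
)

# Attempt 1: by default "." does not match a line break, so ".*?" can
# never get from a \rem line to the \li on the following line.
attempt_1 = re.sub(r"\\rem .*?\\li", r"\\li", sample)

# Attempt 2: "\r\n" needs a carriage return before the line feed; with
# plain "\n" endings nothing matches.
attempt_2 = re.sub(r"\\rem .*?\r\n", "", sample)

# Alternative: take the rest of the line with [^\r\n]* and accept
# either kind of line ending with \r?\n.
without_rem = re.sub(r"\\rem [^\r\n]*\r?\n", "", sample)

# Then drop the \li markers (with their trailing space) to collapse
# the list items into running text.
collapsed = re.sub(r"\\li ", "", without_rem)

print(collapsed)
```

If the assumptions hold, the equivalent pair of change rules would remove each whole \rem line first and the \li markers second, in that order.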

I want to retain the options to print the lists as lists or as paragraphs. I also want to retain the markings in the text, even after DBL submission, so we can have those helpful markings when we revise these books later before a complete NT is published.

But I know that when I submit these books to the DBL, all that custom marking must be gone.

So when I’m ready to submit books to the DBL, I wonder if this would work:

Mark a point in the project history (“before cleanup for DBL submission”).
Remove all custom markup.
Mark a point in the project history again (“DBL submission”).
Submit to DBL.
Revert those books back to (“before cleanup for DBL submission”).
Continue to work on the text as before, continue to print it locally as needed, in whatever form needed.

If that is a viable strategy, then I wonder if it matters whether I abuse USFM by using \add…\add* and \li or whether I abuse it by interleaving \rem lines with \li lines in the middle of a paragraph? If it doesn’t matter, then I guess we will use the markup strategy that is the most helpful for both using the text and submitting it to DBL. I’m not sure which method that is. I’m open to suggestions.

I’m leaning toward doing this:

\li \add sexual immorality/πορνεία\add*
\li pasin iyn toro temiyn ke wuruaw tana,
\li \add impurity/ἀκαθαρσία \add*
\li pasin koro teiynjieiyn ektek ono,
\li \add debauchery/ἀσέλγεια\add*
\li ke pasin iyn etere takai ektekna iyn toro re pasin kokelek ektek ono,

I originally chose \add…\add* because it is a character style and we won’t use it in our translations since it is only designed to mark something we would never mark in our translations.

Whatever I do, we’re going to do the same thing in at least 12 translations in our cluster. So I need to get this right.

Please feel free to suggest an entirely different approach to this problem.

Thanks for any help you can give me on this, and thank you for reading this far.

John Nystrom
Aitape West Translation Programme
SIL PNG

John_Nystrom · August 25, 2022, 9:34pm

Thank you to all who helped with this.

This is what I’m doing now:

\rem πορνεία/sexual immorality
\nb pasin iyn toro temiyn ke wuruaw tana,
\rem ἀκαθαρσία/impurity
\nb re pasin koro teiynjieiyn ektek ono,

When I run this through PTXPrint, it ignores the \rem paragraphs. The \nb paragraphs are supposed to be “no break” so PTXPrint prints the text without any breaks where the \nb markers are. So I get the single paragraph output I want and I can see these lines as separate paragraphs as we work on them in Paratext.

I tested using a simple regex to replace \nb with \li to print the traits as separate paragraphs for checking and it worked.
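
The exact rule used for that test is not shown in this thread. As a rough illustration only, a swap of that kind could look like the following in Python's standard `re` module; the pattern and the sample text are assumptions, not the rule from the actual PrintDraftChanges.txt.

```python
import re

# Made-up sample in the \rem + \nb layout shown above.
sample = (
    "\\rem πορνεία/sexual immorality\n"
    "\\nb pasin iyn toro temiyn ke wuruaw tana,\n"
    "\\rem ἀκαθαρσία/impurity\n"
    "\\nb re pasin koro teiynjieiyn ektek ono,\n"
)

# \b after "nb" keeps the pattern from matching a longer marker that
# merely starts with \nb. In the replacement, r"\\li" yields a literal \li.
as_list = re.sub(r"\\nb\b", r"\\li", sample)

print(as_list)
```

The \rem lines are left untouched here, on the understanding from the post above that PTXPrint already ignores them.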

The advantage of using \nb is it is rarely used anyway, and we are not using it at all. Theoretically, I could leave them in the text indefinitely, but we’ll take them out for DBL submission anyway.

JN