Regex to exclude images meant for print does not work

L.S.
As explained on the lingtrans wiki page, the location field of a \fig marker in Paratext can be used to indicate whether an image is for print, app or web use. That wiki page refers to the document below with the following words: "Your typesetter / app creator will need to use similar rules to filter out figures that do not belong in the output that they are creating." [here]

It says there: … “In SAB, to filter out images that should not be used in an app (‘a’), use regex rules that filter out images that are only for ‘p’ or ‘w’:”

Find:

(?i)\fig ([^|]|){3}([pw]+)|[^\]\fig* Replace with nothing. (For USFM 2)

(?i)\fig [^\]\bloc="[pw]+"[^\]\fig* Replace with nothing. (For USFM 3)

Has anybody used these regexes and got them working? I have tried them, but they do not seem to work. Any help or guidance you can give is appreciated.
Bart

btw, using sab 864

What it shows above is not the actual regex. There are missing backslashes, and some of the asterisks have been converted into italic formatting. Are you actually using these regexes?

Find:	(?i)\\fig ([^|]*\|){3}([pw]+)\|[^\\]*\\fig\*     (For USFM 2)
       	(?i)\\fig [^\\]*\bloc="[pw]+"[^\\]*\\fig\*       (For USFM 3)
Replace with nothing.

@mcquayi, we’d discussed including these two regexes in the changes gallery, but it looks like they haven’t been included there yet.

@Bart_Eenkhoorn, which are you using? USFM2 or USFM3? (You can’t tell by looking at the figure syntax in Paratext, because it displays as if it were USFM3 format, even if the files actually include USFM2 format.) Look at the book text in SAB’s Source viewer. Copy out what it shows for a verse containing a \fig field, and paste it in a reply here, on a line that begins with at least 4 spaces, so that special characters don’t get interpreted as formatting. Then check the “show after changes” box, and do the same thing with that verse. Likewise, copy and paste the changes rules that you’ve got enabled.

Here is what you requested Dan:

\v 8 Bilana ki ru, kí i Ŋuntiŋ Kirinŋ numaanni shinwulu ki rama nankɔ ki ru. Shuru fa mari i arì radu ǹʼjɛŋ nankɔ ki jirinŋ kiri kanfɛ. \fig |lll1-03_col.jpg|col|aw|Copyright © Global Recordings Network. Utilisé avec permission|xxxxxxxxxxxxxxxxxxx|3.8\fig*

The xxxxxx means this needs adding by the translator
Doing a screenshot to make sure you see the right regex formatting:

After changes:
\v 8 Bilana ki ru, kí i Ŋuntiŋ Kirinŋ numaanni shinwulu ki rama nankɔ ki ru. Shuru fa mari i arì radu ǹʼjɛŋ nankɔ ki jirinŋ kiri kanfɛ. \fig |lll1-03_col.jpg|col|aw|Copyright © Global Recordings Network. Utilisé avec permission|xxxxxxxxxxxxxxxxxxx|3.8\fig*

  1. Your text is in USFM2 format, but your changes rule is for USFM3 format. Add the other changes rule for USFM2 format, and leave this changes rule to apply in the future when you migrate to USFM3.

  2. Your changes rule is configured to filter out the app (“a”) images. That version of the rule is for print draft. You want to filter out the images that are only for print (p) or web (w), so you should have “pw” there.

Basically, what these regexes say is: find a \fig … \fig* field whose location field consists only of one or more letters from the character set [pw], and filter that figure field out, because it is not intended for app output.

The (?i) part means that the match is not case-sensitive.
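To see the rules in action outside SAB, here is a minimal sketch in Python. It uses the two regexes quoted earlier in this thread unchanged; Python's `re` module is only a stand-in for illustration, and SAB's own regex engine may differ in details. The function name `strip_non_app_figures` and the sample figure lines are made up for the example.

```python
import re

# The two changes rules from this thread, written as Python raw strings.
# Python's re module is used here only for illustration (an assumption);
# it is not how SAB itself applies changes.
USFM2_RULE = re.compile(r'(?i)\\fig ([^|]*\|){3}([pw]+)\|[^\\]*\\fig\*')
USFM3_RULE = re.compile(r'(?i)\\fig [^\\]*\bloc="[pw]+"[^\\]*\\fig\*')


def strip_non_app_figures(text):
    """Delete \\fig ... \\fig* fields whose location holds only 'p' and/or 'w'."""
    text = USFM2_RULE.sub('', text)   # "replace with nothing"
    return USFM3_RULE.sub('', text)


# Hypothetical sample lines (file names and captions are invented).
usfm2_app = r'\v 8 Text. \fig |img1.jpg|col|aw|Copyright|caption|3.8\fig*'
usfm2_print = r'\v 8 Text. \fig |img1.jpg|col|p|Copyright|caption|3.8\fig*'
usfm3_app = r'\fig Caption|src="img2.jpg" size="col" loc="aw" ref="1:9"\fig*'
usfm3_web = r'\fig Caption|src="img2.jpg" size="col" loc="PW" ref="1:9"\fig*'

for sample in (usfm2_app, usfm2_print, usfm3_app, usfm3_web):
    print(repr(strip_non_app_figures(sample)))
```

Figures whose location includes "a" are left untouched, because "a" is not in the character set [pw]. Figures whose location is only "p", "w" or a combination of the two (in either case, thanks to (?i)) are removed.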

Thanks Dan,

Your advice was spot on and everything is working fine now. It did not work in SAB at first because I had forgotten to refresh the Paratext files. (Saving in Paratext does not mean that SAB refreshes automatically … I make that mistake every time!)

I am curious to know how you can see that I am using USFM 2. Do you know offhand which is the earliest PT8 version that supports USFM 3.0? (One of our team members would prefer to stay on PT8 for a while still …)

Merci beaucoup,
Bart.

One more question, Dan: you mentioned “the changes gallery”. Is that available somewhere?
Thanks, Bart.

The USFM2 figure syntax separates elements with vertical bars, as in the text you provided. If you look at that text in Paratext 9 in Standard view, you’ll see how it would look in USFM3 format, with named elements like:

 \fig John dunkim Jesus|src="Mark01_09-10c.jpg" size="col" loc="aw" ref="1:9"\fig*
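As a rough illustration of that difference, the following Python sketch guesses the format of a single figure field as it appears in the source file. It is only a heuristic built on the two shapes shown in this thread (bar-separated fields versus name="value" attributes); the function name `guess_fig_format` is made up for the example.

```python
import re

# A name="value" pair, as used by USFM3 figure attributes such as src, size, loc.
ATTRIBUTE = re.compile(r'\b[\w-]+="[^"]*"')


def guess_fig_format(fig):
    """Heuristically classify one \\fig ... \\fig* field as 'USFM2' or 'USFM3'."""
    parts = fig.split('|', 1)          # everything after the first vertical bar
    if len(parts) < 2:
        return 'unknown'
    rest = parts[1]
    if ATTRIBUTE.search(rest):         # named attributes: USFM3 style
        return 'USFM3'
    if rest.count('|') >= 2:           # several more bar-separated fields: USFM2 style
        return 'USFM2'
    return 'unknown'


usfm2_fig = r'\fig |lll1-03_col.jpg|col|aw|Copyright|caption|3.8\fig*'
usfm3_fig = r'\fig John dunkim Jesus|src="Mark01_09-10c.jpg" size="col" loc="aw" ref="1:9"\fig*'

print(guess_fig_format(usfm2_fig))
print(guess_fig_format(usfm3_fig))
```

Run this against what SAB's Source viewer shows, not against Paratext's Standard view, since Paratext displays figures in USFM3 style regardless of what the files contain.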

As for the changes gallery, right next to the button to “Add Change” is a button for “Add Change from Gallery”.

At this time, if you don’t have a compelling need for USFM3 (milestones?), you’re fine staying on USFM2.


@Dan_Em I have submitted those regexes for inclusion in the Changes Gallery.

Hello,

I need to add (\xt…\xt*) markers to all cross references. Is there a way to do that using a regex?

Dler