0 votes

Dear reader. I would like to find \fq followed by any string, including characters such as ɛ ɔ ɲ ŋ, up to the first \ft that is followed by a space and then by something other than a colon.

Example phrase:

\fq cintɔnŋ Natan i ta ŋuŋ kɔnkɔli bo Solomani ŋununu \ft Walima ŋuŋ kɔri Solomani ŋununu

This regex \\fq.*\\ft\s[^\:] is much too greedy .... Any tips on how to limit the search to the first \ft that is not followed by a colon?

Thanks in advance,

Bart

Paratext by (322 points)

2 Answers

0 votes
Adding a ? after the * makes the match non-greedy.
by (8.4k points)
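To see the difference the ? makes, here is a minimal sketch in Python's re module. That is an assumption: Paratext's own regex: search uses a different engine, although greedy and non-greedy quantifiers behave the same way in the common regex flavours. The sample text is made up for illustration and does not come from the project.

```python
import re

# Hypothetical sample text with two \ft markers (not from the original post).
text = r"\fq one \ft : two \fq three \ft : four"

# Greedy: .* takes as much as it can, so the match runs to the LAST \ft.
greedy = re.search(r"\\fq.*\\ft", text).group()

# Non-greedy: .*? takes as little as it can, so the match stops at the FIRST \ft.
lazy = re.search(r"\\fq.*?\\ft", text).group()

print(greedy)
print(lazy)
```

Note that non-greedy only means "as short as possible while the whole expression still matches", so on its own it does not guarantee that the match stops at the first \ft.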
Thanks for the tip Phil. I am still not there yet .... so I tried: regex:\\fq.*?\\ft\s[^\:] which results in (see below)

The search for regex:\\fq.*?\\ft\s\: produces a nice list of results with \ft followed by a space and a colon. Indeed, after \ft there should always be a space (\s) followed by a colon. So, what should the regex be to find the places where that colon is missing? My best guess is regex:\\fq.*?\\ft\s[^\:] but it does not deliver the desired result ...

Any ideas?

Thanks in advance.
0 votes
When writing a regular expression it is important not to look for too much. When you search for a period it keeps looking for anything that is not a new line (or if you are in the regex: search it will look for anything) until it comes to the next item. Even the non-greedy .*? will run past the first \ft when that \ft is followed by a colon, because it keeps going until the rest of the expression can match. So the key is to search for something that stops the search. Often I will use [^\\]* to search for anything that is not a backslash. This would work for you unless you have extra backslashes in your \fq before you get to the \ft. Note that I try to build my expressions in RegExPal so that I can see what is being found as I add to the expression.

Try this expression:

\\fq [^\\]*?\\ft\s[^:]
by (8.4k points)
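A small sketch of why the bounded expression works where the non-greedy one did not, again using Python's re module as a stand-in for Paratext's search (an assumption). The first note, ok_note, is a made-up example with the expected colon; bad_note is the example phrase from the question.

```python
import re

# Hypothetical well-formed note: \ft is followed by a space and a colon.
ok_note = r"\fq aaa \ft : bbb"
# Example phrase from the question: the colon after \ft is missing.
bad_note = r"\fq cintɔnŋ Natan i ta ŋuŋ kɔnkɔli bo Solomani ŋununu \ft Walima ŋuŋ kɔri Solomani ŋununu"
text = ok_note + " " + bad_note

# .*? may cross the first \ft, so the match starts in the well-formed note.
lazy = re.search(r"\\fq.*?\\ft\s[^\:]", text)

# [^\\]*? cannot cross a backslash, so it can only reach the \ft
# that belongs to the same \fq.
bounded = re.search(r"\\fq [^\\]*?\\ft\s[^:]", text)

print(lazy.start(), repr(lazy.group()))
print(bounded.start(), repr(bounded.group()))
```

The non-greedy version starts matching at the first \fq and stretches across both notes, while the bounded version skips the well-formed note and reports only the one with the missing colon.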

Thank you Phil, that did find what I was looking for. I was also mistaken about the : which is not a metacharacter in regex. Actually, the regex \\fq [^\\]*?\\ft\s[^:] and the regex \\fq [^\\]*?\\ft\s[^\:] deliver the same (good) result, which I am trying to understand. So, just for the sake of a better understanding of this regex in PT:

does \s[^:] mean the first non-colon after a space?

then:

would \s[^\:] mean the first non-backslash followed by a colon? Or does it mean the first non-backslash followed by a non-colon?

Much obliged,

Bart.

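When you put characters inside of the [] regex sees those as alternates (except for a leading ^ which is the NOT). A backslash inside the brackets still escapes the character after it, so [^\:] is the same as [^:]: any one character that is not a colon. That is why both expressions deliver the same result. To exclude a backslash as well you would write [^\\:]. If you have something like t[aeiou]t it would find a t followed by any one of those letters and then another t.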
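One way to check what a bracket expression really accepts is to test it against single characters. The sketch below does this in Python's re module, which is an assumption: it is not Paratext's engine, though its result agrees with the observation above that [^:] and [^\:] find the same things.

```python
import re

# Three one-character test strings: a colon, a backslash, and a letter.
samples = [":", "\\", "W"]

results = {}
for pattern in (r"[^:]", r"[^\:]", r"[^\\:]"):
    # Keep the samples that the class accepts as a complete match.
    results[pattern] = [s for s in samples if re.fullmatch(pattern, s)]

for pattern, accepted in results.items():
    print(pattern, accepted)
```

Here [^:] and [^\:] accept both the backslash and the letter, because \: inside the brackets is just an escaped colon. Only [^\\:] rejects the backslash too.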
Thank you very much.
