(Phil Leckrone writes)

a) how much of:

    1. Alpha; 2. Beta; 3. Gamma

will:

   \d.*\d

match?

b) Can RegEx Pal match text that spans more than one verse?

I’m checking a project (the keying in of an older text) where the key-er has started every verse of direct speech with an open quote continuer, which is not in the original and is not correct for this language. If RegEx Pal cannot span verses, I guess there’s no way to fix this automatically … but it will be a lot of work to correct it manually.

Paratext by (646 points)

2 Answers


(Phil Leckrone writes)

  1. Regular expression quantifiers are “greedy” by default. If you use * then the match extends as far as possible while still allowing the rest of the pattern to match. If you want to make a search “not greedy” then you would use *? so that it stops at the first possible match.

  2. A regular expression can span verses or other markers - this depends on what you are using to search with. In RegExPal, the “.” stops at a newline. Therefore it won’t span a verse if there is a newline before the \v.

You can make a dot match a newline if you start the regular expression with “(?s)”.

NOTE: if you search using “regex:” in Paratext then the “.” does not automatically stop at a line break. To make it stop at a newline you need to use: “regex:(?-s)”
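
To see the effect of “(?s)” outside Paratext, here is a minimal sketch in Python. Python's `re` module is not the engine RegExPal uses, so treat this only as an illustration of the same dot/newline behaviour; the verse text is made up.

```python
import re

# Two made-up verses with a newline before the second \v marker.
text = "\\v 1 first verse\n\\v 2 second verse"

# Without (?s), "." does not match a newline, so the match cannot leave verse 1.
default = re.search(r"first.*second", text)

# With (?s) at the start, "." also matches the newline and the match spans both verses.
dotall = re.search(r"(?s)first.*second", text)

print(default)                # None
print(repr(dotall.group(0)))
```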

If you Google “RegEx Cheat Sheet” you can find some useful guides.

by (646 points)

a)

\d.*\d

will match

  1. Alpha; 2. Beta; 3

(and maybe more if there is another number in the text line) since .* is greedy.

\d.*?\d 

will match

1. Alpha; 2

always.
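
If you want to check this yourself, the same two patterns can be tried in Python (a quick sketch; Python's `re` module treats greedy and non-greedy quantifiers the same way for this example):

```python
import re

line = "1. Alpha; 2. Beta; 3. Gamma"

# Greedy: .* runs to the end of the line, then backs up to the last digit.
greedy = re.search(r"\d.*\d", line).group(0)

# Not greedy: .*? stops as soon as the next digit is reached.
lazy = re.search(r"\d.*?\d", line).group(0)

print(greedy)  # 1. Alpha; 2. Beta; 3
print(lazy)    # 1. Alpha; 2
```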

b) Yes, there are several ways to span a verse (RegexPal cannot span a chapter):
method 1: use (?s).*?
method 2: use a negated character class to match everything except the character you are looking for (a class such as [^”] also matches newlines), and then put the character you are looking for after it.

For example, this is how I would start to find double open quote continuers:

(?<=“[^“”]*?)“(?=[^”]*?”)

and then

(?<=“[^“”]*?)“

The first will not find continuers at the beginning of a chapter, nor continuers in chapters whose quotation continues into the following chapter.
The second will pick up continuers in chapters whose quotation continues into the following chapter; you will then need to manually delete the continuer at the beginning of the following chapter.

If there are quotes in footnotes, this may not work properly. One solution would be to change all quotes in footnotes to different characters, then change them back after removing the continuers from the text. This then would be the whole process:

(?<=\\f\s).*?(?=\\f\*):::“ to <<
(?<=\\f\s).*?(?=\\f\*):::” to >>
(?<=“[^“”]*?)“(?=[^”]*?”) to <nothing> (repeat this until no more continuers are found)
(?<=“[^“”]*?)“  Use this to find chapters that begin with continuers and do manual cleanup
(?<=\\f\s).*?(?=\\f\*):::<< to “
(?<=\\f\s).*?(?=\\f\*):::>> to ”
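
For anyone who would rather try the process on an exported text file first, here is a rough Python sketch of the same steps. Everything in it is an assumption for illustration: the function names are made up, and because Python's standard `re` module rejects variable-length lookbehind such as (?<=“[^“”]*?), the continuer step is done with a simple scan instead of the pattern above. It also assumes << and >> do not otherwise occur inside footnotes.

```python
import re

# A footnote from \f up to the closing \f*, possibly spanning lines.
FOOTNOTE = re.compile(r"\\f\s.*?\\f\*", re.DOTALL)

def protect_footnotes(text):
    # Temporarily swap curly double quotes inside footnotes for << and >>.
    return FOOTNOTE.sub(
        lambda m: m.group(0).replace("“", "<<").replace("”", ">>"), text)

def restore_footnotes(text):
    # Put the curly double quotes back inside footnotes.
    return FOOTNOTE.sub(
        lambda m: m.group(0).replace("<<", "“").replace(">>", "”"), text)

def strip_continuers(text):
    # Drop an open quote when the previous quote mark was also an open quote
    # and a close quote still follows somewhere later in the text.
    last_close = text.rfind("”")
    out = []
    last_quote = None
    for i, ch in enumerate(text):
        if ch == "“":
            if last_quote == "“" and i < last_close:
                continue  # continuer: leave it out
            last_quote = "“"
        elif ch == "”":
            last_quote = "”"
        out.append(ch)
    return "".join(out)

# Made-up sample: verse 2 starts with a continuer, and the footnote has its own quotes.
sample = "\\v 1 “Go now,\\f + \\ft Or “leave”\\f*\n\\v 2 “and come back.”"

cleaned = restore_footnotes(strip_continuers(protect_footnotes(sample)))
print(cleaned)
```

As with the first pattern in the process above, an open quote after the last close quote in the text is left alone, so a quotation that runs past the end of the file still needs a manual check.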
by (1.8k points)