+1 vote

I want to change lowercase to uppercase in certain situations - for instance I want to change all lowercase after a colon to uppercase. There is no single rule that I can use in RegEx Pal to accomplish this, but there is an undocumented feature that lets me create a list of simple changes to run. (NOTE: there are regular expressions that can do this, but not in RegEx Pal.)

  • Create a text document with entries that follow this format:
# I can use the # to add comments to the list
: a-->: A
: b-->: B
: c-->: C

What is on the left is separated from what is on the right by -->

  • Once you save this file you can use it in the search window of RegEx Pal by typing:
-->{Path}\{Filename}
For instance:
-->c:\My Paratext 8 Projects\zzProj\shared\changes.txt
  • When the file runs, it highlights the matches found in the entire chapter. If you run it in a Find/Replace, it highlights the first item found and the remaining text, but the replace panel shows that it is making the correct replacements. Leave the replace window blank, since the list provides the replacements.
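
Typing 26 lines by hand is tedious, so here is a minimal Python sketch that generates a list in the format described above. The function name `build_rules` and the output file name `changes.txt` are my own choices for illustration, not anything Paratext requires. It assumes the prefix contains no regex special characters; a prefix such as a period would need a backslash on the left side.

```python
import string

def build_rules(prefix=": "):
    """Return one 'find-->replace' line per lowercase ASCII letter."""
    lines = ["# Capitalize the first lowercase letter after the prefix"]
    for ch in string.ascii_lowercase:
        # Left side is what to find, right side is what to put in its place.
        lines.append(f"{prefix}{ch}-->{prefix}{ch.upper()}")
    return lines

rules = build_rules()

# Show the first few generated lines.
print("\n".join(rules[:4]))

# To save the list (file name is only an example):
# with open("changes.txt", "w", encoding="cp1252") as f:
#     f.write("\n".join(rules) + "\n")
```

This only covers a to z. Accented lowercase letters would need their own lines added by hand.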

NOTE My experiments with this showed that I could not use complex Unicode characters. If I wanted to do a change like:

Jesus-->Jesús

I had to save the text file in ANSI format or else the ú would be corrupted in the text.
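
One plausible explanation for the corruption, sketched in Python: if the file is saved as UTF-8 but read back as a single-byte Windows code page, the ú turns into two characters. This assumes that "ANSI" in Notepad++ means Windows-1252 (the usual case on Western-locale Windows). I have not confirmed that this is what RegEx Pal actually does when it reads the file.

```python
rule = "Jesus-->Jesús"

# The bytes that end up on disk if the file is saved as UTF-8.
utf8_bytes = rule.encode("utf-8")

# What a reader sees if it interprets those bytes as Windows-1252.
misread = utf8_bytes.decode("cp1252")
print(misread)  # Jesus-->JesÃºs

# Saved as Windows-1252 ("ANSI"), ú is a single byte and survives the round trip.
ansi_bytes = rule.encode("cp1252")
print(ansi_bytes.decode("cp1252"))  # Jesus-->Jesús
```

Characters outside Windows-1252 (many IPA letters, for example) cannot be saved this way at all, so this only helps for Western European accents.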
Also, while I could include spaces on the left side of the -->, I could not include a space at the end of the right side: it was removed in processing. And if I wanted to search on the left for a period or question mark, these had to be escaped first:

\. a-->. A
\? a-->? A
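
To show why the escaping matters, here is a small sketch using Python's `re` module as a stand-in for RegEx Pal's engine (an assumption; the two engines are not identical, but an unescaped period means "any character" in both). The sample sentence is made up.

```python
import re

text = "End. a new one! a third"

# Unescaped: "." matches ANY character, so "! a" is also changed.
unescaped = re.sub(". a", ". A", text)
print(unescaped)  # End. A new one. A third

# Escaped: "\." matches only a literal period.
escaped = re.sub(r"\. a", ". A", text)
print(escaped)  # End. A new one! a third
```

Note that the unescaped version silently turned the exclamation mark into a period, which is the kind of damage that is easy to miss in a long chapter.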

There are probably other things that may not work quite like expected. If you want to try this, I suggest starting with a very simple list and then adding to it as you go.

Paratext by (8.4k points)

5 Answers

0 votes
Best answer

Did you mean ASCII format?

by [Expert]
(16.2k points)

It probably is ASCII format, but when I’m in Notepad++ I don’t have an option to convert to ASCII. I do have an option to convert to ANSI, and that works for basic characters.

I find Notepad++ to be more helpful to me than RegEx Pal for global changes. The regular expressions are more like Adobe InDesign’s GREP (with which I’m familiar), and NPP’s searches can use “.+” without running on to the next line, which is very helpful in finding phenomena that happen within a verse.

Using the “Find in Files” tab of the Find dialog box, I set the “Filters” to “.sfm” and select the PT project folder in the “Directory” dropdown. PT recognizes these changes in the Project History, too!

I don’t know how best to run a list of changes on NPP. Any suggestions would be appreciated.
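
One possible approach outside NPP, offered only as a sketch: a short Python script that reads a list in the same `find-->replace` format and applies it to every `.sfm` file in a folder. All function names here are hypothetical, the left side is treated as a Python regular expression (which may differ in details from RegEx Pal's), and the sketch does not mimic RegEx Pal's trimming of trailing spaces. Try it on a copy of the project folder first.

```python
import re
from pathlib import Path

def parse_rules(lines):
    """Turn 'find-->replace' lines into (compiled_pattern, replacement) pairs."""
    rules = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#") or "-->" not in line:
            continue  # skip blank lines, comments, and malformed lines
        find, replace = line.split("-->", 1)
        rules.append((re.compile(find), replace))
    return rules

def apply_rules(text, rules):
    """Apply each rule, in list order, to the whole text."""
    for pattern, replace in rules:
        # A function replacement keeps the right side literal (no backslash handling).
        text = pattern.sub(lambda m, r=replace: r, text)
    return text

def apply_to_folder(folder, rules_file, glob="*.sfm", encoding="utf-8"):
    """Rewrite every matching file in `folder` that the rules change."""
    rule_lines = Path(rules_file).read_text(encoding=encoding).splitlines()
    rules = parse_rules(rule_lines)
    for path in Path(folder).glob(glob):
        # newline="" keeps the file's original line endings untouched.
        with path.open("r", encoding=encoding, newline="") as f:
            original = f.read()
        changed = apply_rules(original, rules)
        if changed != original:
            with path.open("w", encoding=encoding, newline="") as f:
                f.write(changed)

demo_rules = parse_rules(["# demo list", r"\. a-->. A", ": b-->: B"])
print(apply_rules("x. a: b", demo_rules))
```

Calling `apply_to_folder` with a project folder and a rules file would make the changes on disk. The demo at the bottom only works on a string, so it is safe to run as is.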

I find EditPad Lite’s regex find/replace to be quite helpful and prefer it to Notepad++. To each their own!

If we’re going to give our favourite RegEx implementation then BBEdit would have to be mine!

And I have recently switched to Visual Studio Code.

0 votes

One other caution with using a list in this manner: the items are not independent of one another. Each item is applied to the whole text, including the results of earlier items. If I have a simple table like this:

c-->k
k-->q

The result will not be that all c become k and stop and all original k become q, but all c become q.
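
The chaining can be reproduced with a few lines of Python. This is only a model of the behavior described above (assuming the items run from top to bottom, each over the full text), not Paratext's actual code. The function names and the sample word are made up.

```python
import re

def apply_in_order(text, rules):
    """Each rule sees the output of the previous rule (the behavior described above)."""
    for find, replace in rules:
        text = text.replace(find, replace)
    return text

def apply_single_pass(text, rules):
    """Each original character is replaced at most once (what one might expect)."""
    mapping = dict(rules)
    pattern = re.compile("|".join(re.escape(find) for find in mapping))
    return pattern.sub(lambda m: mapping[m.group(0)], text)

rules = [("c", "k"), ("k", "q")]

print(apply_in_order("cake", rules))     # qaqe
print(apply_single_pass("cake", rules))  # kaqe

# Under the same top-to-bottom assumption, reversing the order avoids the chain.
print(apply_in_order("cake", [("k", "q"), ("c", "k")]))  # kaqe
```

If the list really does run top to bottom, putting k-->q before c-->k should give the intended result, but I would test that on a small sample before trusting it.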

by (8.4k points)
0 votes

Interesting, I’ve never heard that. A little research and it seems that ANSI format is the proper name for what I have always known as Extended ASCII format.
You learn something new every day. :grinning:

by [Expert]
(16.2k points)
0 votes

Very interesting, thank you. Several RegEx in a row sounds like the ultimate challenge.

Did you try to use \unnnn or other encoding “tools” to treat real language data?

like:

elephant-->abʊrʊ

elephant-->ab\u028Ar\u028A

If this option were limited to ANSI, it would not be much help for projects in west Africa, I fear.

by (855 points)
0 votes

I found that the path name could not contain spaces. Is that still the case?

As far as the encoding, if you save the document as UTF-8 then you can use either Unicode characters or character codes:
ʼ-->’
‘\u02BC’-->’\u2019’

by (1.8k points)

My tests show that I can use a path name that includes spaces, as in: “C:\My Paratext 8 Projects…”.
When I try to run the change above, RegEx Pal finds the \u02bc, but returns the literal \u2019 (file saved as UTF8)
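
Given that result, one workaround is to expand the \uXXXX codes on the right side into real characters before saving the list as UTF-8. Here is a minimal Python sketch of that preprocessing step. The helper names are hypothetical, and it assumes RegEx Pal reads real Unicode characters correctly from a UTF-8 file, as reported above.

```python
import re

# Matches a backslash, the letter u, and exactly four hex digits.
_U_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")

def expand_u_escapes(s):
    """Replace each \\uXXXX code with the character it names."""
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), s)

def expand_right_side(line):
    """Expand codes only after the -->; the left side is left for the regex engine."""
    if line.startswith("#") or "-->" not in line:
        return line
    find, replace = line.split("-->", 1)
    return find + "-->" + expand_u_escapes(replace)

print(expand_right_side(r"elephant-->ab\u028Ar\u028A"))
print(expand_right_side(r"\u02BC-->\u2019"))
```

The left side is deliberately untouched, since the search side was reported to understand the codes already.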
