0 votes

Given that teams are starting to use Glyssen to pre-process their NT texts to identify each speaker in the text for a dramatized recording, I wondered whether it would be worth piggy-backing on that work to AUTOMATICALLY mark up the Words of Jesus \wj … \wj* in the Paratext project.

Glyssen produces an Excel file with the needed data in it, which looks like this:


So it should be relatively easy to write a script that finds these places in the text and wraps them in \wj … \wj* markup.

I tried something simple myself with a couple of generated CC tables, and managed to get 88% of the 1901 occurrences of the words of Jesus marked up successfully. However, this still leaves 200+ places where I would need to go in and fix it manually. Being unsatisfied with this result, and wanting to save others time in the future by making it automated, complete and more foolproof, I wondered about creating a Custom Script (in Python) that could do this task directly from within Paratext.

Most of the “failed” cases with the CC method are because of other markup like \w angel|angels\w* or \f footnotes and \x cross-references being embedded within the text being searched for. I’ve got some ideas about how to get around those using Regular Expressions (and searching for the start and end of strings rather than the whole string, etc.) but that is beyond CC and would need to use some Python code to make it possible.
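To make the idea concrete, here is a minimal sketch of that kind of tolerant search. It is plain Python 3 using only the `re` module, not a Paratext Custom Script, and it does not use ScriptureObjects. The function names (`find_quote`, `token_pattern`) and the sample verse are made up for illustration. It assumes the quoted text from Glyssen arrives as a plain string with no markup, and it lets \f footnotes, \x cross-references and \w …|…\w* wrappers appear anywhere between the words.

```python
import re

# A footnote (\f ... \f*) or cross-reference (\x ... \x*) that may
# interrupt the running text.
NOTE = r"\\f\s.*?\\f\*|\\x\s.*?\\x\*"

# Between two tokens, allow any mix of whitespace and notes.
# Caveat: zero separators are allowed, so "a b" would also match "ab".
GAP = rf"(?:\s|{NOTE})*"


def token_pattern(token):
    """Match a token either bare or wrapped as \\w token|lemma\\w*."""
    t = re.escape(token)
    return rf"(?:\\w\s+{t}(?:\|[^\\]*)?\\w\*|{t})"


def find_quote(text, quote):
    """Return a match object covering `quote` in `text`, or None.

    `quote` is plain text; `text` may contain embedded USFM markup.
    """
    # Split the quote into words and single punctuation characters, so
    # that punctuation can sit outside a \w ...\w* wrapper.
    tokens = re.findall(r"\w+|[^\w\s]", quote)
    if not tokens:
        return None
    pattern = GAP.join(token_pattern(t) for t in tokens)
    return re.search(pattern, text, flags=re.DOTALL)


if __name__ == "__main__":
    # Invented sample line, for illustration only.
    sample = (r"\v 49 The \w angel|angels\w* will come"
              r"\f + \fr 13.49 \ft Or: go out\f* and separate the evil.")
    m = find_quote(sample, "The angel will come and separate the evil.")
    if m:
        print(m.start(), m.end())
        print(m.group(0))
```

The match gives the start and end offsets of the passage in the marked-up text, which is the piece a whole-string CC search cannot supply. Other embedded character markers would need their own alternatives added to the pattern.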

BUT, before I attempt to make a Custom Script to do this, I’m wondering if anyone else has done something similar already. I don’t want to re-invent the wheel! Looking at the sample scripts shipped with Paratext, the closest thing I see is TransferParallelPassageRefs.py by DRM.

Does anyone else have something similar, or is anyone else more gifted at Python programming with ScriptureObjects who could pull this together better than I ever could? I’m willing to write the pseudo code if that’s helpful. And I’m willing to learn and work with someone else on this…

Paratext by (2.4k points)

2 Answers

0 votes

If you figure this out, I’d really like to see it.
But it would make all our lives easier if PT would allow \wj … \wj* to span across verse numbers, footnotes, and xrefs. It is tedious (and error-prone) to start and stop the \wj markings around things that have nothing to do with the actual words of Christ. (In fact, it would be nice if the \wj style went across paragraph marks too, and only threw an error if the end of a chapter was reached without seeing a \wj*.)
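Until then, the stopping and restarting can at least be scripted. Below is a rough sketch in plain Python 3 (not a Paratext script). The function names `wrap_wj` and `wrap_run` are hypothetical. It assumes you already have the span of text spoken by Jesus as a string, and it closes and reopens the Words of Jesus markup (\wj … \wj*) around verse markers, footnotes and cross-references. Paragraph markers are not handled.

```python
import re

# Things that must stay outside the \wj ...\wj* character style.
INTERRUPTION = re.compile(
    r"\\f\s.*?\\f\*"      # footnote: \f ... \f*
    r"|\\x\s.*?\\x\*"     # cross-reference: \x ... \x*
    r"|\\v\s+\S+\s*",     # verse marker, its number, trailing space
    re.DOTALL,
)


def wrap_run(run):
    """Wrap one uninterrupted run of text, keeping outer whitespace outside."""
    stripped = run.strip()
    if not stripped:
        return run  # nothing to mark up
    lead = run[:len(run) - len(run.lstrip())]
    trail = run[len(run.rstrip()):]
    return f"{lead}\\wj {stripped}\\wj*{trail}"


def wrap_wj(span):
    """Mark up `span`, stopping and restarting \\wj around interruptions."""
    pieces = []
    pos = 0
    for m in INTERRUPTION.finditer(span):
        pieces.append(wrap_run(span[pos:m.start()]))
        pieces.append(m.group(0))  # copy the interruption unchanged
        pos = m.end()
    pieces.append(wrap_run(span[pos:]))
    return "".join(pieces)


if __name__ == "__main__":
    # Invented sample text, for illustration only.
    sample = (r"Follow me,\f + \fr 4.19 \ft Or: Come after me\f* "
              r"and I will make you fishers of men. \v 20 Go.")
    print(wrap_wj(sample))
```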

by (363 points)

I doubt if this will ever happen since \wj … \wj* is a character style and it would require redefining character style behavior. You could, however, consider creating a custom paragraph style in the custom.sty stylesheet (see Paratext Helps for more info on custom stylesheets).

Blessings,

Shegnada James

+1 vote

There is a series of tools at https://lingtran.net/Voice-Marking-Tools that can help with this process.

by (8.0k points)
