0 votes
I need to find out what percentage of each book has been drafted. A count of how many verses in each book have text in them would be close enough. Can anybody write a regex that would do this?

Thanks for any help you can give me on this.

John Nystrom
Paratext by (296 points)

2 Answers

0 votes
Is something as simple as this what you're looking for, or are you looking for something more complex?
\\v \d+ .

In PT, . doesn't match carriage returns, so I think that would work. Otherwise:
\\v \d+ \w

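But that wouldn't catch verses that start with punctuation, such as an opening quotation mark. (\w matches letters, digits, and underscores, so verses that start with numbers are still counted.)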
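To see how the two patterns differ, here is a minimal sketch in Python rather than in PT itself. The sample text is made up for illustration, and Python's regex engine is only a stand-in, so PT may behave differently in edge cases.

```python
import re

# Made-up USFM sample: verse 2 is empty (a space, then the line ending),
# verse 3 starts with punctuation, verse 4 starts with a digit.
sample = "\n".join([
    r"\v 1 In the beginning",
    r"\v 2 ",
    r'\v 3 "Let there be light"',
    r"\v 4 40 days and nights",
    "",
])

# Verse marker followed by any character other than a newline.
any_char = re.compile(r"\\v \d+ .")
# Verse marker followed by a letter, digit, or underscore.
word_char = re.compile(r"\\v \d+ \w")

any_char_count = len(any_char.findall(sample))    # verses 1, 3 and 4
word_char_count = len(word_char.findall(sample))  # verses 1 and 4

print(any_char_count, word_char_count)
```

In this sample only the . version counts verse 3, because its first character is a quotation mark.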
by (1.8k points)
mjames,

Thank you for this. This does count non-empty verses. I'm trying to answer these questions: What percentage of the OT is drafted in this project? What percentage of the NT is drafted? It's easy to do when I know an entire book is drafted, but the team I'm most concerned about is doing an OT panorama, so they have drafted a bit here and a bit there.

I would love to be able to run something one time that would count the non-empty verses in all books and give me the results by book and a total so I don't have to do this 66 times for each project for which I need this info. So yes, I am looking for something a bit more complex and I understand this is not a simple thing. I was hoping maybe somebody had already done this or knows of an existing tool that can do it.
Do you need the information at both a book level and also at the testament level? If you're mainly concerned about the whole OT or NT, why not just limit the books you're searching in PT to one of those two options? So search the whole NT for that regex, not just one book?

I think if you wanted to get a result which listed each book's count and a total count, all in one single command, then it would probably require an external script. I don't think PT has the capability to give multiple results like "each book count + a total count".
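As a rough idea of what such an external script could look like, here is a minimal Python sketch. It assumes the project folder holds one USFM file per book with an .SFM extension and an \id line giving the book code. The function names and the file pattern are hypothetical, so adjust them to the project's actual naming. It reports counts only; turning them into percentages would need the expected number of verses per book, which this sketch does not include.

```python
import re
import sys
from pathlib import Path

# A verse marker, a verse number, a space, then at least one character
# that is not a line ending (same idea as the regex suggested above).
VERSE_WITH_TEXT = re.compile(r"\\v \d+ [^\r\n]")
# Book code from the \id line, e.g. GEN or 1CO.
BOOK_ID = re.compile(r"\\id (\w{3})")


def count_drafted_verses(usfm_text):
    """Count verse markers that have some text after them on the same line."""
    return len(VERSE_WITH_TEXT.findall(usfm_text))


def count_by_book(project_dir, pattern="*.SFM"):
    """Return {book code: non-empty verse count} for every matching file."""
    counts = {}
    for path in sorted(Path(project_dir).glob(pattern)):
        text = path.read_text(encoding="utf-8")
        match = BOOK_ID.search(text)
        # Fall back to the file name when no \id line is found.
        book = match.group(1) if match else path.stem
        counts[book] = counts.get(book, 0) + count_drafted_verses(text)
    return counts


def print_report(counts):
    """Print one line per book followed by a grand total."""
    for book, count in counts.items():
        print(f"{book}\t{count}")
    print(f"TOTAL\t{sum(counts.values())}")


if __name__ == "__main__" and len(sys.argv) > 1:
    print_report(count_by_book(sys.argv[1]))
```

It would be run with the project folder as its only argument. Like the regex it is built on, it misses verse bridges such as \v 1-2 and verses whose text starts on the line after the marker, so treat the numbers as approximate.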
Yes, to answer the question somebody else asked me, which is what got me started on this (how much of the OT is drafted), I can just use that regex and run it on the OT. But it would be useful to me to know the info on the book level as well. If such a tool existed, I would use it periodically in several projects.
When I do this in the Paratext project, I get 16,886 items. When I do it in RegexPal, I get 16,677. I don't know why I get this discrepancy. For what I need the number for, it's not a big deal, but I think it's odd that it's not the same number.
Were you using the regex with the . in it? My guess is that in PT . doesn't match line endings, but in RegexPal it does. This means that if you ever have
\v_# (with no space after the # and a carriage return immediately following)
then PT will not count it (since . doesn't match the carriage return) but RegexPal will count it (since . does match the carriage return).
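The effect of that setting can be illustrated with a small Python sketch. Python stands in for both tools here, so this shows only how a "dot matches line endings" option changes a count, not what PT or RegexPal actually do. The sample text is made up.

```python
import re

pattern = r"\\v \d+ ."

# Verse 2 has a space and then the line ending; verse 3 has no space at all.
text = "\n".join([
    r"\v 1 In the beginning",
    r"\v 2 ",
    r"\v 3",
    r"\v 4 and it was so",
    "",
])

# By default, . does not match a newline, so the empty verse 2 is skipped.
default_count = len(re.findall(pattern, text))

# With DOTALL, . also matches the newline, so verse 2 is counted as well.
dotall_count = len(re.findall(pattern, text, flags=re.DOTALL))

print(default_count, dotall_count)
```

In this sketch verse 3 is not counted under either setting, because the pattern requires a space after the verse number. The difference between the two counts comes from verse 2, where a space is followed directly by the line ending.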
0 votes

A single regex may not be enough. You could try to adapt a script from the Custom Tools menu, i.e. the My Paratext Projects/cms/ subdirectory.

by (834 points)