0 votes
I'd like to check all numbers in our translation, whether they are written in words (e.g. twenty-seven) or in digits (e.g. 27).

Is there a way to get a list of all these numbers, e.g. in the Biblical Terms?

Thanks for any help!
Paratext by (234 points)

2 Answers

0 votes

There are several ways to do this, depending on exactly what your end goal is, but here's one way to get a scrollable list of verse snippets containing a number.

Open the Find/Replace dialog (Ctrl-F) and select the Basic Search tab. Type this into the Find: box:

regex:(?<!\w|\\[cv] )(one|two|three|four|five|six|seven|eight|nine|[0-9])

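Select Match in: Verse text, check Show Verse Context, then click Find.

The Regular Expression above will find "one" OR "two" OR "three" .... OR a digit between 0 and 9, but only if the match does not directly follow a word character (\w, which covers letters, digits, and the underscore) and does not directly follow a "\c " or "\v " marker. The second condition keeps chapter and verse numbers out of the results.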
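
To see what the expression accepts and rejects outside Paratext, here is a minimal Python sketch. One assumption to flag: Python's re module only allows fixed-width lookbehinds, so the single (?<!\w|\\[cv] ) is split into two lookbehinds that together mean the same thing. The helper name find_numbers and the sample string are made up for illustration.

```python
import re

# Python's re needs fixed-width lookbehinds, so the combined
# (?<!\w|\\[cv] ) from the Find box is split into two equivalent ones.
NUMBER = re.compile(
    r"(?<!\w)(?<!\\[cv] )"
    r"(one|two|three|four|five|six|seven|eight|nine|[0-9])"
)

def find_numbers(text):
    """Return every number word or single digit the pattern matches."""
    return [m.group(0) for m in NUMBER.finditer(text)]

# Made-up sample with chapter/verse markers and some near misses.
sample = r"\c 3 \v 16 He sent two men and 7 bones; someone saw sixty."
print(find_numbers(sample))  # ['two', '7', 'six']
```

The chapter and verse numbers are skipped, and so are "bones" and "someone", but "six" is still picked out of "sixty" because nothing in this version checks what follows the match.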

By selecting Match in Verse text the Find will skip all footnotes, cross-references, and figure captions. You may want to Match in All text instead.

You may also find it helpful to select Project settings ... Number settings and check the listed rules, then Run Basic Checks with Numbers selected.

The Wordlist tool may also be helpful for you.

by (594 points)

Thank you very much for this!

I just tried it out, but it doesn't quite do what I would like yet.

Here is my RegEx, as I adapted it:
regex:(?<!\w|\\[cv] )(kopo|menas|misika|foa|faev|siks|seven|eit|nain|ten|ileven|tuelv|tetin|fotin|fiftin|seventin|eitin|naintin|tuenti|teti|foti|fifti|siksti|seventi|eiti|nainti|andled|taosen|[0-9])

As it is, this finds far more instances than it should, for example, verb prefixes starting with ‹ten›. It would be great if we could limit the matches to number words that are:
- at the beginning of a word or preceded by a hyphen
- and at the end of word or followed by a hyphen

On the other hand, since we have grammatical tone on clause level, most of the vowels can end up with a diacritic (áàāa᷄â), but such occurrences are not found. It seems that whenever I use RegEx in Paratext's Find, Ignore Diacritics And Vowel Points does not work, so Paratext doesn't find ‘sevén’, for example.

Any idea how I could solve these issues?

Thanks again!

PS: I'm aware of the Numbers Settings and it is all set up for the NT, but there doesn't seem to be a way to include numbers written in words. (Or is there?)

A few more ideas:

If your project uses all composed characters, then you could change each "a" in your RegEx find to [áàāa᷄â]. If you use all decomposed characters then you could do something like a[`'^]. But your list above has a mix of composed and decomposed forms, so the RegEx would get pretty complicated.
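
One way around the composed/decomposed mix, at least when testing outside Paratext, is to decompose the text first (Unicode NFD) and let every vowel be followed by any number of combining marks. The sketch below is only that, a sketch: the function names, the mark ranges chosen, and the sample string are assumptions for illustration, and whether Paratext's Find accepts an equivalent pattern is not tested here.

```python
import re
import unicodedata

# Combining Diacritical Marks (U+0300-U+036F) and the Supplement block
# (U+1DC0-U+1DFF), which contains U+1DC4 COMBINING MACRON-ACUTE.
MARK_RANGES = "\u0300-\u036f\u1dc0-\u1dff"

def tolerant(word):
    """Let every vowel in word be followed by any number of combining marks."""
    return re.sub("[aeiou]", lambda m: m.group(0) + "[" + MARK_RANGES + "]*", word)

def build_pattern(words):
    """Whole-word pattern; spaces, hyphens and punctuation count as edges."""
    edge = "[\\w" + MARK_RANGES + "]"
    body = "|".join(tolerant(w) for w in words)
    return re.compile("(?<!" + edge + ")(" + body + ")(?!" + edge + ")")

def find_words(text, words):
    """Search the decomposed (NFD) text and return the matches recomposed."""
    decomposed = unicodedata.normalize("NFD", text)
    pattern = build_pattern(words)
    return [unicodedata.normalize("NFC", m.group(0))
            for m in pattern.finditer(decomposed)]

# Made-up test string; \u00e9 is a precomposed e with acute.
sample = "sev\u00e9n man, tenkiu, foti-faev"
print(find_words(sample, ["seven", "ten", "foti", "faev"]))
```

Because the text is decomposed before searching, it does not matter whether an accent was typed as a composed or a decomposed character.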

You could make a copy of your project in which you remove all accents and then do the searches in there.
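
If the copy's text can be processed as plain text (an assumption; how you export or edit the project files is up to you), stripping the accents could look roughly like this in Python. The function name strip_marks is hypothetical. Note that it removes every nonspacing mark, so it also folds distinctions such as ä versus a.

```python
import unicodedata

def strip_marks(text):
    """Decompose text (NFD) and drop all nonspacing combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed
                   if unicodedata.category(ch) != "Mn")

# \u00e9 is a composed e-acute; \u1dc4 is a combining macron-acute.
print(strip_marks("sev\u00e9n na\u1dc4in"))  # seven nain
```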

Regarding "Ignore diacritics", note what Paratext Help says:

If you select Ignore Diacritics And Vowel Points, Paratext treats any character variations with diacritics (including accent marks, vowel points, or breathing marks) as equivalent to the characters entered in the Find box.
For example: enter "man", and Paratext will find "man" or "mán" or "män". The entries in the Alphabetic Characters tab of the Language Settings for the language used in the project define how this option works. Characters found on the same line in the Alphabetic Characters tab are considered equivalent (e.g. a á ä).
Note the "Add to List" and "Remove from List" options in the Find dialog. I've not had occasion to use these, but they could simplify your searches.
Adding a (?!\w) to the end of the RegEx tells it to ignore any match that is immediately followed by another word character, while still allowing matches that are followed by a space, a hyphen, or punctuation. This would skip the "ten" verb prefix. Because \w also covers digits, the digit alternative should become [0-9]+ at the same time; otherwise a multi-digit number such as 27 would no longer match.
For testing purposes, I use this expression in the count/extract tool of RegExPal with Unique checked. This shows a short list of each distinct item found, making it easy to see that the expression is not finding anything I don't want included (such as words which contain numbers, like "tension" or "bone").
(?<!\w|\\[cv] )(one|two|three|four|five|six|seven|eight|nine|[0-9]+)(?!\w)
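
For a quick check of the bounded version outside Paratext, here is a rough Python equivalent that mimics a "unique" extract. Two assumptions to flag: the lookbehind is split in two because Python's re requires fixed-width lookbehinds, and the digit part is written as [0-9]+ so that a multi-digit number matches as a whole (since \w also matches digits, a single [0-9] between the two lookarounds would reject "27"). The name unique_matches and the sample string are made up.

```python
import re

WORDS = "one|two|three|four|five|six|seven|eight|nine"

# Not preceded by a word character or a \c / \v marker,
# and not followed by a word character.
BOUNDED = re.compile(r"(?<!\w)(?<!\\[cv] )(" + WORDS + r"|[0-9]+)(?!\w)")

def unique_matches(text):
    """Sorted list of the distinct matches, like a 'unique' extract."""
    return sorted({m.group(0) for m in BOUNDED.finditer(text)})

sample = r"\v 4 sixty-seven bones, 27 tents, one-third, someone"
print(unique_matches(sample))  # ['27', 'one', 'seven']
```

A hyphen is not a word character, so "seven" in "sixty-seven" and "one" in "one-third" are still found, which covers the "preceded or followed by a hyphen" cases, while "sixty", "bones" and "someone" are left out.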

+1 vote
The Biblical Terms tool has a list of Numbers used in the NT. I am not aware of an equivalent list for the OT.
by (8.4k points)