0 votes

I’m working with a team who want to write out scripture references like this:
\xt book_name rukuh # ayat #\xt*
where “rukuh” means “chapter” and “ayat” means “verse”.

In many cases we’ve just given up on getting PT to parse the names, and instead use the |ABC 1:2 style attributes to give the reference explicitly. Still, PT does a good job of parsing correctly in many cases.

To get this to work we’ve added the word “rukuh” onto the end of each book name and defined ␣ayat␣ as the Chapter/Verse separator.

First of all, is there a better way to set this up that we haven’t thought of? Or should we just give up and use English style references as an attribute in all cases?

Assuming what we’ve done is the best possible way of doing things, adding footnotes now sets the \fr field to chapter# ayat verse#, which is annoying but could be worked around. The problem is that when we run the Reference Check, we get hundreds of errors that say “Unexpected book after chapter separator”. Is there any way to get around that?

Paratext by (1.8k points)

2 Answers

0 votes
Best answer

For a project we did a couple of years ago that needed the same sort of thing, we kept the references in normal Western style, so that Paratext (and the DBL) is happy and can run all of the checks it needs to. Then, before publishing, we ran a Python script to convert them to the proper format. Here are the relevant regular expressions from the Python script (note that there is a separate conversion expression for each form of the reference: multiple chapters, with or without verse numbers, etc.):

    import re

    # Turn cross references into their long forms.
    # sText is assumed to hold the USFM text read earlier in the script.
    # These patterns target the nested form \+xt ... \+xt*. In every pattern
    # the closing \+xt* is captured inside the last group, so the replacement
    # must not append the marker a second time.

    #  Takwiin 2:1; 5:2  or  Takwiin 2:1-5; 5:6-9
    sText = re.sub(r'(\\\+xt (\d )?[A-Za-zʼ -]+) (\d+):(\d+(-\d+)?); (\d+):(\d+(-\d+)?\\\+xt\*)',
                   r'\1 fasul \3 aaya \4 wa fasul \6 aaya \7', sText)
    #  Takwiin 2; 4
    sText = re.sub(r'(\\\+xt (\d )?[A-Za-zʼ -]+) (\d+); (\d+\\\+xt\*)',
                   r'\1 fasul \3 wa fasul \4', sText)
    #  Takwiin 2:1
    sText = re.sub(r'(\\\+xt (\d )?[A-Za-zʼ -]+) (\d+):(\d+\\\+xt\*)',
                   r'\1 fasul \3 aaya \4', sText)
    #  Takwiin 2:1-5
    sText = re.sub(r'(\\\+xt (\d )?[A-Za-zʼ -]+) (\d+):(\d+-\d+\\\+xt\*)',
                   r'\1 fasul \3 aaya \4', sText)
    #  Takwiin 2  or  Takwiin 1–2 (chapter range written with an en dash)
    sText = re.sub(r'(\\\+xt (\d )?[A-Za-zʼ -]+) (\d+(–\d+)?\\\+xt\*)',
                   r'\1 fasul \3', sText)

If I were doing the same typesetting today, I would use PTXprint, and put those changes in the Changes.txt file.

Note that there may be chapter/verse expressions that aren’t handled by these expressions; this was for a specific project and I knew what the range of all of the reference expression formats was. For example, I don’t believe this would handle three chapter numbers, e.g. “bookname ch1; ch2; ch3”. If you have that in your project, you would need to add a new expression, or find a way to extend the existing expressions (probably the former is easier…).
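One way to extend the expressions to any number of chapter/verse parts is to match the whole list once and build the long form in a replacement function. The following is only a sketch, not part of the original script: the names `PART`, `XT_REF` and `to_long_form` are made up for illustration, and joining every part with “wa” is an assumption about how the language would write three or more references.

```python
import re

# Labels taken from the expressions above; adjust for your project.
CHAPTER_WORD = "fasul"
VERSE_WORD = "aaya"
JOINER = "wa"  # assumption: the same joiner is used between every pair of parts

# One part: chapter (or en-dash chapter range), optionally ":verse" or ":verse-verse".
PART = r"\d+(?:–\d+)?(?::\d+(?:-\d+)?)?"

# A nested cross reference: \+xt <book> <part>; <part>; ...\+xt*
XT_REF = re.compile(
    r"(\\\+xt (?:\d )?[A-Za-zʼ -]+) "        # marker and book name
    r"(" + PART + r"(?:; " + PART + r")*)"   # one or more parts separated by "; "
    r"(\\\+xt\*)"                            # closing marker
)


def _long_part(part):
    """Turn '2:1-5' into 'fasul 2 aaya 1-5', and '2' into 'fasul 2'."""
    chapter, _, verses = part.partition(":")
    text = f"{CHAPTER_WORD} {chapter}"
    if verses:
        text += f" {VERSE_WORD} {verses}"
    return text


def to_long_form(usfm):
    """Rewrite every Western-style \\+xt reference in usfm to the long form."""
    def repl(match):
        parts = match.group(2).split("; ")
        joined = f" {JOINER} ".join(_long_part(p) for p in parts)
        return f"{match.group(1)} {joined}{match.group(3)}"
    return XT_REF.sub(repl, usfm)


print(to_long_form(r"\+xt Takwiin 2:1; 5:2-4; 7\+xt*"))
```

Because the replacement is a function, text that has already been converted is left alone on a second run: after the book name the pattern expects a digit, not the chapter word.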

Note also that when you look at your references in text coming out of the DBL, e.g. in YouVersion, you will just get the Western reference format, not your “rukuh # ayat #”. That’s because YouVersion doesn’t have any sophistication around processing and formatting text: it just dumps it out in HTML, and hopes it looks good. But if you create Scripture apps with SAB, you can run similar regular expressions to get that same desired behavior.

@anon175865 In some ways this YouVersion problem is similar to the problem of messing up line breaks when there are spaces around punctuation. Maybe we should have a “suggested typesetting rules” file that gets attached to the project and stored with the project in the DBL. Then if YouVersion applied those changes when it’s displaying a text, it might do a better job. That would in a sense allow a difference between the data format and the presentation format of the text. And then, depending on the type of presentation, you may need to choose what sorts of rules to apply. Anyway, just greenlighting a bit there…

Hope that helps,
jeffh

by (1.3k points)

Thank you very much for that suggestion. It seems like a better work-around than our current one. I’ve suggested it to the team and will see if they’re happy with it (the one disadvantage is that they will need to get used to seeing one thing in PT but expecting another thing in the final print/app).

Just a reminder that with PTXprint it is now possible to quite easily produce your final form output, so don’t be afraid to produce drafts and evaluate them in PDF or printed form. Obviously the translators (or at least one of them) will need to be able to work with/keyboard the Western format of the reference, but you should be able to produce drafts in the other format for evaluation by the team, your committee, etc. This has the added advantage of testing your final format earlier in the process, so everyone gets used to it and/or makes their comments on the format early on.

@jeffh The issue with display on our publishers’ platforms is not that they “just dump it out in HTML, and hope it looks good”. My goodness!! You have no idea how much effort YouVersion, in particular, puts in to get things the way we want them. The bundle that goes to them is produced by our Paratext converter, and is USX, not HTML (Paratext is also where the no-break spaces are stripped out). I believe that, because of the nature of USX, individualization is not permitted. I do know that where we have had individual issues that could be corrected at YouVersion’s end, they have gone above and beyond what they needed to do to make things right. I would hope that we could celebrate and not denigrate partnerships in our line of work. Your suggestion for typesetting rules should probably be made to the Paratext team, as they control the content of the bundles that our publishers receive.

@anon175865 You are absolutely right. My comment was very unfair, and I apologize for it. A little bit of frustration leaking through, from all of the times that I’ve seen French Bible texts in YouVersion with a bad line break… As you suggest, the problem is really on the DBL end, and hopefully the efforts being made in Paratext and DBL will address some of those issues. Forgive me for my careless comment.

@jeffh Thanks jeffh. I understand your frustration, and will continue to work to make things better on the DBL end.

0 votes

I’m interested in what you decide to do. If you keep the work-around to the Scripture Reference Settings, I’d appreciate it if you would add me as a Consultant to the project so I can test whether or not the converter that Paratext uses to convert the text from USFM to USX for distribution through the DBL will accept what you’ve done. I suspect it will not. I suppose a work-around for the footnotes would be to Deny all of the “Unexpected book…” errors, and add a changes.txt file to the project folder where you allow Paratext/PubAssist to convert the footnote references back to the standard format for printing by removing ayat.
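To illustrate the kind of change meant here, this is a minimal Python sketch of the regular expression involved. It assumes the \fr field currently reads “chapter# ayat verse#” as described in the question, and that a colon is the standard separator you want back; `fr_to_standard` is a made-up name. A changes.txt file has its own syntax, so only the pattern carries over, not the Python around it.

```python
import re

# Matches e.g. "\fr 2 ayat 3" (assumed layout of the generated \fr field).
FR_LOCAL = re.compile(r"(\\fr \d+) ayat (\d+)")


def fr_to_standard(usfm):
    """Rewrite '\\fr 2 ayat 3' as '\\fr 2:3', leaving everything else untouched."""
    return FR_LOCAL.sub(r"\1:\2", usfm)


print(fr_to_standard(r"\f + \fr 2 ayat 3 \ft A note.\f*"))
```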

by (192 points)

I was hoping to avoid Denying hundreds of errors. However, it just dawned on me that you can do that in bulk, by selecting multiple errors with Shift or Ctrl and then denying them. So maybe that is just what we will end up doing.

We’re not actually (currently) planning on using the \fr data at all, so theoretically it could be stripped out altogether. I’ve never really understood the purpose of that field anyway since you could presumably recreate it on the fly with a simple programming script. But if we were to ever submit to the DBL or otherwise use the \fr field, we would almost certainly use changes.txt or the like to edit the formatting.
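For what it’s worth, here is a rough Python sketch of what “recreating it on the fly” might look like: it walks through the USFM, remembers the most recent \c and \v numbers, and overwrites each \fr with them. The name `regenerate_fr` and the “chapter:verse” output format are assumptions for illustration only, and it ignores things like verse bridges (only the first verse number is used) and footnotes that appear before the first verse.

```python
import re

# Either a chapter marker, a verse marker, or an existing \fr field
# (everything up to the next backslash marker).
TOKEN = re.compile(r"\\c (\d+)|\\v (\d+)|\\fr [^\\]*")


def regenerate_fr(usfm):
    """Replace the content of every \\fr with the current chapter:verse."""
    chapter = verse = "0"

    def repl(match):
        nonlocal chapter, verse
        if match.group(1):            # \c N: remember the chapter
            chapter = match.group(1)
            return match.group(0)
        if match.group(2):            # \v N: remember the verse
            verse = match.group(2)
            return match.group(0)
        return f"\\fr {chapter}:{verse} "   # rewrite the \fr field

    return TOKEN.sub(repl, usfm)


sample = "\\c 2\n\\v 3 Text\\f + \\fr 2 ayat 3 \\ft A note.\\f* more text"
print(regenerate_fr(sample))
```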
