Quotes in Arabic script

Question

I have an Arabic script project which has 3 levels of quotes:
« ‹ “ ” › »
These have the following Unicode code points:
\u00AB \u2039 \u201C \u201D \u203A \u00BB

The top two levels reverse themselves automatically in my RTL project in Paratext. So for example, I use \u00AB to open a quote, but it looks like this on the screen: ». This is generally normal behavior in RTL, and I’m fine with that.

But the third level quotes are not reversing in the same way. In order to get this in Paratext:

I need to use \u201D on the right (the “opening” quote) and \u201C on the left (the “closing” quote).

It is true that in the Unicode standard U+201C is LEFT DOUBLE QUOTATION MARK and U+201D is RIGHT DOUBLE QUOTATION MARK. But the guillemets are named in the same way (using LEFT-POINTING and RIGHT-POINTING), and yet they do reverse.
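One way to see why the two kinds of quotes behave differently is to look at the Unicode Bidi_Mirrored property, which is what bidi-aware renderers consult when deciding whether to flip a glyph in RTL text. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `mirror_report` is made up for illustration, and the result reflects the Unicode data bundled with Python, not anything specific to Paratext or XeTeX.

```python
import unicodedata

# The six quote characters from the project, in the order listed above.
QUOTES = "\u00AB\u2039\u201C\u201D\u203A\u00BB"


def mirror_report(chars):
    """Return (code point, Unicode name, Bidi_Mirrored flag) per character.

    mirror_report is a hypothetical helper name used only for this sketch.
    """
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), bool(unicodedata.mirrored(ch)))
        for ch in chars
    ]


for code, name, mirrored in mirror_report(QUOTES):
    print(code, name, "mirrored" if mirrored else "not mirrored")
```

In the Unicode data, the four guillemets carry Bidi_Mirrored=Yes while U+201C and U+201D do not, which would be consistent with the guillemets flipping on screen and the curly quotes staying as typed.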

However, it appears that my XeTeX publishing process DOES reverse them, so if they appear properly in Paratext, they get reversed in the PDF output. That’s not a problem - I have a processing step in between them, and I can swap the characters if I need to.

BUT MY QUESTION is what is the CORRECT form to have in Paratext? The Scripture text data will be used in YouVersion and elsewhere, and it would be helpful if those quotes are turned around the right way in those outputs as well. Should I assume that if it appears correctly in Paratext, that it will appear correctly in YouVersion? Or should I use U+201C as the opening quote, even though it doesn’t appear properly in Paratext, because down-the-road outputs will do the right thing?

Paratext Jul 29, 2019 asked by jeffh (1.4k points)
Jul 29, 2019 reshown

5 Answers

jeffh · Answer 1 · 2019-07-29T17:12:15+0000

Interestingly enough, using Times New Roman font in LibreOffice Writer, with these quotes:

if I turn it into a RTL paragraph, it of course jumps to the right side of the page, but this is what I get for the quotes:

So does that mean that U+201C and U+201D are not intended to automatically swap like other quote pairs? That I should just use the desired surface form?

But that doesn’t align with what I saw in XeTeX, where it did actually reverse the surface forms I had put into Paratext, and made them backwards.

Some more food for thought…

jeffh · Answer 2 · 2019-07-29T19:04:50+0000

OK, so you are saying that from a data perspective, if we use those characters for quotes, you think we should use U+201D to open and U+201C to close these RTL quotes, correct? That certainly makes them appear correctly in Paratext. Do you think that YouVersion and other folks that might eventually use this text data will also make them appear correctly in their outputs?

Even if XeTeX is wrong in the way it handles these characters, I can easily pre-process my texts to modify them and make the typesetting come out the right way. So there is no problem there. But I wanted to make sure that the text data that goes into the DBL from Paratext is as “correct” as possible.
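For illustration, a pre-processing step of the kind described here could be as small as a character-for-character swap. This is only a sketch under assumptions: the function name `swap_curly_double_quotes` is hypothetical, it touches only U+201C and U+201D, and whether a swap is needed at all depends on what the downstream tool actually does.

```python
# Translation table that exchanges the two curly double quotes.
# Hypothetical helper for this sketch; not part of Paratext or XeTeX.
_SWAP_TABLE = str.maketrans({"\u201C": "\u201D", "\u201D": "\u201C"})


def swap_curly_double_quotes(text: str) -> str:
    """Exchange U+201C and U+201D everywhere in text; leave all else alone."""
    return text.translate(_SWAP_TABLE)
```

Because the swap is its own inverse, running it twice returns the original text, so the Paratext data can stay in whichever form is judged correct while the typesetting copy is adjusted separately.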

jeffh · Answer 3 · 2019-07-29T20:07:08+0000

(Sheepish reply) OK, so XeTeX isn’t handling the quotes incorrectly. It turns out that in my XeTeX setup file I have code (written a while back, which I forgot was in there) that swaps the left and right quotes. So I’m pretty sure that if I fix up the XeTeX setup file, I should be all set.

I think, however, that it was still useful to have had this discussion, as it has made some things clearer in my mind. Thanks.
