
I’m having a problem with how the text of a resource is displayed in a Text Collection. To see this yourself, download the resource ShuLatn, language Arabic (shu-Latn), and add it to a Text Collection window. Click on the blue ShuLatn link in the list of texts, and the expanded text that appears to the right no longer shows hyphens.

All of the hyphens in this text are normal U+002D hyphen characters. You can see that Paratext hasn’t simply removed the hyphens altogether, since, for example, al-makruubiin is still broken across a line at the hyphen. But the hyphen itself is not displayed.
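For anyone who wants to confirm which hyphen character a text actually contains, here is a minimal Python sketch. It is not part of Paratext; the function name `hyphen_report` and the set of hyphen-like characters are my own choices for illustration, and the sample word is copied from the text above.

```python
import unicodedata

# Hyphen-like characters worth distinguishing (an illustrative, not exhaustive, set).
HYPHEN_LIKE = {"\u002D", "\u00AD", "\u2010", "\u2011", "\u2012", "\u2013", "\u2212"}

def hyphen_report(text):
    """Return (code point, Unicode name) for each hyphen-like character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
        if ch in HYPHEN_LIKE
    ]

# Sample word taken from the resource text discussed above.
print(hyphen_report("al-makruubiin"))
```

If the report shows only U+002D, the missing hyphens are not caused by a look-alike character such as a soft hyphen (U+00AD).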

I see in the language properties that the hyphen is listed as a word break character (rather than, say, as punctuation inside a word), but if that were the only problem, then the hyphen should at least appear when a word is broken across a line.

I don’t have direct access to this project (which is in the DBL), but if the solution requires a change to the project, I believe we could get it changed.

Paratext by (1.3k points)

3 Answers

Best answer

Actually, now that I read the word break definition in the Guide, I think this behavior makes sense:
“If the script of your project does not normally indicate word breaks with a space, enter the character here that you will use in your project text to indicate word breaks. Immediately following the character in the box, enter (00AD) – include the parentheses – if you want a hyphen to appear when a long line wraps.”

So I think I should remove the hyphen from the word break characters field. My guess is that if I add it to the punctuation inside a word field, Paratext would treat “l-iid” and “al-waali” each as a single word. It might be better not to list it anywhere, in which case I imagine it would simply be treated as ordinary punctuation. (It’s not in the alphabetical characters list.) Does this sound like an accurate analysis?
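To make the three options concrete, here is a rough Python sketch of how the word list might differ under each setting. This is only my mental model, not Paratext's actual algorithm; the function `word_list` and its parameters `word_break` and `word_medial` are hypothetical names I made up for the illustration.

```python
import re

def word_list(text, word_break="", word_medial=""):
    """Hypothetical tokenizer illustrating the settings (NOT Paratext's real code).

    word_break:  characters treated as word separators, like a space.
    word_medial: punctuation allowed inside a word, joining letters on both sides.
    """
    # Word break characters act as separators, so replace them with spaces.
    for ch in word_break:
        text = text.replace(ch, " ")

    letters = r"[^\W\d_]+"  # a run of Unicode letters
    if word_medial:
        medial = re.escape(word_medial)
        # Letter runs may be joined by a single word-medial punctuation character.
        pattern = rf"{letters}(?:[{medial}]{letters})*"
    else:
        pattern = letters
    return re.findall(pattern, text)

sample = "al-waali fi l-iid"
print(word_list(sample, word_break="-"))   # hyphen as word break character
print(word_list(sample, word_medial="-"))  # hyphen as punctuation inside a word
print(word_list(sample))                   # hyphen listed nowhere
```

In this sketch the first and third calls give the same word list ("al" and "waali" as separate words), and only the word-medial setting keeps "al-waali" together. It says nothing about display; the missing hyphen on screen is a separate effect of the word break setting.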

by (1.3k points)

jeffh - if you want the hyphen to always appear in the word then you need to add it in the box for Word Medial Punctuation. If you don’t include it in Word Medial Punctuation then you will not see these words in the word list.


In this language, the hyphen basically separates two words, but that’s the way they are written in the orthography. E.g. “al-raajil” means “the man”. So having “al” as a separate word from “raajil” in the word list shouldn’t really be a problem.

by (1.3k points)

To my knowledge, the only orthographies that need these characters are in SE Asia (Thai, Lao, Burmese, and Khmer scripts). Perhaps there should be a warning in the language settings that if a character is put in the Word Break Character field, it will not be visible in Preview mode or in typesets.

by (1.8k points)
