0 votes

One of our users pointed out that the hyphenation markers in the ThSV_11 resource are shown in all views. In Paratext 7.5 we could hide those with some CCT files, but we cannot drop those into the .p8z file in Paratext 8.

What can we do to hide these?

Paratext by (476 points)

3 Answers

0 votes
Best answer

I did not see this question until just now.

In P8 this is controlled by “Project > Language Settings > Other Characters > Word break characters”.

If your project uses / and ^ to mark places where a line can break without a hyphen, and = to mark places where a line can break with a hyphen, enter the following in this field:

=(00AD) /(200B) ^(200B)

If you do this, when you switch to Preview view you will no longer see = / ^. Each / and ^ becomes a ZWSP (zero width space, U+200B, which is invisible but allows a line break), and each = becomes a soft hyphen (U+00AD), which shows as a hyphen only if the line breaks at that point.

Since the default replacement is ZWSP this could also be entered as

=(00AD) / ^
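To illustrate what the two equivalent entries above mean, here is a minimal Python sketch. It is not Paratext's parser: the function names are hypothetical, and the parsing rules (entries separated by spaces, a single marker character, an optional hex code point in parentheses, ZWSP as the default) are assumptions inferred only from the two examples in this answer.

```python
import re

ZWSP = "\u200b"  # zero width space: the default replacement described above


def parse_word_break_field(field: str) -> dict[str, str]:
    """Turn a field such as '=(00AD) / ^' into {marker: replacement}.

    Assumed format (inferred from the examples, not from Paratext docs):
    space-separated entries, each a single marker character optionally
    followed by a hex code point in parentheses.
    """
    mapping = {}
    for entry in field.split():
        m = re.fullmatch(r"(.)(?:\(([0-9A-Fa-f]{4,6})\))?", entry)
        if m is None:
            raise ValueError(f"badly formatted entry: {entry!r}")
        marker, hex_code = m.groups()
        mapping[marker] = chr(int(hex_code, 16)) if hex_code else ZWSP
    return mapping


def to_preview(text: str, mapping: dict[str, str]) -> str:
    """Replace each marker character with its mapped replacement."""
    return text.translate(str.maketrans(mapping))


long_form = parse_word_break_field("=(00AD) /(200B) ^(200B)")
short_form = parse_word_break_field("=(00AD) / ^")
preview = to_preview("ab=cd/ef^gh", short_form)
```

Under these assumptions both entries produce the same mapping, and an entry with a missing parenthesis raises an error instead of failing silently.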

We have some issues that need to be fixed

  • If you enter characters in this field in the wrong format, you do not get an error message (on my machine at least); the OK button is simply disabled.

  • The guide does not mention that you need parentheses or how to specify multiple characters.

I will write a bug report for this.

by (646 points)

With regard to hyphenation: the Thai projects that I am aware of do not use hyphenatedwords.txt. They mark up the hyphenation for each word of the text individually. It would be good if we could figure out how to get past that, but we have not worked on it yet. The Thai texts that I am aware of also mark subdivisions within a word:

/ word boundary
^ a place where a word can break without a hyphen
= a place where a word can break with a discretionary hyphen
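As a rough illustration of this convention, the Python sketch below separates a marked-up string into its plain text and a list of break opportunities. The function name and the Latin placeholder string are hypothetical; real Thai text would be handled the same way, with offsets counted in code points.

```python
# Marker meanings as listed above.
MARKERS = {
    "/": "word boundary",
    "^": "break without hyphen",
    "=": "break with discretionary hyphen",
}


def break_points(marked: str) -> tuple[str, list[tuple[int, str]]]:
    """Return the text without markers, plus (offset, kind) for each marker.

    Each offset is the position in the plain text where the break may occur.
    """
    plain: list[str] = []
    points: list[tuple[int, str]] = []
    for ch in marked:
        if ch in MARKERS:
            points.append((len(plain), MARKERS[ch]))
        else:
            plain.append(ch)
    return "".join(plain), points


text, points = break_points("ab=cd/ef^gh")
```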

We mark places where a word can break because Thai is difficult to justify: there are few spaces to expand, and letter spacing can be expanded only by small amounts before the text starts to look loose and ugly.

It would also be good if we could figure out how to standardize these ^'s in the hyphenatedwords.txt file, since putting them all in manually takes a great deal of time.

Thank you anon451647. To repeat what I understand you to be saying: if we want to assist the copyright holder of THSV_11 in cleaning up the project, we would advise them to add =(00AD) / ^ to the Word break characters field under ‘Other Characters’, then save the project and submit the revision to the DBL.

I tried to edit this post, but couldn’t find the option and then opted to delete the post and then create a new one.

Yes, that is the change that would allow you to view the text without the extra markers in Preview view.

0 votes

Unfortunately, it is now up to the resource creators (or someone they designate) to fix their resources. :worried:

by [Expert]
(16.2k points)

So the resources will now have to be updated to match the current scheme for hyphenation. Is there a write-up on the current scheme? I imagine that we'll be the ones to get this updated.

0 votes

The typical scheme for hyphenation is that nothing is marked directly in Paratext. A hyphenatedWords.txt file is included in the project that has the possible hyphenation points marked with the equal signs.

by (8.4k points)

For information purposes, this file can be "edited" by using the Wordlist (View > Show Hyphenation) inside Paratext which may or may not be easier than editing the file directly.

I should clarify that this resource is in Thai Script using the standard Thai convention of not using spaces to mark word boundaries. In 7.5 and prior versions, the / would be used to mark word boundaries. You could see these / in the unformatted view, but the formatted views would have the / presented as a zero width space.

So Part 1 of the question was, or ought to have been: how do we handle the = in this resource? The answer is that the resource owner or their designee will have to go through and remove the = from the text itself, then work with the hyphenation tool, and then resubmit the resource to the DBL.

Part 2 of the question is, how do you handle word boundaries for languages that do not use spaces for the word boundary? Is there another place somewhere that instructs Paratext 8 to present the / as a Zero Width Space?

I can use Find/Replace to replace / with 200b (and then use Alt+X to convert 200b to the zero width space). That would work for this resource, as it is not a live project, but ongoing projects are going to have problems typing 200b and pressing Alt+X for every word break, since there are no keyboards (that I am aware of) that have a zero width space as a valid keystroke.

So question 3 would then be: how does one denote where a zero width space should go in Paratext 8? Perhaps I should turn this into its own question.

There are a couple of options here.
1) If a team needs to "see" the zero width space then you would continue to use / and simply change it for publication (the output process can change this).
2) You could add a line in the autocorrect.txt file that says something like:
/-->\u200b

Then if the user types a / it simply autocorrects this to a zero width space.
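The effect of such a rule can be sketched in Python as follows. This only illustrates the idea and is not how Paratext processes autocorrect.txt: the function names are hypothetical, and the `typed-->replacement` syntax with `\uXXXX` escapes is assumed from the single example line above.

```python
import re


def parse_rule(line: str) -> tuple[str, str]:
    """Split a rule like '/-->\\u200b' into (typed text, replacement).

    Assumes '-->' separates the two sides and that \\uXXXX escapes on the
    right-hand side stand for Unicode code points.
    """
    typed, sep, replacement = line.partition("-->")
    if not sep or not typed:
        raise ValueError(f"not a rule: {line!r}")
    replacement = re.sub(
        r"\\u([0-9A-Fa-f]{4})",
        lambda m: chr(int(m.group(1), 16)),
        replacement,
    )
    return typed, replacement


def autocorrect(text: str, rules: list[tuple[str, str]]) -> str:
    """Apply each rule in order to the text."""
    for typed, replacement in rules:
        text = text.replace(typed, replacement)
    return text


rule = parse_rule(r"/-->\u200b")
corrected = autocorrect("ab/cd", [rule])
```

With this approach the / never remains in the stored text, so the team would no longer be able to see the word boundaries, which is the trade-off against option 1.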

In Paratext 7.5 we had the ability to, with cct files, use the / to mark a Zero Width Space and have the / appear in the unformatted view but be hidden in the preview view. Does Paratext 8 not have that sort of capability…to show / in the unformatted view, but replace / with the Zero Width Space in the preview or formatted view?
