+2 votes

Hi all, I’d like to discuss non-breaking (no-break) spaces (U+00A0) in the age of digital publishing.

I work in a country that is officially bilingual in French and English. In English-speaking countries, non-breaking and half spaces are rare. You might find a non-breaking space between the parts of 1·Chronicles.

In contrast, most of our translators from the French-speaking regions want to follow French spacing rules. In the French-speaking world, spaces are required before or around two-part punctuation marks (; ! : ? « »). No-break spaces are also often used as a thousands separator in large numbers, like 1·000 (Number Settings - Thousands Separator gets reset - #3 by jeffh). Publishing standards vary, but the ideal candidates for these cases are the common no-break space (U+00A0) and the narrow no-break space (U+202F). They prevent a closing punctuation mark from getting bumped to the next line, especially in multiple-column layouts. These spaces keep text flowing across the world’s computers. The standard space (U+0020), thin space (U+2009), and hair space (U+200A) have varying widths, but don’t provide the necessary orphan protection.
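For anyone who wants to see the rule spelled out, here is a rough Python sketch of the French spacing described above. It is only an illustration on plain strings: the function name `french_spacing` and the choice of U+202F as the default are my assumptions, not anything Paratext or ptxPrint provides, and it is deliberately naive (it would also touch a semicolon in markup, for example).

```python
import re

NBSP = "\u00A0"   # no-break space
NNBSP = "\u202F"  # narrow no-break space

# Zero or more spaces (ordinary, no-break, narrow no-break, thin, hair).
_SPACES = "[ \u00A0\u202F\u2009\u200A]*"


def french_spacing(text: str, space: str = NNBSP) -> str:
    """Normalize the spacing around ; ! ? : « » to a single no-break space.

    Hypothetical helper: `space` is whichever no-break space the project
    has standardized on (publishing standards vary).
    """
    # ; ! ? (or a run such as "?!") directly after a word.
    text = re.sub(r"(?<=\S)" + _SPACES + r"([;!?]+)", space + r"\1", text)
    # A colon only when whitespace or the end of the text follows,
    # so references like 3:16 are left alone.
    text = re.sub(r"(?<=\S)" + _SPACES + r":(?=\s|$)", space + ":", text)
    # Opening and closing guillemets.
    text = re.sub("«" + _SPACES, "«" + space, text)
    text = re.sub(_SPACES + "»", space + "»", text)
    return text
```

Calling `french_spacing("Il dit : « oui ».")` replaces the three ordinary spaces next to the colon and the guillemets with U+202F, and passing `space="\u00A0"` switches to the full-width no-break space instead.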

Some tools choose to represent no-break spaces visually as a faint dot or a grey box. Paratext automatically replaces no-break space (U+00A0) with a ginormous tilde (~, U+007E). In current versions, entering 00A0 and pressing ALT-X replaces what should be a no-break space with simple spaces (U+0020), which is outright wrong. This giant tilde is very distracting to our translators, so most of the time francophone translators have been instructed either to use simple spaces or not to use spaces before punctuation, with the understanding that the typesetter will standardize the spaces at the last minute, following their decisions, before printing. I am fully aware that the ginormous tildes are replaced with no-break spaces in preview mode, and can be replaced via Print Draft changes or ptxPrint. A creative techie could configure ptxPrint to insert spaces before such characters, but then they wouldn’t be seen in Paratext at all.

From Paratext Documentation:

Because of this, Paratext no longer supports the use of no-break space. If you want to use no-break spaces in your text, you should enter tilde in place of the no-break space. These can be converted to no-break spaces at typesetting. Tildes entered in the text are displayed as no-break spaces in the Preview view.
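To make the quoted convention concrete, the tilde-for-no-break-space swap amounts to something like the sketch below. The function names are hypothetical and this works on plain strings only; it is not how Paratext or any typesetting tool implements the conversion.

```python
NBSP = "\u00A0"  # no-break space


def tildes_to_nbsp(text: str) -> str:
    """Typesetting-time step: every tilde becomes a no-break space."""
    return text.replace("~", NBSP)


def nbsp_to_tildes(text: str) -> str:
    """Editing-time step: every no-break space becomes a tilde."""
    return text.replace(NBSP, "~")
```

Note the built-in limitation: a tilde that was meant as a literal tilde cannot be told apart from one standing in for a no-break space, and U+202F is not covered by either step.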

If this was something rare (as it is in English), I wouldn’t be concerned, but it comes up nearly every paragraph of a francophone text. My concern is that 1) the tildes are distracting for the translators who spend their time in standard view, and for those reading over their shoulders and 2) that this restriction doesn’t hold up in this age of digital publishing and sharing.

Technically,~one~can~get~by~and~publish~via~the~intervening~tildes. The field staff’s response to the tildes, ignoring no-break spaces until typesetting, is no longer viable, because texts are no longer prepared only for printing by typesetters. Individual books are published locally, digital versions are put into the DBL and made available, and Scripture Apps are made. This means that having “final” spacing in the Paratext version is increasingly important, and Paratext still doesn’t “really” support this. (I just discovered today that narrow no-break space (U+202F) has no visual indicator, but thankfully doesn’t get tildefied.) Inconsistent display/handling of U+202F, which should be narrower, is discussed here (Deletion of Unicode 202F ("narrow no break space") in project).

LibreOffice uses a grey box to distinguish no-break spaces from normal spaces. Word’s Show/Hide feature uses an open circle for non-breaking spaces and a centered dot for normal spaces.

Would Paratext devs consider scrapping the tilde in the interface and using something more legible and less distracting? Would this require a change to USFM, or just Paratext? Translators are already used to greyish metadata in their text like USFM markers. A faint grey dot · in standard view would be helpful to distinguish special spaces and normal spaces, right? Going with the grey squares instead would show both non-breakingness and length, which I suppose is why LibreOffice chose them. This would be a big win for cross-cultural compatibility. I hope someone else working in the francophone world can weigh in. @jeffh @dhigby @anon023887 ?
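In the meantime, a team can at least audit which special spaces a text contains outside of Paratext. Here is a small Python sketch, assuming the text has been exported as a plain string; the function names and the marker characters are placeholders I picked for illustration, not an existing display convention.

```python
import unicodedata

# Special spaces to look for; the ordinary space U+0020 is left out on purpose.
SPECIAL_SPACES = {"\u00A0", "\u202F", "\u2009", "\u200A"}

# Placeholder markers (an assumption): middle dot for U+00A0, dot above for U+202F.
DEFAULT_MARKERS = {"\u00A0": "\u00B7", "\u202F": "\u02D9"}


def find_special_spaces(text: str) -> list[tuple[int, str, str]]:
    """Return (index, code point, Unicode name) for each special space."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in SPECIAL_SPACES
    ]


def reveal_spaces(text: str, markers: dict[str, str] = DEFAULT_MARKERS) -> str:
    """Swap each special space listed in `markers` for its visible marker."""
    return text.translate(str.maketrans(markers))
```

`find_special_spaces` gives a quick report for consistency checking, while `reveal_spaces` produces a throwaway copy for proofreading; neither changes the stored text.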

I understand from this post (Non-breaking space issues) that the tildefication was introduced to combat an Internet Explorer alternation problem. Even now, two consecutive spaces in a webpage require that at least one is transformed to NBSP. Yes, it is hard to visually distinguish spaces, but teams still have to standardize them.
~ Matthew_Lee
Language Technology Consultant
SIL Cameroon

To make things worse, the Paratext help treats no-break spaces as a scourge to be eradicated (see below).

Why do I see tildes in place of spaces in my text?
No-break spaces are characters that look like a space but which do not allow a line to break at that location. On opening a text with no-break spaces, if you find that those spaces appear to have been replaced with tilde characters (~), this is deliberate. Paratext makes no-break spaces visible by displaying them as tildes.
What is the least I need to know about this?
Earlier versions of Paratext occasionally and wrongly inserted no-break spaces where a normal space was needed. Therefore, if you see an occasional tilde in a place in the text which you are quite sure does not require a no-break space, you can just replace the tilde with a space.
What if I would like to take care of the problem all at once?
If you have NOT knowingly inserted any no-break spaces or tildes in your text and therefore want to remove all the no-break spaces and tildes, follow the instructions in Option 1. If you are not absolutely sure whether or not the tildes or no-break spaces were intentionally inserted, check with your CAP support person before removing all of them, otherwise you may have to manually re-enter them.

Option 1 (To get rid of all no-break spaces and tildes):

Click the tab of your project to make it the active tab.
From the Tools menu, point to Advanced and then select Replace No-Break Spaces With Normal Spaces.
Read the warning message and click Yes if you are sure you wish to continue.

If your project has been following the USFM manual and so has been manually inserting tildes either to represent no-break spaces or for some other function, follow the instructions in Option 2. Doing this sooner rather than later prevents Paratext from inserting any more occasional unwanted tildes.

Option 2 (To get rid of no-break spaces, but keep all tildes):

Click the tab of your project to make it the active tab.
From the Tools menu, point to Advanced and then select Replace No-Break Spaces With Normal Spaces But Keep Tildes.
Read the warning message and click Yes if you are sure you wish to continue.

See also: Important information about no-break spaces and tildes

11 Answers

0 votes

And one more about the “feature” that PT removes double spaces always and without any mercy:

I hate that. When editing, splicing texts, copy-pasting, and in many other situations I often have two spaces for a moment and that is good, because I am in the process of doing something. Then PT murders one innocent space. And I have to enter it again. So frustrating.

It even bothers the machine itself: because without the space (that was removed against my will) there will be of course a spelling-error when two words get fused artificially, and PT has to underline in red squigglies; and all this just heats up my office, literally.

I suffer this annoyance not once per year but hundreds of times. I love PT, have said it many times, but this aspect I deeply hate. It does not even have a turn-off-option.

What is the danger of a double-space? Will it burn down a village? Can it hurt children or livestock or pets?

Cleaning up unwanted double-spaces is such a basic and simple feature in any typesetting workflow, that there is no reason to even have this brutal-robot exist in PT. It should be removed entirely and not be complicated with options. Above is a user who has described a legit use of double-space, and they have to use a work-around because somebody invented an automatic “helper”.
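To show how small that cleanup step is, here is a minimal Python sketch that could run once at export or typesetting time instead of while someone is typing. The function name is made up for illustration and this is not a Paratext feature.

```python
import re


def collapse_double_spaces(text: str) -> str:
    """Collapse runs of two or more ordinary spaces (U+0020) into one.

    No-break and other special spaces are left untouched.
    """
    return re.sub(r" {2,}", " ", text)
```

Because it only looks at U+0020, a deliberate sequence such as an ordinary space followed by a no-break space survives unchanged.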

This is my personal opinion, and still relevant here, because it is coming from regular production work in PT and from a deep frustration of this automatic-non-respect of what I am doing during my edits. I seem to remember that I have written requests about this in the past.

And to link back to this thread above: I have a good selection of tools on my machine here. I have never encountered anything similar at all. This does not feel normal to me.

(Except when I have a fresh install of MS Office, then as a first step I spend half an hour to search and opt-out of anything automatic, because those are never helpful for my job description and my work, constantly changing between four languages. This double-space killer might exist in MS Office, but I would not know, because Office allows users to opt-out.)

hth
