Hi all, I’d like to discuss non-breaking (no-break) spaces (U+00A0) in the age of digital publishing.
I work in a country that is officially bilingual in French and English. In English-speaking countries, non-breaking and half spaces are rare. You might find a non-breaking space between the parts of 1·Chronicles.
In contrast, most of our translators from the French-speaking regions want to follow French spacing rules. In the French-speaking world, a space is required before or around two-part punctuation marks (; ! : ? « »). No-break spaces are also often used as a thousands separator in large numbers, like 1·000 (Number Settings - Thousands Separator gets reset - #3 by jeffh). Publishing standards vary, but the ideal candidates for these cases are the common no-break space (U+00A0) and the narrow no-break space (U+202F). Either one prevents a closing punctuation mark from getting bumped to the next line, especially in multiple-column layouts. Both are standard Unicode characters, so the protection travels with the text wherever it goes. The standard space (U+0020), thin space (U+2009), and hair space (U+200A) have varying widths, but don’t provide the necessary orphan protection.
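For anyone who wants to double-check the code points, here is a minimal Python sketch using the standard `unicodedata` module. It prints each character's official Unicode name and whether its decomposition carries the `<noBreak>` tag. The `SPACES` list and the `is_no_break` helper are names I made up for illustration; nothing here comes from Paratext.

```python
import unicodedata

# Space characters mentioned in this post (illustrative list, not exhaustive).
SPACES = ["\u0020", "\u00A0", "\u2009", "\u200A", "\u202F"]

def is_no_break(ch: str) -> bool:
    """True if Unicode tags the character's decomposition as <noBreak>."""
    return unicodedata.decomposition(ch).startswith("<noBreak>")

for ch in SPACES:
    name = unicodedata.name(ch)
    print(f"U+{ord(ch):04X}  {name:<22}  no-break: {is_no_break(ch)}")
```

Only U+00A0 and U+202F should come back as no-break. The thin and hair spaces differ in width but are ordinary break opportunities.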
Some tools choose to represent no-break spaces visually as a faint dot or a grey box. Paratext automatically replaces the no-break space (U+00A0) with a ginormous tilde (~, U+007E). In current versions, entering 00A0 and pressing ALT-X produces a simple space (U+0020) where a no-break space should be, which is outright wrong. The giant tilde is very distracting to our translators, so most of the time francophone translators have been instructed either to use simple spaces or not to put spaces before punctuation at all, with the understanding that the typesetter will standardize the spaces at the last minute following their decisions before printing. I am fully aware that the ginormous tildes are replaced with no-break spaces in preview mode, and can be replaced via Print Draft changes or ptxPrint. A creative techie could configure ptxPrint to insert spaces before such characters, but then they wouldn’t be seen in Paratext at all.
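To make the "standardize at the last minute" step concrete, here is a rough Python sketch of the kind of clean-up a typesetting-side script might do. It is not ptxPrint or Print Draft syntax, and `protect_french_spaces` is a hypothetical name. It assumes translators typed simple spaces next to the punctuation, and it only converts spaces that already exist, so references like 3:16 are left alone. Which width goes where is a house-style decision, so both are parameters.

```python
import re

NBSP = "\u00A0"   # no-break space
NNBSP = "\u202F"  # narrow no-break space

def protect_french_spaces(text: str,
                          before_marks: str = NNBSP,
                          inside_quotes: str = NBSP) -> str:
    """Turn plain spaces next to French two-part punctuation into no-break spaces."""
    # A plain space directly before ; ! ? : becomes a no-break space.
    text = re.sub(r" (?=[;!?:])", before_marks, text)
    # A plain space directly before a closing guillemet becomes a no-break space.
    text = re.sub(r" (?=»)", inside_quotes, text)
    # A plain space directly after an opening guillemet becomes a no-break space.
    text = re.sub(r"(?<=«) ", inside_quotes, text)
    return text

sample = "Il dit : « Viens ! »"
print(repr(protect_french_spaces(sample)))
```

The space between the colon and the opening guillemet stays a normal space, which is where a line is allowed to break.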
From Paratext Documentation:
Because of this, Paratext no longer supports the use of no-break space. If you want to use no-break spaces in your text, you should enter tilde in place of the no-break space. These can be converted to no-break spaces at typesetting. Tildes entered in the text are displayed as no-break spaces in the Preview view.
If this were something rare (as it is in English), I wouldn’t be concerned, but it comes up in nearly every paragraph of a francophone text. My concerns are that 1) the tildes are distracting for the translators who spend their time in standard view, and for those reading over their shoulders, and 2) this restriction doesn’t hold up in this age of digital publishing and sharing.
Technically,~one~can~get~by~and~publish~via~the~intervening~tildes. But the field staff’s usual response to the tildes, which is to ignore no-break spaces until typesetting, is no longer viable, because texts are no longer prepared only for printing by typesetters. Individual books are published locally, digital versions are put into the DBL and made available, and Scripture Apps are made. This means that having “final” spacing in the Paratext version is increasingly important, and Paratext still doesn’t “really” support this. (I just discovered today that the narrow no-break space (U+202F) has no visual indicator, but thankfully doesn’t get tildefied.) Inconsistent display/handling of U+202F, which should be narrower, is discussed here (Deletion of Unicode 202F ("narrow no break space") in project).
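For export paths that don't pass through a typesetter, the tilde convention means somebody has to do the conversion themselves. A minimal sketch in Python, with hypothetical helper names, assuming every tilde in the text really stands for a no-break space (the help text mentions that tildes may also be used "for some other function", in which case a blanket replace would be wrong):

```python
NBSP = "\u00A0"  # no-break space

def tildes_to_nbsp(text: str) -> str:
    """Replace each tilde with a real no-break space (U+00A0)."""
    return text.replace("~", NBSP)

def nbsp_to_tildes(text: str) -> str:
    """Inverse direction: show no-break spaces as tildes again."""
    return text.replace(NBSP, "~")

line = "Technically,~one~can~get~by"
converted = tildes_to_nbsp(line)
print(repr(converted))
```

The round trip is lossless only if the text contains no literal tildes to begin with.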
LibreOffice uses a grey box to distinguish no-break spaces from normal spaces. Word’s Show/Hide feature uses an open circle for non-breaking spaces and a centered dot for normal spaces.
Would Paratext devs consider scrapping the tilde in the interface and using something more legible and less distracting? Would this require a change to USFM, or just Paratext? Translators are already used to greyish metadata in their text like USFM markers. A faint grey dot · in standard view would be helpful to distinguish special spaces and normal spaces, right? Going with the grey squares instead would show both non-breakingness and length, which I suppose is why LibreOffice chose them. This would be a big win for cross-cultural compatibility. I hope someone else working in the francophone world can weigh in. @jeffh @dhigby @anon023887 ?
I understand from this post (Non-breaking space issues) that the tildefication was introduced to combat an Internet Explorer alternation problem. Even now, two consecutive spaces in a webpage require that at least one is transformed to NBSP. Yes, it is hard to visually distinguish spaces, but teams still have to standardize them.
~ Matthew_Lee
Language Technology Consultant
SIL Cameroon
To make things worse, the Paratext help treats no-break spaces as a scourge to be eradicated (see below).
Why do I see tildes in place of spaces in my text?
No-break spaces are characters that look like a space but which do not allow a line to break at that location. On opening a text with no-break spaces, if you find that those spaces appear to have been replaced with tilde characters (~), this is deliberate. Paratext makes no-break spaces visible by displaying them as tildes.
What is the least I need to know about this?
Earlier versions of Paratext occasionally and wrongly inserted no-break spaces where a normal space was needed. Therefore, if you see an occasional tilde in a place in the text which you are quite sure does not require a no-break space, you can just replace the tilde with a space.
What if I would like to take care of the problem all at once?
If you have NOT knowingly inserted any no-break spaces or tildes in your text and therefore want to remove all the no-break spaces and tildes, follow the instructions in Option 1. If you are not absolutely sure whether or not the tildes or no-break spaces were intentionally inserted, check with your CAP support person before removing all of them, otherwise you may have to manually re-enter them.
Option 1 (To get rid of all no-break spaces and tildes):
Click the tab of your project to make it the active tab.
From the Tools menu, point to Advanced and then select Replace No-Break Spaces With Normal Spaces.
Read the warning message and click Yes if you are sure you wish to continue.
If your project has been following the USFM manual and so has been manually inserting tildes either to represent no-break spaces or for some other function, follow the instructions in Option 2. Doing this sooner rather than later prevents Paratext from inserting any more occasional unwanted tildes.
Option 2 (To get rid of no-break spaces, but keep all tildes):
Click the tab of your project to make it the active tab.
From the Tools menu, point to Advanced and then select Replace No-Break Spaces With Normal Spaces But Keep Tildes.
Read the warning message and click Yes if you are sure you wish to continue.
See also:
Important information about no-break spaces and tildes
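Read as plain string operations, the two options in the help text quoted above come down to roughly the following. This Python sketch is only my reading of the described effect, not Paratext's actual implementation; the function names are made up, and I am assuming only U+00A0 is affected (I don't know how the tool treats U+202F).

```python
NBSP = "\u00A0"  # no-break space

def option_1_remove_all(text: str) -> str:
    """Option 1 as described: no-break spaces AND tildes become normal spaces."""
    return text.replace(NBSP, " ").replace("~", " ")

def option_2_keep_tildes(text: str) -> str:
    """Option 2 as described: no-break spaces become normal spaces, tildes stay."""
    return text.replace(NBSP, " ")

sample = "Quoi~? Viens\u00A0!"
print(repr(option_1_remove_all(sample)))
print(repr(option_2_keep_tildes(sample)))
```

Either way the non-breaking information stored as U+00A0 is lost, which is exactly the problem for a francophone text.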