+2 votes

Hi all, I’d like to discuss non-breaking (no-break) spaces (U+00A0) in the age of digital publishing.

I work in a country that is officially bilingual in French and English. In English-speaking countries, non-breaking and half spaces are rare. You might find a non-breaking space between the parts of 1·Chronicles.

In contrast, most of our translators from the French-speaking regions want to follow French spacing rules. In the French-speaking world, spaces are required before/around two-part punctuation marks (; ! : ? « »). No-break spaces are often used as a thousands separator in large numbers, like 1·000 (Number Settings - Thousands Separator gets reset - #3 by jeffh). Publishing standards vary, but the ideal candidates for this case are the common no-break space (U+00A0) and the narrow no-break space (U+202F). Either one prevents a closing punctuation mark from getting bumped to the next line, especially in multiple-column layouts. These spaces keep text flowing across the world’s computers. The standard space (U+0020), thin space (U+2009), and hair space (U+200A) have varying widths, but don’t provide the necessary orphan protection.
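For anyone who wants to check which of these spaces actually carry no-break behaviour, here is a minimal sketch using Python's standard `unicodedata` module. It relies on the Unicode Character Database tagging no-break characters with a `<noBreak>` compatibility decomposition. The helper names `is_no_break` and `describe` are my own for illustration, not part of Paratext or any tool mentioned here.

```python
import unicodedata

# The five spaces discussed above: space, NBSP, thin, hair, NNBSP.
SPACES = ["\u0020", "\u00A0", "\u2009", "\u200A", "\u202F"]

def is_no_break(ch: str) -> bool:
    """True if the Unicode database tags the character as a <noBreak> variant."""
    return unicodedata.decomposition(ch).startswith("<noBreak>")

def describe(ch: str) -> str:
    """One-line summary: code point, official name, no-break flag."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} no-break={is_no_break(ch)}"

for ch in SPACES:
    print(describe(ch))
```

Only U+00A0 and U+202F come back as no-break, which matches the point that thin and hair spaces vary in width but do not protect against a line break.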

Some tools choose to represent no-break spaces visually as a faint dot or a grey box. Paratext automatically replaces no-break space (U+00A0) with a ginormous tilde (~, U+007E). In current versions, entering 00A0 and pressing ALT-X replaces what should be a no-break space with simple (0020) spaces, which is outright wrong. This giant tilde is very distracting to our translators, and so most of the time, francophone translators have been instructed either to use simple spaces or not to use spaces before punctuation, with the understanding that the typesetter will standardize the spaces at the last minute, following their decisions, before printing. I am fully aware that the ginormous tildes are replaced with no-break spaces in preview mode, and can be replaced via Print Draft changes or PTXprint. A creative techie could configure PTXprint to insert spaces before such characters, but then they wouldn’t be seen in Paratext at all.

From Paratext Documentation:

Because of this, Paratext no longer supports the use of no-break space. If you want to use no-break spaces in your text, you should enter tilde in place of the no-break space. These can be converted to no-break spaces at typesetting. Tildes entered in the text are displayed as no-break spaces in the Preview view.

If this was something rare (as it is in English), I wouldn’t be concerned, but it comes up nearly every paragraph of a francophone text. My concern is that 1) the tildes are distracting for the translators who spend their time in standard view, and for those reading over their shoulders and 2) that this restriction doesn’t hold up in this age of digital publishing and sharing.

Technically,~one~can~get~by~and~publish~via~the~intervening~tildes. But the field staff’s response to the tildes (ignoring no-break spaces until typesetting) is no longer viable, because texts are no longer prepared only for printing by typesetters. Individual books are published locally, digital versions are put into the DBL and made available, and Scripture Apps are made. This means that having “final” spacing in the Paratext version is increasingly important, and Paratext still doesn’t “really” support this. (I just discovered today that narrow no-break space (U+202F) has no visual indicator, but thankfully doesn’t get tildefied.) Inconsistent display/handling of U+202F, which should be narrower, is discussed here (Deletion of Unicode 202F ("narrow no break space") in project).

LibreOffice uses a grey box to distinguish no-break spaces from normal spaces. Word’s Show/Hide feature uses an open circle for non breaking spaces and a centered dot for normal spaces.

Would Paratext devs consider scrapping the tilde in the interface and using something more legible and less distracting? Would this require a change to USFM, or just Paratext? Translators are already used to greyish metadata in their text like USFM markers. A faint grey dot · in standard view would be helpful to distinguish special spaces and normal spaces, right? Going with the grey squares instead would show both non-breakingness and length, which I suppose is why LibreOffice chose them. This would be a big win for cross-cultural compatibility, I hope someone else working in the francophone world can weigh in. @jeffh @dhigby @anon023887 ?

I understand from this post (Non-breaking space issues) that the tildefication was introduced to combat an Internet Explorer alternation problem. Even now, two consecutive spaces in a webpage require that at least one is transformed to NBSP. Yes, it is hard to visually distinguish spaces, but teams still have to standardize them.
~ Matthew_Lee
Language Technology Consultant
SIL Cameroon

To make things worse, the Paratext help treats no-break spaces as a scourge to be eradicated (see below).

Why do I see tildes in place of spaces in my text?
No-break spaces are characters that look like a space but which do not allow a line to break at that location. On opening a text with no-break spaces, if you find that those spaces appear to have been replaced with tilde characters (~), this is deliberate. Paratext makes no-break spaces visible by displaying them as tildes.
What is the least I need to know about this?
Earlier versions of Paratext occasionally and wrongly inserted no-break spaces where a normal space was needed. Therefore, if you see an occasional tilde in a place in the text which you are quite sure does not require a no-break space, you can just replace the tilde with a space.
What if I would like to take care of the problem all at once?
If you have NOT knowingly inserted any no-break spaces or tildes in your text and therefore want to remove all the no-break spaces and tildes, follow the instructions in Option 1. If you are not absolutely sure whether or not the tildes or no-break spaces were intentionally inserted, check with your CAP support person before removing all of them, otherwise you may have to manually re-enter them.

Option 1 (To get rid of all no-break spaces and tildes):

  1. Click the tab of your project to make it the active tab.
  2. From the Tools menu, point to Advanced and then select Replace No-Break Spaces With Normal Spaces.
  3. Read the warning message and click Yes if you are sure you wish to continue.

If your project has been following the USFM manual and so has been manually inserting tildes either to represent no-break spaces or for some other function, follow the instructions in Option 2. Doing this sooner rather than later prevents Paratext from inserting any more occasional unwanted tildes.

Option 2 (To get rid of no-break spaces, but keep all tildes):

  1. Click the tab of your project to make it the active tab.
  2. From the Tools menu, point to Advanced and then select Replace No-Break Spaces With Normal Spaces But Keep Tildes.
  3. Read the warning message and click Yes if you are sure you wish to continue.

See also:

Important information about no-break spaces and tildes
Paratext by (231 points)

11 Answers

+1 vote
Best answer

Yes, I agree with Matthew_Lee that this is an important issue, especially in the French-speaking world. There are a number of things I want to mention in my analysis, but I will try to summarize (TLDR) at the bottom of this post.

A little online research shows some interesting things, which are not really “beside the point”:

image

And some humorous failures (where they obviously were using a normal space - that broke in this case):

image

That’s exactly what we want to avoid - bits of punctuation not connected to their associated text. So if we are going to use some sort of space character to set off punctuation, we must ALWAYS use a non-breaking space of some sort.

The main two options are the full No-Break Space (NBSP, U+00A0), or the Narrow No-Break Space (NNBSP, U+202F), whose definitions can be found in the Unicode standard, at https://unicode.org/charts/PDF/U0090.pdf and https://unicode.org/charts/PDF/U2000.pdf, respectively:

image
image

As you can see, the definition of the NNBSP says it is typically the width of a thin space, which is defined in that same chart as:

image

So a NNBSP would typically be a fifth of an em (0.2em). How big is a normal space or a NBSP? These metrics are font-dependent, but a rough calculation with the Charis SIL font shows that the space and NBSP characters are about 0.34em. The NNBSP is about 0.22em. This is a significant difference, and if you use a NBSP (or as a temporary measure a regular space, which has the disadvantage of breaking across lines) around punctuation, the typesetters I know will say that that space is too large. Using the NNBSP helps significantly, and can be done fairly easily in PTXprint with changes like the following lines in PrintDraftChanges.txt:

' *:'  >    '\u202f:'     # Place non-breaking thin space before colon
'« *' >    '«\u202f'     # Place non-breaking thin space after opening guillemets
' *»' >    '\u202f»'     # Place non-breaking thin space before closing guillemets
'‹ *' >    '‹\u202f'     # Place non-breaking thin space after opening single guillemet
' *›' >    '\u202f›'     # Place non-breaking thin space before closing single guillemet

This puts a NNBSP before or after (as necessary) the punctuation, and also removes any spaces that are there (if any). That means that whether or not the team puts in spaces, they will be normalized to NNBSP characters. E.g. in this project the team is inconsistent and uses (regular) spaces around question marks and colons, but not around quote marks (guillemets):

Note that you can see that these are just regular spaces if you adjust the zoom and/or pane size just right, as they will allow a break across a line, like this:

But the changes above should be able to handle both those cases OK, and insert the NNBSP for the typesetting.
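For teams that want to try the same normalization outside PTXprint (for example in a pre-export script), here is a rough Python sketch of the idea behind those rules. It is not the PTXprint implementation. The function name `normalize_french_spacing` is made up for illustration, and extending the rule to ; ! ? is my assumption, since the rules above only cover the colon and guillemets.

```python
import re

NNBSP = "\u202F"  # narrow no-break space

# Any run (including an empty one) of plain, no-break or narrow no-break spaces.
ANY_SPACES = "[ \u00A0\u202F]*"

def normalize_french_spacing(text: str) -> str:
    """Force exactly one NNBSP before : ; ! ? » › and after « ‹."""
    # Before closing marks: drop whatever spaces are there, insert one NNBSP.
    text = re.sub(ANY_SPACES + "([:;!?»›])", NNBSP + r"\1", text)
    # After opening guillemets: same treatment on the other side.
    text = re.sub("([«‹])" + ANY_SPACES, r"\1" + NNBSP, text)
    return text

sample = "Quoi ? «Oui» : non"
print(normalize_french_spacing(sample))
```

Like the `' *:'` rule above, this also touches colons that are not sentence punctuation (chapter:verse references, for instance), so a real project would need to exclude those cases first. Running the function twice gives the same result as running it once, which is handy when the text passes through more than one tool.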

In a similar way, you would want to put change rules in your SAB projects, to make sure that your Scripture apps handle the spaces properly. Check out this post for sample rules: https://community.scripture.software.sil.org/t/suggestions-for-changes-gallery/590/3.

Note that the rules in this post do not handle the space or no-space as elegantly as the rules above, but you can adjust them with tricks like the " *" used above.

And one further point before we get to Paratext… In recent typesetting jobs we have actually used one tenth of an em (0.1em) as space around the punctuation, i.e. smaller than NNBSP. Here is the punctuation definition we used:

\catcode`\:=\active \def:{\unskip\kern0.1em\char`\:{}} % colon

Note: this was done in XeTeX, but the same could be done with PTXprint. I believe you would want it defined in the ptxprint-mods.tex configuration file available on the Advanced tool tab. This gives a fairly minimal space around the punctuation, as seen in this sample:

But the teams have felt that that is sufficient space to meet their felt need of space around the punctuation that is required in French. (Of course, the French may disagree, but it’s not their language!)

Conclusion (TLDR): So what does this mean for Paratext?

If the team uses regular spaces in the text to offset their punctuation, then sometimes it will appear incorrectly on their screen in Paratext (i.e. with punctuation not properly attached to its text, as shown above), which is distracting but not the end of the world. In this case, the onus is on the typesetter or app builder to change those regular spaces appropriately. Unfortunately, if this is the form that is put into the DBL (highly likely), then apps like YouVersion are going to have problems, because they notoriously DON’T handle those spaces appropriately.

Given this tendency to smaller and smaller no-break spaces to set off the punctuation (first the NNBSP at 0.2em, then manually typesetting at 0.1em with PTXprint) that I’ve seen in my typesetting projects, I almost always recommend that teams put NO spaces around their punctuation in Paratext, and then just trust the typesetting or app building to do the right thing around those punctuation marks. This means that when the text is put into the DBL, YouVersion is not going to have any hanging punctuation. (It won’t have spaces around the punctuation either, but that is a lesser problem IMHO.)

So with this specific plan of action, no changes are required in Paratext. If you wanted, as Matthew_Lee suggested, a way to show NNBSP or NBSP characters, I think that would be a good idea, but our keyboards would also need a way to type those characters (which they don’t always), and Paratext would need to know not to mess with those characters. (And punctuation inventories would need to show all of the combinations with those spaces, to make sure that they were being used consistently, e.g. always with a NNBSP.)

This post isn’t so much proposing solutions as providing more background and information. I really don’t like the way this tilde / NBSP stuff works in Paratext now, and agree that it should change. It seems like Paratext should assume that it should take every character in the text at face value, whether it is a tilde, NBSP or NNBSP. And a way to see them (subtly) would be nice. Should two or more spaces be automatically combined (responding to @anon942452)? Maybe if they are identical characters? That would still allow the automated Paratext spacing fix, but also provide some options for getting around it. And one would also need to come up with a way to deal with all of the legacy projects that have tildes for non-breaking spaces, maybe just a conversion, to convert them all to NBSP, once that’s handled properly in Paratext.

Anyway, some more food for thought…

by (1.3k points)

Some national languages in Southeast Asia use a space between phrases and not between words. Some minority languages using these scripts have chosen to use a normal space between each word and a wider space at phrase breaks. If an EM SPACE (\u2003) is used for these wide-space phrase breaks in Paratext, then the Character and Punctuation Inventories treat the EM SPACE as a word forming character rather than punctuation, making it impossible to check for correct sequences. I’ve reported this as PTXS-31753.

When jeffh recommends that translators leave the French spaces out of Paratext entirely he is helping to ensure that the text in Paratext unambiguously marks the structure/meaning at the cost of an uglier presentation. I do the same when I recommend the use of comma instead of EM SPACE within Paratext. But users are right to push back against both suggestions; I too greatly prefer the WYSIWYG in Microsoft Word to the WordStar dot commands I used on my first computer.

I’d love to see Paratext add a “Show Invisible Characters” option which would, e.g., show space characters as a grey box. This “Show Invisible Characters” would be immensely beneficial to the languages which type a Zero Width Space (\u200b) between each word. Currently Paratext recommends that a slash (/) be typed between each word. This slash is then (hideously) visible in all but the Preview views.

I wonder if Francophone areas could find a font which either (1) makes ~ much less obtrusive or (2) automatically adjusts space around punctuation according to context.

Blessings,
LivingField

It can sometimes be a disservice for Paratext to accommodate customization requests. This is especially true when the customization is not supported in other software or when it allows choices that go against the direction the commercial software is going. Language communities then make choices that are dead ends for their future development outside of Paratext. (Of course it has also been very helpful in other situations, the issues are just complex and need careful thought.)

But in this conversation, we are talking about accommodation for choices currently available in commercial software as well as a tool already commonly available. If presented this way in a feature request, I think we could get somewhere. Perhaps those most knowledgeable about it could have a separate conversation off-list about the best way forward to present the feature request and what is actually most needed.

Blessings,

0 votes

Thank you Matthew_Lee, for your helpful and informative email. I don’t have any solutions to offer but am very interested in the topic and learned more from your thoroughness. I struggle especially with the RTL and LTR invisible characters which impact our complex scripts. If there were a way to make all the invisible characters slightly visible (or turn visibility on and off with a ctrl key perhaps), that could help to more easily solve some sticky wickets we find ourselves in. That approach might also make the normal non-breakable space viable for use in Paratext though I have no idea of the other possible barriers to its use on the programming side.

Blessings,

by (1.3k points)
+1 vote

Dear Matthew_Lee, thanks for bringing this up. I work in North American First Nations languages that use a non-roman script (Canadian Syllabics) and several major orthographies that use this script make use of various widths (three widths) of white space to indicate morpheme and word boundaries. The normal word space 0020 is considerably wider in the preferred syllabic font, which behaves fine in Paratext. But the narrow-no-break-space (U+202F) is 1/3 the width of the normal word space. The use of this space is critical in our language as well–it needs to be used within words as a morpheme boundary and also (as is done in French) to separate the punctuation from the end of sentences. Finally, in many situations a third width of non-breaking space is necessary. For years the language communities that we have worked with have used two narrow no-break-spaces (U+202F U+202F) in sequence to provide a non-breaking space between prefixes and word stems, preventing orphans at the end of lines and providing a visual cue as to the start of the stem. The width of these works out to 2/3 a standard word space.

Unfortunately, since Paratext 7, there is a Paratext algorithm that deletes any two identical white space characters in sequence and replaces them with one. We had to come up with a kludgy workaround: a Keyman keyboard that inserts a zero-width joiner (U+200D) between the two, to prevent Paratext from replacing our intentional double narrow space (U+202F U+202F) with just one.

When exporting our scripture to the DBL, this sequence was unacceptable, so we must run a conversion program first that replaces all the sequences (U+202F U+200D U+202F) with a “standard” non-breaking space (in Paratext, “tilde”, which becomes U+00A0).
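To make the conversion step concrete, here is a minimal Python sketch of what such a program could look like. This is not the actual conversion program described above. The function name `widen_for_export` and the choice to make the replacement a parameter are illustrative assumptions.

```python
NNBSP = "\u202F"  # narrow no-break space
ZWJ = "\u200D"    # the U+200D character the keyboard inserts between the two
NBSP = "\u00A0"   # standard no-break space

def widen_for_export(text: str, replacement: str = NBSP) -> str:
    """Replace each NNBSP + U+200D + NNBSP sequence with a single character."""
    return text.replace(NNBSP + ZWJ + NNBSP, replacement)

# Inside Paratext the replacement would be the tilde convention instead:
# widen_for_export(text, replacement="~")
```

A lone U+202F (the morpheme boundary or the space before punctuation) is left untouched, so only the intentionally doubled spaces are converted.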

Anyway, all that to say that I support your topic of revisiting the tilde as non-breaking space, because of the reasons you give, and I wanted to weigh in with a script that uses three distinct widths of white space.

Sincerely, anon942452 J

by (106 points)

Modern web technologies tend to compress duplicate spaces without asking, but USUALLY allow alternation between multiple types of spacing. For this reason, web designers have long abused spacing by alternating space and nbsp instead of setting indents. @anon942452 has found a method that works to override the automated garbage cleanup in the same way.

If I understood correctly, I agree with @Shegnada that allowing users to do things in Paratext that already work in corporate tools should be low-risk, but creating new custom workflows that will only ever work in Paratext sets the community up for literacy and publishing challenges later. I’ve seen this with people boxing themselves into hacked fonts and old macros.

I hope that Paratext can learn to support all of the spaces, so that the Paratext project can be the gold standard.

The first challenge is to accept the spaces and allow them to pass through to DBL and digital/print publishing. Maybe I’m an optimist, but mobile apps, InDesign, and HTML shouldn’t be a problem as these are Unicode glyphs in the suggested fonts. TeX (PTXprint) will require a minor bit of preprocessing, but tools exist in TeX to manage this. If downstream tools like YouVersion need to learn to use advanced spaces/breaks, that’s a discussion worth having.

The second challenge is making it easier to work with advanced spacing in Paratext. I would LOVE to see grey squares for the non-breaking spaces. The punctuation tool might just work as it already shows Unicode values for combinations.

0 votes

I think as an initial proposal, we could ask Paratext to add in the Project View menu a “Show hidden formatting” option. When you do that in Word, you get the following for a series of three spaces, three NBSPs and three NNBSPs (202F):
image
In LibreOffice Writer you get:
image

LO Writer doesn’t show NBSPs, and neither of them shows NNBSPs. To be the gold standard, we would want to do that. But you also don’t want to have a different symbol for each possible hidden formatting, so would we move to showing the character code, except for a couple of major characters like space and NBSP (which would have a symbol), maybe in a small diagonal pattern? How about something like this:

I think it’s helpful to show hidden formatting in a different color. Matthew_Lee suggests gray, LO Writer uses blue, Word just keeps using black. I also like the gray idea, but the trick will be getting the right shade of gray, so that it is visible but subtle.

Obviously if we display a character code, then all bets are off for the actual width of the character. That’s also the case for hidden formatting displayed in Word or LO Writer.

As for the Paratext “simplification” of spaces, I would propose that Paratext continue to condense multiple spaces into a single space, but ONLY for actual spaces U+0020. Any other spaces or hidden formatting characters would be maintained.
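As a sketch of that proposal (not of Paratext's actual code, which I haven't seen), the rule could be as small as this in Python. The function name `collapse_plain_spaces` is made up for the example.

```python
import re

def collapse_plain_spaces(text: str) -> str:
    """Collapse runs of U+0020 into one space; leave every other space alone."""
    return re.sub(" {2,}", " ", text)

print(repr(collapse_plain_spaces("a  b\u202F\u202Fc")))
```

With this rule, two typed spaces still get cleaned up, but a deliberate U+202F U+202F pair survives without needing a joiner character wedged in between.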

Eventually we would probably want to have some shortcuts for TYPING those hidden formatting characters directly in Paratext as well, but for now, we can count on AUTOCORRECT.TXT and/or Keyman keyboards for typing those characters.

OK, so that’s one idea, thrown out into the arena… What are the pros and cons? What other ideas do you have?

by (1.3k points)

This is what I meant by LibreOffice (6) showing NBSPs as I remembered: This is not even in show all characters mode, just a normal view. I believe this is the default. Is it not the same in LO 7?

image

You are correct that it doesn’t show NNBSPs (even in show all mode), but we do get handy countable dots for NBSP and space.

image

jeffh’s diagonal proposal is elegant, but we’d need a font with those letters. There are existing fonts that use an alphanumeric square to show the Unicode values of characters.

image

I do fear that allowing duplicate non-normal spaces will result in multi-space indentation, just as people already abuse the possibility in Word, but it would line up with how web standards flow text.

I’ve had NBSP on my keyboard for years, but I also have things like dagger, copyright, and empty circle.

In my LO Writer 7 configuration, the Non-breaking spaces option under Tools - Options - LibreOffice Writer - Formatting Aids was turned off. With it turned on I do get the gray square you were talking about:
image
But not a dot…

+1 vote

This is a very encouraging conversation. It’s great to hear about the needs and potential solutions in the context of Paratext and other tools. I expect this is something which would be “useful for many” and as such likely to be considered ahead of some less useful features.

(As a side note - because I think I saw a few mentions of how to keyboard some characters which don’t appear on standard keyboards … For those who don’t already know, keyboarding unusual characters without a third party application can be made easier by using the Character Map application in Windows. When you click on a character in the character map, it shows a “Keystroke” shortcut in the bottom right of the application for many characters. That shortcut can be keyboarded by holding Alt and typing four numbers from the numpad (not the numerals above the letters). For example: Alt+0160 types a no-break space (U+00A0), Alt+0169 generates the © symbol, en-dash is Alt+0150 –, while em-dash is Alt+0151 —.)
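For the curious, those four-digit Alt codes with a leading zero are looked up in the system's ANSI code page rather than in Unicode directly. The small Python sketch below assumes a Western European Windows setup where that code page is Windows-1252; other locales will map the same numbers differently. The helper name `alt_code_char` is my own.

```python
def alt_code_char(code: int) -> str:
    """Character produced by Alt+0<code>, assuming the Windows-1252 code page."""
    return bytes([code]).decode("cp1252")

for code in (160, 169, 150, 151):
    ch = alt_code_char(code)
    print(f"Alt+0{code} -> U+{ord(ch):04X}")
```

This is why Alt+0160 gives U+00A0 directly, while Alt+0150 and Alt+0151 land on U+2013 and U+2014: the byte values 150 and 151 are Windows-1252 positions, not Unicode code points.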

by [Moderator]
(1.1k points)

0 votes

Just one comment in regard to digital publishing through the DBL. The Paratext uploader strips out no-break spaces when creating the USX bundle that we share with publishers. Unfortunately, when a text is being shared digitally, no-break spaces have historically caused issues.

by (192 points)

And what about NOT using no-break spaces?! Here is a random page from the Parole de Vie, a well-respected French translation as viewed in YouVersion, a well-respected Bible app:

Note the broken (problem) punctuation breaks that are highlighted. I cringe every time I see that, and I see it a LOT on texts pulled out of the DBL precisely (I imagine) because valid no-break spaces have been converted to regular spaces.

If we handle no-break spaces well in Paratext, then I would think that the uploader shouldn’t be stripping them out when uploading to the DBL. So let’s do it!

I agree that they “have historically” caused issues. This would have been true across the board pre-Unicode, but if content-producers and downstream publishers still haven’t learned to support special spaces, it’s about time they did.

Stripping out NBSP nowadays is a mistake, as the display technologies that we use all have a process for handling properly-encoded spaces (HTML, XML, TeX, InDesign), and these spaces are part of the style guides for many of the world’s majority and minority languages. Any code that is under Paratext has the possibility of being changed, including internal display and USX export. TeX and SAB already handle all these spaces, or else the print-draft-changes hack wouldn’t work in print draft or PTXprint (maybe someone from PTXprint can weigh in). If the USFM and USX standards currently disallow no-break spaces, they will need to be amended.

I knew that this change would need to be made to the whole pipeline, but that doesn’t make it less important. Paratext, Chorus, USFM, USX, and more will need to stop stripping them and start supporting and displaying them. I suspect that the main edge cases will be if someone chose to replace EVERY space with NBSP and overflowed a line.

Here in Cameroon, I still talk about the IPA characters in the language as “special characters”, but with the wide support we have, one of the linguists here recently reminded me that we should just be calling them “characters”. The tools that can’t support a wide variety of characters in text are becoming few and far between. The final frontier seems to be supporting special characters in folder names for command-line Windows software. Windows has supported this for years, but things still get mangled.

~Matthew_Lee

Hi jeffh,

I’ve sent your concern to our friends at YouVersion…although I believe the process to convert no-break spaces to regular spaces is done in the Paratext uploader, not on YouVersion (or any other publisher’s) end.

Thanks for connecting with the YouVersion folks @anon175865. Yes, as you mentioned, I imagine that any no-break spaces are already gone in the DBL, stripped by the Paratext uploader. So it’s not their fault. What @Matthew_Lee and I are saying though is that we need to fix our pipeline, so that Paratext is comfortable with and can easily handle these special spaces, and the uploader will not strip them out. Then they will be there in the DBL, and when YouVersion uses those texts, they will appear properly on the screen.

+1 vote

WSTech discussed some of the issues in this thread, and I wanted to send a summary:

  • I suspect that the very large tilde that is being shown (for NBSP) is from the Charis SIL font. If a different Latin script font is used, does the size of the tilde change?
  • There are fonts that automatically adjust spacing around punctuation marks (as needed in Francophone areas) but seem to be rare. So putting in the needed spaces seems the best approach.
  • Using a separate font to show Unicode values of spaces (that is, not the main font used for the text) should work.
  • PTXprint can handle all the various space characters.
by (185 points)

WSTech has also recently adjusted the width of spaces in our fonts. For backwards compatibility, the Latin script fonts (and maybe some others) do not follow our new recommendations for all spaces, only some spaces.

I’m encouraged to have heard from so many people, including WSTech and Paratext. How do we move forward on this, does it need to be written up as a feature request and go through the normal prioritization process?

The big issues (that can be handled separately) are:

  1. Allow NBSP and similar characters mentioned in this thread to exist throughout the PTX/USFM/USX/DBL pipeline (and deal with the display issues that arise as needed). This is the first and most important hurdle. Then we can “fix” the spacing in individual projects moving forward. These characters need to:
    • Display properly in PTX standard view and previews.
    • Be recognized individually in Punctuation and Character checks.
    • Be accepted in Quotation and Number settings (note that FLEx uses dots to show spaces in Configure Dictionary).
  2. Provide an in-Paratext way to visualize these characters.
    • Special temporary fonts have been suggested as a possibility.
    • Always-on grey squares have been suggested.
    • A show-all characters feature similar to Word/LibreOffice.
      • One of my users suggested that the Show/hide feature open a dialog (similar to the Basic Checks dialog) allowing you to specify which special characters to show (normal spaces, NB spaces, joiners, non-breaking hyphens, bidirectional markers, soft and hard returns). I can imagine cases where a technician wants all markers highlighted (which I often do in external tools), as well as situations where the team only needs “special” markers highlighted.
      • image
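To give a feel for the visualization idea, here is a small Python mock-up of a selectable "show hidden characters" view. It is purely illustrative: the symbols (a middle dot for space, a degree sign for NBSP), the function name `reveal`, and the `show_plain_spaces` switch are all my own assumptions, and anything else in the Unicode space-separator or format categories is shown by its code point in brackets.

```python
import unicodedata

# Hypothetical symbols for the two most common spaces.
SYMBOLS = {"\u0020": "·", "\u00A0": "°"}

def reveal(text: str, show_plain_spaces: bool = True) -> str:
    """Return text with invisible characters replaced by visible stand-ins."""
    out = []
    for ch in text:
        if ch == " " and not show_plain_spaces:
            out.append(ch)                    # leave ordinary spaces as they are
        elif ch in SYMBOLS:
            out.append(SYMBOLS[ch])           # dedicated symbol
        elif unicodedata.category(ch) in ("Zs", "Cf"):
            out.append(f"[{ord(ch):04X}]")    # other spaces, joiners, bidi marks
        else:
            out.append(ch)
    return "".join(out)

print(reveal("Quoi\u202F? 1\u00A0Rois"))
print(reveal("Quoi\u202F? 1\u00A0Rois", show_plain_spaces=False))
```

The second call is the "only special markers" case: ordinary spaces stay invisible while NBSP, NNBSP, joiners and direction marks are highlighted. Non-breaking hyphens and hard returns fall outside these two categories and would need their own entries.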

~Matthew_Lee

Yes, that would be the best option. Linking this thread in any feature request would be helpful too.

For anyone unfamiliar with the process of making a feature request - it’s available to all Paratext users. From the main menu of Paratext, select Help > Give feedback and select the Make a suggestion... option in the form which appears.

If anyone feels a feature request is especially important or useful, it’s worth making your area or organisational representative for Paratext aware of this. They may choose to present it in the quarterly Paratext prioritization meetings.

+1 vote

This has now been submitted as a feature request. This thread was referenced in the report.

Relevant reports:
https://paratext.myjetbrains.com/youtrack/issue/PTX-22626
https://paratext.myjetbrains.com/youtrack/issue/PTUX-1318
https://paratext.myjetbrains.com/youtrack/issue/PTX-22623

by (231 points)

Another comment I added inside YouTrack:

Invisible characters should be listed in Character and Punctuation inventories, which will be our best indication that they do exist and in which contexts they exist. They also need to be allowed as valid separators in Quotation settings ([NBSP]»), Scripture reference settings (1[NBSP]Kings), and Number settings (10[NBSP]000). This would cover many use cases.
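As a rough illustration of what such an inventory could report, here is a Python sketch that counts each invisible character together with its neighbours. It is not how Paratext's inventories are implemented. The function name `invisible_inventory` and the choice of one character of context on each side are assumptions for the example.

```python
from collections import Counter
import unicodedata

def invisible_inventory(text: str) -> Counter:
    """Count (before, code point, after) for each non-plain space or format char."""
    counts = Counter()
    for i, ch in enumerate(text):
        if ch != " " and unicodedata.category(ch) in ("Zs", "Cf"):
            before = text[i - 1] if i > 0 else ""
            after = text[i + 1] if i + 1 < len(text) else ""
            counts[(before, f"U+{ord(ch):04X}", after)] += 1
    return counts

sample = "1\u00A0Kings 10\u00A0000 «\u202Foui\u202F»"
for context, n in invisible_inventory(sample).items():
    print(context, n)
```

A listing like this makes it easy to spot a project that uses NBSP before some closing guillemets and NNBSP before others.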

0 votes

I am late here, just discovered the subject.

I give my +1 or rather my +100000 for those proposed features. I work for another language, where for historical and co-existing reasons the orthography tries to be as close to French as possible.

I would remind everybody to read up on French before making any technical changes. French typography is even more complex than lay people know. Yes, there are spaces around certain punctuation characters, but they are not all the same. The space in front (left) of a colon should be full-size, for example.

I have a document, collected from several sources and the main source is sadly no longer online.

There are helpful paper books available too, like “Lexique des règles typographiques en usage à l’Imprimerie nationale” by the Imprimerie nationale (France) and “Règles de l’écriture typographique du français à l’usage des personnes qui exercent une activité sur MAC ou PC” by Yves Perrousseaux.

by (855 points)
0 votes

So far we are fully using and embracing the tilde in PT, like other people cope with XeTeX and have my admiration — but not my desire to be like them.

The logic of frequent publishing by app and by scripture-portion is also relevant for our context. And the tilde must go soon.

Also, more fonts need to provide the narrow no-break space. It seems not even all SIL fonts do. Please yell at me if they do; that would be good news to me, well worth a yelling.

For entering tilde now (NNBSP hopefully soon), I have set up a nifty shortcut in PT using the built-in autocorrect.txt feature. For example, to get tilde plus an exclamation mark, I hit the exclamation mark three times and PT does the rest in the correct way. I have this set up for all the punctuation marks that need special care.

(And I also use this to enter frequent words like Abraham, Jesus or Jerusalem.)

by (855 points)