We are trying to import some pre-Unicode Scripture files into Paratext and are having trouble finding the “correct” Unicode character for some of the representations that the original typists chose. Unfortunately, some of them are fairly non-standard.

Here’s an example of the text as it currently is:

\v 16 No hua O:sa niyala: Dyisasi Kalasila: sa ma: he: /yomela: gele sa, gya dimo: a:la:we:. A:da:ka: ni ko:bo:xa homa:we: sa ma eiya:we: sa ma: aba/ale:. Ni hua hyala: hixitixinowa ke:lamo: niyala: ma: da:la:we:.
\v 17 Da: hixitixi God, hi hyala: ko/o:me:ya: ma: ko:la: hyala: sikasikale:la:me:ya:. A: gele ba: Da: hixitixinowa a:la:, “Yoba:yo kixe nowala:. No hya dimo: hixitiximo: atae:yahya ma: ho: hyala: du:lu pa:la:xa hya.”
\v 18 Ni Kalasi ma: o: /yaso:mo: halo ba: hya:ka: ni a: sa God he: to:lo:me:ya: yo dae:ya:we:.

Note that I have not converted anything to Unicode yet, so the text still has:
-Underlines applied by the font instead of Unicode macrons.
-A regular punctuation colon (U+003A).
-A regular forward slash (U+002F), which can occur at least word-initially and word-medially.

My questions are regarding those characters:

Underlined vowels are nasal. da:lixa. He: yo huanowa hyala: ma: da:lixa.
-We were thinking of using U+0331 (Combining Macron Below) for this. I also saw U+0320 (Combining Minus Sign Below), but that doesn’t seem like the right option.

Vowels with a colon are a different vowel than vowels without a colon.
-It seems that U+A789 (Modifier Letter Colon) is the best option for this.

The forward slash is a glottal mark.
-I looked at U+0338 (Combining Long Solidus Overlay), but that seems to be a diacritic rather than an actual “letter”. I’m looking for an actual word-forming character in Unicode.

Any ideas?

Thanks,

SIL+LSS+PNG

SIL+LSS+PNG
Language Technology Consultant
SIL PNG Language Resources
Ukarumpa EHP 444 | Papua New Guinea

3 Answers

Best answer

U+0331 (combining macron below) seems reasonable for replacing an
underline. Other options, depending on how close you need to stay to the
existing representation, might be U+0330 (combining tilde below), or
even U+0304 combining macron [above] or U+0303 combining tilde [above].

U+A789 (modifier letter colon) is definitely better than using a colon.
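
As a rough illustration of the colon replacement, here is a minimal Python sketch. It assumes that a colon belongs to the vowel only when it directly follows a vowel letter; the function name and the vowel list are hypothetical, and a genuine punctuation colon after a vowel-final word would be converted too, so the output needs checking by hand.

```python
import re

MODIFIER_COLON = "\uA789"  # MODIFIER LETTER COLON

# Assumption: only a colon that immediately follows a vowel letter is part
# of the vowel. Colons after digits (e.g. "3:16") are left untouched.
VOWEL_COLON = re.compile(r"(?<=[aeiouAEIOU]):")


def convert_vowel_colons(text: str) -> str:
    """Replace vowel-marking colons (U+003A) with U+A789."""
    return VOWEL_COLON.sub(MODIFIER_COLON, text)


sample = r"\v 18 Ni Kalasi ma: o: /yaso:mo: halo ba: hya:ka:"
print(convert_vowel_colons(sample))
```

The underlines cannot be handled this way, because they live in the font formatting rather than in the text itself.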

For discussion of the glottal, check out “Options for Representing
Glottal Stops in Orthographies”
(http://scriptsource.org/cms/scripts/page.php?item_id=entry_detail&uid=hgdskm8ld2).

If the orthography is under discussion, you may want to consult with
Mike Cahill, Orthography Services Coordinator.

anon806807


Thank you for your reply rowbory, I appreciate hearing from you.

Unfortunately, this orthography does not seem to be under discussion right now. It’s a few decades old. So the forward slash / glottal is really my biggest issue remaining.

SIL+LSS+PNG

Do you think they might accept the uppercase "saltillo" Ꞌ (U+A78B, which is larger than the lowercase version ꞌ) as similar enough to the forward slash to make the transition? I think that / might interfere with certain formatting syntax or other computer commands.

Both of these have even larger variant glyphs in the SIL fonts. But if there is also an ‘Ll’ in the language I think there might be some confusion.

Just thinking…
KimB


SIL+LSS+PNG,

The forward slash is a glottal mark.
-I looked at U+0338 (Combining Long Solidus Overlay), but that seems
to be a diacritic rather than an actual “letter”. I’m looking for an
actual word-forming character in Unicode.

There are no “word-forming” characters in Unicode that look like the
forward slash.
This page on ScriptSource shows you some of the look-alike characters
for U+002F: http://scriptsource.org/char/U00002F
One option is U+2044 (fraction slash) and another is U+2215 (division
slash). (SIL fonts have U+2044 but not U+2215.)

That page doesn’t mention U+2E0D (http://scriptsource.org/char/U002E0D)
or U+2E1D which might be reasonable alternatives (although Unicode
considers them each to be one of a pair).

Personally I don’t think any of these are good solutions for an
orthography, but if the orthography is already in place then you just
need to figure out the best of the worst options.

U+A78C http://scriptsource.org/char/U00A78C was added for this purpose,
but it doesn’t look like a slash.

U+02B9 http://scriptsource.org/char/U0002B9 is a prime mark and that
might also be a good option. It is a modifier letter so it would not be
treated as punctuation or as a combining mark.
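
One way to compare the candidates mentioned in this thread is to look up their Unicode General_Category, for example with Python's standard unicodedata module. This is only a sketch: the candidate list is simply the code points discussed above, and how a particular program treats each character may also depend on its own settings. Categories starting with L are letters, M are combining marks, P are punctuation, and S are symbols.

```python
import unicodedata

# Code points discussed in this thread as possible glottal characters.
CANDIDATES = [
    "\u002F",  # the slash currently in the legacy text
    "\u2044",
    "\u2215",
    "\u2E0D",
    "\u2E1D",
    "\uA78B",
    "\uA78C",
    "\u02B9",
    "\u0338",
]


def describe(ch: str) -> tuple[str, str, str]:
    """Return (code point, General_Category, character name) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))


for ch in CANDIDATES:
    code, category, name = describe(ch)
    print(f"{code}  {category}  {name}")
```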

Lorna


In the meantime you could always use the saltillo for the / in Paratext but stick a printdraftchanges.txt rule in to change it to the normal / before printing.
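
To show the effect such a rule would have, here is a small Python sketch of the swap. It is not printdraftchanges.txt syntax, and the function name is hypothetical; it only illustrates storing the saltillo in the project text and restoring the slash for the printed form.

```python
# Map both saltillo letters back to the plain slash for print output.
# Assumption: the project text uses U+A78B and/or U+A78C for the glottal.
SALTILLO_TO_SLASH = str.maketrans({"\uA78B": "/", "\uA78C": "/"})


def for_print(text: str) -> str:
    """Return the text with saltillos replaced by U+002F for printing."""
    return text.translate(SALTILLO_TO_SLASH)


stored = "hi hyala: ko\uA78Co:me:ya: ma: ko:la:"
print(for_print(stored))
```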

I would check how the modifier letter colon appears in the fonts you want to use. I have found the modifier letters to be often rather too small since they are intended to be like diacritics but on the letter line. Sadly this ends up ruling out many otherwise useful grammatical punctuation-type characters.

rowbory
