0 votes

The project has been running fine on a stem-based setup for several weeks now.

The language produces many compound words. They can be any mix of nouns, verbs, adjectives and “worse”. Of course there are “rules” on how those elements are combined, and they are logical and each compound word can be parsed fine.

Now I got stuck, trying to tell the Biblical key terms tool about a simple compound word “strongvoice”.

The Greek is μέγας and the full word is ʊwǝlǝkaŋkra, which I have in the word-list as /ʊwǝlǝ/ /kaŋkra/ because both parts are full stems which could, where needed, take many other building blocks. Both stems are also already listed as their own entries in the wordlist. And I have told the Biblical key terms tool that kaŋkra is the rendering for μέγας (strong, great, large etc.)

I do not get “a hit”, and this is a case of two stems rather than stem plus affix(es). This is a very mainstream case and we will need to list many more compound words for this project. And does (in PT8 speak) stem-based mean that there can only ever be one stem for the magic to work?

I am probably doing something wrong (again). I learnt lots from and enjoyed reading the other recent discussions about the basic concept and practices and on the hierarchical (or rather not) workings of the inbuilt morphology tools. So just hit me. I like these tools in PT8 a lot, I just need to apply them properly to this specific language and get the most out of it.

(Bonus question: And what/how will we do, if the meaning is ever contained in an affix rather than in a stem? Example: Imagine a language where they say “overmother”, with “mother” the stem meaning mother and “over” the prefix meaning μέγας.)

Paratext by (842 points)

4 Answers

0 votes
Best answer

I think we need to ask what the Biblical terms should do for us. If you have a compound word waterdippull for baptize, you would probably want the tool to find verses which have waterdippull rather than verses with only water or dip or pull. So, would it not be reasonable to indicate waterdippull as one root/stem in the Wordlist, even if you think of it as a compound word? You could then add affixes. For strongvoice, you would want it to find strongvoice rather than strong or voice, I think. We use the borrowed word pastan for baptize, which for reasons of vowel harmony has two different roots in the Wordlist: one is pastan, the other pāstēn. Now, the root pastan occurs in 26 different forms because of various prefixes and suffixes. They each have their own morphological breakdown. The Biblical Terms tool finds all of these when pastan is given as a definition. The root pāstēn occurs in 22 different forms, including baptizer, baptism, I am baptizing.
The Biblical Terms tool has 4 different Greek words. For baptist, we can get by with pāstēn in the definition. For baptize, we have both roots plus a root meaning “suffer”. Another Greek word refers to ritual “baptizing/cleansing”, so we use words for wash, cleanse and baptize as definitions. For baptism we use the two roots.
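
The workaround described above (enter the whole compound as one stem, then let affixes attach to it) can be pictured with a toy model. This is only a sketch under assumptions: the dictionary layout and the function name `forms_for_stem` are invented for illustration, the English-like forms are borrowed from the waterdippull example in this thread, and none of it is Paratext's internal data format or API.

```python
# Toy Wordlist: surface form -> morphological breakdown.
# Each breakdown is a list of (morpheme, role) pairs. Layout is hypothetical.
WORDLIST = {
    "waterdippull": [("waterdippull", "stem")],
    "waterdippuller": [("waterdippull", "stem"), ("er", "suffix")],
    "waterdippulling": [("waterdippull", "stem"), ("ing", "suffix")],
    "water": [("water", "stem")],
    "pull": [("pull", "stem")],
}


def forms_for_stem(wordlist, rendering):
    """Return every surface form whose one and only stem equals the rendering."""
    hits = []
    for form, breakdown in wordlist.items():
        stems = [morpheme for morpheme, role in breakdown if role == "stem"]
        if stems == [rendering]:
            hits.append(form)
    return sorted(hits)


# The compound entered as a single stem collects all of its affixed forms,
# while the plain stems "water" and "pull" are left out.
print(forms_for_stem(WORDLIST, "waterdippull"))
```

In this model a single definition finds every affixed form of the compound, which is the effect described for pastan and pāstēn above. The cost is that the internal structure of the compound is no longer visible in the breakdown.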

by (869 points)

Thank you Iver+Larsen for this question/challenge. I was off desk for days and I need to get back to the project to do some testing and probing. We need to check whether it can be done at the technical level (because compound nouns are not just chained nouns).

And we need to check how much information we would lose with this workaround. In English you can say “baptize in suffering”, “baptize in fire” etc. and you can still clearly see what is what. If we “gave up” our stems (for technical reasons) and created artificial PT8-stems, then we would have waterdippull, sufferingdippull, firedippull, and the local team would probably handle it fine. But visiting consultants and the “analysis” would suffer, I believe.

Still this is an idea and I will look into it. This season is extremely busy for us, getting ready for continent-change, so my next feedback will not be soon.

This problem has been in Paratext since April 2019, and for certain languages it is a real issue:

Iver+Larsen gave an example where one Greek term is expressed as a compound of several stems. But we also struggle with the case that several separate Greek terms are expressed by just one compound noun in the target language:

Consider the case of beloved Son: There are two entirely separate Greek terms here, neither of which we can assign to its biblical key term, because Paratext cannot handle compound nouns.

Compound nouns are not exotic, they are fully mainstream in several languages.

So we have a local term for belovedchild. And the hacks which are discussed above would have me artificially mark either son/child or beloved as an affix. But both can properly stand on their own, and mostly do. There are many beloved children here but only a few beloved bicycles or houses or whatever. That is why we have a few special compounds but also free-standing stems.

Please fix the morphology tool so that words made up of several legitimate stems will be recognized in the biblical key terms tool. They work very well in the wordlist morphology column, as they should. It cannot be so difficult to extend the underlying logic, which is already powerful, to cope with two stems rather than one suffix and one stem, because we as users are explicitly confirming to PT in the word list which combinations are legit.

Thank you for your consideration: Respecting and handling languages “as they are” should be a very high priority, sorted well before any “extra or new features”.

Have you submitted a feature request for this?

No, I have not but will eventually. I have two types of feature proposals:

  • I have a small, useful, self-contained idea and I want to keep the toaster or the fruit-basket all to myself. So those ones I submit directly and enjoy the glory and the rewards (especially if it gets realized).

  • I have a larger and complex problem, and I am aware that I do not understand enough of the inner workings of PT to propose a new feature or a correction directly: Then I try to explain and describe the problem and examples here and hope for input from other users.

I am often surprised how little feedback I receive. Take this thread as an example: It is unlikely that only German and one minority language in one area of Africa have this need for better handling of the morphology of compound words in PT. Does nobody else use compound words? Are non-German-speakers afraid to handle it and rather click long lists of surface forms?

Presently I am overwhelmed with daily work, so I have no time to write up a proper request for this problem. But it is still bugging me almost every week when working on actual texts and assigning key terms. Again and again I have to use a hack that I found, and there are still entries where even the hack does not work at all.

0 votes

Yes, it seems that at the moment the stem-based Biblical terms tool will not accept more than one “stem”, even though the Wordlist accepts more than one stem in a word. Whether this can be changed in a future version of the program I do not know. You may want to send in a feature request for this.
I have had to designate some tense markers and other morphemes as PT “stems” even though they are not stems or roots in linguistic terms. It is also a matter of word divisions. Since our tense markers can stand alone as separate “words” in a verbless clause, I had to mark them in the Wordlist as roots, since Wordlist does not accept a “word” without a stem/root. However, this is not relevant for the Biblical terms, just a way to handle it in the Wordlist for spell check purposes.
If you indicate “strongvoice” as a whole word in both the Wordlist and Biblical terms, it should work, but that obviously means many more entries and definitions.
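
To make the limitation concrete, here is a small sketch that contrasts the behaviour as it appears from this thread with the behaviour being asked for. It is an assumption-laden illustration, not Paratext code: the two matching rules and the function names `matches_single_stem` and `matches_any_stem` are hypothetical, and the only real data is the breakdown of ʊwǝlǝkaŋkra given in the question.

```python
# Breakdown as entered in the Wordlist in the question: two stems, no affixes.
STRONGVOICE = [("ʊwǝlǝ", "stem"), ("kaŋkra", "stem")]


def stems(breakdown):
    """Collect the morphemes that are marked as stems."""
    return [morpheme for morpheme, role in breakdown if role == "stem"]


def matches_single_stem(breakdown, rendering):
    """Assumed current rule: a hit only when the word has exactly one stem."""
    found = stems(breakdown)
    return len(found) == 1 and found[0] == rendering


def matches_any_stem(breakdown, rendering):
    """Requested rule: a hit when any stem of the word equals the rendering."""
    return rendering in stems(breakdown)


print(matches_single_stem(STRONGVOICE, "kaŋkra"))  # False: two stems, no hit
print(matches_any_stem(STRONGVOICE, "kaŋkra"))     # True: kaŋkra is one of the stems
```

Under the first rule the compound never produces a hit for kaŋkra, which matches what the question reports. Whether the program really works this way internally is not confirmed here.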

by (869 points)
0 votes

Thank you @Iver+Larsen for your reply and for sharing from real project(s).

If I understand you correctly, you are saying that compound words are legitimate, but that the Biblical terms tool cannot make the connection (or “find” a rendering), although the morphemes (stems) are properly entered into the word-list. This would explain why I cannot complete my entry for strongvoice.

If I got this right, I will not just write a feature request; this would be almost a bug report. Compound words are an important part of many languages. And being German, I have a special love for them. Our language here in Africa is also a champion in making smart compound words.

Normally I start a fresh thread for each new feature request, to get input from other power users. I still hope I am missing something.

Putting hundreds of complete (surface-form) compound words into a normally stem-based morphology-magic-system feels wrong. Especially since many compound words are nouns, which can in turn take on prefixes, like plural-markers or focalizers. So the PT system needs to fully support them, or it would be chaos.

I could make several hacks and work-arounds, but working with consultants and partners gets ever more complicated with each home-cooked-hack.

In our situation, many “Greek” concepts are new, like baptize (dip in water and pull out again), so guess what: expressions need to be created. Often an entire phrase is needed to express one new concept. But whenever we manage to create a good compound word, we rejoice, because we can then better show the nuances (waterdippull). We can use the normal local affixes to turn baptize into baptist (waterdippuller), baptism (a waterdippulling), etc. And such terms communicate better (in the context of a classic long Luke-sentence) than a multi-word phrase. You know all that. I am just warming up to explain to the technicians why compound words are not “weird” and “marginal problem cases” but important mainstream features of many languages. (My examples are real, but they look and sound much more elegant in the actual language.)

by (842 points)
+1 vote

Ok. New feature requests are reviewed regularly, and the implementation of new features is planned many months or even a few years ahead, so my encouragement to you would be to submit a feature request sooner rather than later. For instance, you first wrote about this in April 2019 and it’s now May 2021. But without an official feature request to the Paratext team, it likely will not be discussed by the Paratext Prioritization Committee for future implementation.

by [Moderator]
(2.1k points)
