How best to benefit from the lexicon?

Question

In each project folder there is a file Lexicon.xml (at least if some back-translation has been done).

Also, many projects involve foreigners (consultants, facilitators) who would benefit considerably from a “lexicon”, even a humble one.

I have done some thinking and some research to see how to easily “look into the PT lexicon”. My dream would be a very humble Python plugin (made by me; well, in my dreams I can) that, on right-clicking a word, brings up what has been entered for it. And optionally a lexicon window, where one could browse or search…

I have also started looking into dedicated XML tools and editors. Some can turn an XML file into a rather pretty table. But I found that the structure of the PT lexicon does not help such tools, because the actual “lexemes” are stored as “attributes” rather than as “content”. The tools I have seen so far (the affordable ones) also do not like the fact that some lexemes have more senses than others, i.e. that a suitable table definition would need a nested system of index, entry, sub-indexes and more entries.
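To illustrate the table problem, here is a minimal Python sketch that flattens such a file into one row per sense, so that lexemes with different numbers of senses still fit a plain table. The element and attribute names (`item`, `Lexeme`, `Form`, `Sense`, `Gloss`, `Language`) and the sample entries are assumptions made up for this example, not a confirmed description of Lexicon.xml; they would need to be checked against a real file.

```python
import xml.etree.ElementTree as ET

# Made-up sample data. The layout below is an ASSUMPTION about how
# Lexicon.xml might look (lexeme form stored as an attribute, a variable
# number of senses per entry); check it against a real project file.
SAMPLE = """<Lexicon>
  <Entries>
    <item>
      <Lexeme Type="Word" Form="mtu" Homograph="1" />
      <Entry>
        <Sense Id="a1"><Gloss Language="en">person</Gloss></Sense>
        <Sense Id="a2"><Gloss Language="en">human being</Gloss></Sense>
      </Entry>
    </item>
    <item>
      <Lexeme Type="Word" Form="nyumba" Homograph="1" />
      <Entry>
        <Sense Id="b1"><Gloss Language="en">house</Gloss></Sense>
      </Entry>
    </item>
  </Entries>
</Lexicon>"""


def flatten_lexicon(root):
    """Return one (form, sense_number, language, gloss) row per gloss."""
    rows = []
    for item in root.iter("item"):
        lexeme = item.find("Lexeme")
        if lexeme is None:
            continue
        # The word itself lives in an attribute, not in element text.
        form = lexeme.get("Form", "")
        for number, sense in enumerate(item.iter("Sense"), start=1):
            for gloss in sense.iter("Gloss"):
                text = (gloss.text or "").strip()
                rows.append((form, number, gloss.get("Language", ""), text))
    return rows


def lookup(rows, word):
    """Return all glosses recorded for an exact word form."""
    return [gloss for form, _, _, gloss in rows if form == word]


rows = flatten_lexicon(ET.fromstring(SAMPLE))
for row in rows:
    print("\t".join(str(cell) for cell in row))
```

For a real file, `ET.parse(path).getroot()` would replace `ET.fromstring(SAMPLE)`. The tab-separated rows can be pasted into a spreadsheet, and `lookup(rows, "mtu")` is the kind of call a right-click helper could build on.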

Before I spend more time on this, I would like to hear from other supporters and users about what is out there already, what other people are doing and if any fitting tools are coming to mind.

The project I work with most does not want to link PT and Flex, because those two tools evolve (here) through versions (and renamings) at different speeds. This is a historical decision, due to bad experiences with orthography changes, program migrations and other reasons. PT data is transported into Flex offline, but never just in time, so “looking up through Flex” is an option only if a forked, special-purpose Flex can be kept on the same machine as the production Flex.

This is a very open question, with no urgency for once, but any input is very welcome. Thank you.

Paratext Sep 21, 2018 asked by Tim (855 points)

1 Answer

Hi and thank you. I am definitely interested. The proof is that I even wanted to write something similar myself. I have been learning Python for the past few years under the motto “automate the boring stuff”. If I can look into your code I might even learn a lot…

(I am getting closer to having my own first tool with a GUI: “Hyphenator”, which will take the hyphenated-words data from PT8 and apply soft hyphens to any other text. That should be very useful for publishing via Scribus or LibreOffice for minority languages. I found that PT manages hyphenation data well, even keeping track of which entries are user-confirmed, while Flex does not allow custom fields at wordform level, which is where hyphenation data needs to be.)
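As a rough idea of what the core of such a tool could look like, here is a short Python sketch. It assumes a word list with one entry per line, `=` marking the break points and an optional leading `*` marking user-confirmed entries; that format, the function names and the sample words are assumptions for illustration and should be checked against the actual PT8 data.

```python
import re

SOFT_HYPHEN = "\u00ad"  # invisible unless a line actually breaks here


def load_hyphenation(lines, marker="="):
    """Map each lower-cased word to its list of parts.

    ASSUMED format: one word per line, break points marked with `marker`,
    and an optional leading "*" for user-confirmed entries.
    """
    table = {}
    for line in lines:
        entry = line.strip().lstrip("*")
        if marker not in entry:
            continue  # no break points recorded for this word
        table[entry.replace(marker, "").lower()] = entry.split(marker)
    return table


def apply_soft_hyphens(text, table):
    """Insert soft hyphens into every word of `text` found in `table`."""
    def replace(match):
        word = match.group(0)
        parts = table.get(word.lower())
        if parts is None:
            return word
        # Cut the original word at the recorded positions so that its
        # capitalisation is preserved.
        pieces, pos = [], 0
        for part in parts:
            pieces.append(word[pos:pos + len(part)])
            pos += len(part)
        return SOFT_HYPHEN.join(pieces)

    return re.sub(r"\w+", replace, text)


table = load_hyphenation(["*nyum=ba", "mtu", "ki=ta=bu"])
result = apply_soft_hyphens("Nyumba na kitabu", table)
print(result.replace(SOFT_HYPHEN, "|"))  # show the break points visibly
```

Matching is done on whole words via `\w+`, so words containing apostrophes or other word-internal punctuation would need a pattern adapted to the orthography in question.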

Anyway, I am not at all a real programmer, but I am not afraid to follow instructions and to do a manual install or some mild hacking if needed. Thanks.

Feb 21, 2019 commented by Tim (855 points)
