Removing words from hyphenatedWords.txt that no longer exist in the project
I am trying to prepare the hyphenatedWords.txt file for native speakers to work on, and I want to remove all words that are not actually in the project. However, there are many words that were at one time marked as Correct or Incorrect but no longer exist in the project. Even though they are now marked with spelling status Unknown, they continue to appear in hyphenatedWords.txt if they have approved hyphenation.
I tried exporting the word list to XML, changing hyphenationApproved to false, and importing it again, but this did not change the hyphenation status. Example:
I changed the hyphenationApproved attribute from true to false for the following entry:
<item word="abaaramaadha" count="0" hyphenation="abaara=maa=dha" hyphenationApproved="False" morphology="abaar +amaa +dha" morphologyApproved="False" spelling="Incorrect" correction="abaaramaa dha" />
I then imported this back into Wordlist. Nevertheless, when I reopened Wordlist, the hyphenation was still marked as approved. It appears the only way I can unapprove hyphenation is to individually tick every one of the approved hyphenations in the Wordlist tool.
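For reference, the XML edit described above can be done in bulk with a short script. This is only a sketch of the edit itself: the root element name `wordlist`, the second sample word, and the function name are assumptions made for illustration, and only the `item` attributes are taken from the export shown above. As reported above, importing the edited file did not change the approved status in Wordlist, so this does not solve the problem by itself.

```python
import xml.etree.ElementTree as ET


def unapprove_unused(xml_text):
    """Set hyphenationApproved="False" on every item whose count is 0.

    Returns the edited XML as a string and the number of items changed.
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for item in root.iter("item"):
        approved = item.get("hyphenationApproved", "").lower() == "true"
        if item.get("count") == "0" and approved:
            item.set("hyphenationApproved", "False")
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical sample: the root element and the second entry are made up.
sample = """<wordlist>
<item word="abaaramaadha" count="0" hyphenation="abaara=maa=dha" hyphenationApproved="True" spelling="Incorrect" />
<item word="example" count="3" hyphenation="ex=am=ple" hyphenationApproved="True" spelling="Correct" />
</wordlist>"""

edited_xml, changed_count = unapprove_unused(sample)
print(changed_count, "item(s) changed")
```

Only entries with a count of 0 are touched, so words that still occur in the project keep their approved hyphenation.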
Removing clearly wrong entries that no longer exist in the project
A second concern is that I cannot remove from the Wordlist database any of the thousands of words that were entered into Wordlist and reviewed over the 10+ year lifetime of the project. For example, the straight apostrophe was changed to a modifier letter apostrophe, and we would like to remove the 139 entries that still contain the straight apostrophe. There are even 233 “words” with spaces in them in the database, like this:
<item word="a si" count="0" hyphenation="a si" hyphenationApproved="False" spelling="Incorrect" />
It appears that Wordlist keeps a memory of any word that made it into the project and was reviewed. But there is a need to remove words that are no longer legal words, in order to make the tool easier to work with. With over 43,000 words in the project, I would love to clean up the Wordlist database and hopefully make it run faster.
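To illustrate which entries I mean, here is a minimal sketch that lists them from the exported XML. It assumes the export consists of `item` elements like the ones shown above; the root element name `wordlist`, the function name, and the two apostrophe sample words are hypothetical. It only reports candidates for removal and does not modify the Wordlist database.

```python
import xml.etree.ElementTree as ET

STRAIGHT_APOSTROPHE = "'"  # U+0027, the character that was replaced in the project


def find_stale_items(xml_text):
    """Return words with count 0 that contain a space or a straight apostrophe."""
    root = ET.fromstring(xml_text)
    stale = []
    for item in root.iter("item"):
        if item.get("count") != "0":
            continue  # the word still occurs in the project text
        word = item.get("word", "")
        if " " in word or STRAIGHT_APOSTROPHE in word:
            stale.append(word)
    return stale


# Hypothetical sample; only the first entry comes from the real export.
sample = """<wordlist>
<item word="a si" count="0" hyphenation="a si" hyphenationApproved="False" spelling="Incorrect" />
<item word="ma'a" count="0" hyphenation="ma'a" hyphenationApproved="True" spelling="Incorrect" />
<item word="ma\u02bca" count="5" hyphenation="ma\u02bca" hyphenationApproved="True" spelling="Correct" />
</wordlist>"""

for word in find_stale_items(sample):
    print(repr(word))
```

The third sample entry uses U+02BC (modifier letter apostrophe) and has a nonzero count, so it is not reported.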