usage of TextSTAT tool to create rhyme dictionary

Posted: Wed Jan 29, 2014 8:05 pm
by esra
Hello,

TextSTAT could also serve as the basis for creating an Interlingua rhyme dictionary. The tool offers a so-called "retrograde sorted" frequency list. Use the export feature to create Excel or CSV files.

In that list the words are sorted alphabetically from their endings toward their beginnings, e.g.:

ma, 7 (= frequency)
ama, 4
dama, 1
ferma, 3
forma, 1
campana, 2
cocina, 1
cosina, 1
Johanna, 1
penna, 2
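
For anyone who wants to reproduce this kind of list outside TextSTAT, here is a minimal Python sketch of retrograde sorting. It is not how TextSTAT works internally; the function name `retrograde_frequency_list` and the sample text are made up for illustration. It counts the words and then sorts them by their reversed spelling, ignoring case.

```python
import re
import unicodedata
from collections import Counter

def retrograde_frequency_list(text):
    """Return (word, count) pairs sorted by the word read backwards."""
    # Compose accented letters (NFC) so a letter plus a combining mark
    # is treated as a single letter by the pattern below.
    text = unicodedata.normalize("NFC", text)
    # Runs of letters only (digits and underscores are excluded).
    words = re.findall(r"[^\W\d_]+", text)
    counts = Counter(words)
    # Sort key: the reversed, case-folded spelling of each word.
    return sorted(counts.items(), key=lambda item: item[0].casefold()[::-1])

# Hypothetical sample text, only for demonstration.
sample = "penna Johanna cosina cocina campana forma ferma dama ama ma ma"
for word, count in retrograde_frequency_list(sample):
    print(f"{word}, {count}")
```

Words that share an ending end up next to each other, which is what makes such a list usable as a rhyme dictionary.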

Kind regards,

Re: usage of TextSTAT tool to create rhyme dictionary

Posted: Sun Feb 02, 2014 10:14 am
by bangiolo20
Thanks.
BTW, do you know antconc or any other tools for translators? I'd like to create a handbook dictionary just for myself. I use Lex, but it is very simple.

Re: usage of TextSTAT tool to create rhyme dictionary

Posted: Sun Feb 02, 2014 10:38 am
by esra
I only know SIL International's tools. I tried FieldWorks, which is not so easy to use. I hope their video tutorials will help me get into it.

I also tried Lexique Pro, which has a bug that prevents you from creating your own categories.

I also tried SIL's WeSay, whose interface is catastrophic. I don't understand how it is supposed to work. WeSay's maintainers claim that they created a dictionary builder that is easy to use, but in my opinion WeSay makes matters more difficult than they have to be.

I tried SIL's Toolbox, whose interface is also catastrophic. I found only one video tutorial (in Spanish) that could help, but I couldn't figure out how to get any further.

For professional results with freeware, I would recommend SIL FieldWorks. Its dictionary (PDF) output looks professional. FieldWorks is not a vocabulary flash card builder; it is a suite for documenting a language according to linguistic standards.

Thanks for your software hints for creating (PDF) dictionary output.

Re: usage of TextSTAT tool to create rhyme dictionary

Posted: Sun Feb 02, 2014 10:49 am
by bangiolo20
Hm...? I don't remember giving any software hints. :D Only yesterday did I figure out how to use TextSTAT at all. Heh... it hates diacritics. :P

Re: usage of TextSTAT tool to create rhyme dictionary

Posted: Sun Feb 02, 2014 11:41 am
by esra
bangiolo20 wrote: Hm...? I don't remember giving any software hints. :D


bangiolo20 wrote: Thanks.
BTW, do you know antconc or any other tools for translators? I'd like to create a handbook dictionary just for myself. I use Lex, but it is very simple.


I thought that "antconc" and "Lex" were tools.

bangiolo20 wrote: Only yesterday did I figure out how to use TextSTAT at all. Heh... it hates diacritics. :P


Hmm. Not good. By default, TextSTAT's "Encoding" option is set to UTF-8 Unicode. I tried different encodings, but it doesn't help.

Update: TextSTAT still malfunctions with diacritic marks. I will inform the TextSTAT maintainer again.

Re: usage of TextSTAT tool to create rhyme dictionary

Posted: Wed Feb 05, 2014 8:43 pm
by esra
bangiolo20 wrote: Hm...? I don't remember giving any software hints. :D Only yesterday did I figure out how to use TextSTAT at all. Heh... it hates diacritics. :P


Solved.

Try importing only plain text *.txt, *.doc or *.docx files. Improper XML markup inside LibreOffice's and OpenOffice's *.odt files splits the affected word within the file's XML markup. So it is an ODT file format "bug". It only happens if you modify the ODT file. If, for example, you only copy text in without any further modifications, the problem will not occur and TextSTAT will produce proper output.

You can check this for yourself. ODT files are zipped XML files. You can extract one and take a look at the affected words inside the XML markup.
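
If you prefer not to unzip by hand, the check can be sketched in Python. This is only a rough sketch under assumptions: the document text of an ODT file lives in the archive member `content.xml`, and I assume the splitting shows up as `<text:span>` tags sitting directly between two letters. The names `read_content_xml` and `find_split_words` and the sample markup are made up for illustration, and the pattern is a heuristic, not a real XML analysis.

```python
import io
import re
import zipfile

# Heuristic: letters, then one or more <text:span> open/close tags, then letters.
SPLIT_WORD = re.compile(r"[^\W\d_]+(?:</?text:span[^>]*>)+[^\W\d_]+")

def read_content_xml(odt_file):
    """Return the XML of the document body stored inside an ODT archive."""
    with zipfile.ZipFile(odt_file) as archive:
        return archive.read("content.xml").decode("utf-8")

def find_split_words(xml):
    """List the places where span markup interrupts a run of letters."""
    return SPLIT_WORD.findall(xml)

# Demo with a tiny in-memory archive standing in for a real *.odt file.
sample_xml = '<text:p>Un caf<text:span text:style-name="T1">é</text:span> nigre</text:p>'
fake_odt = io.BytesIO()
with zipfile.ZipFile(fake_odt, "w") as archive:
    archive.writestr("content.xml", sample_xml)
fake_odt.seek(0)

for hit in find_split_words(read_content_xml(fake_odt)):
    print(hit)
```

With a real document you would pass its path, for example `read_content_xml("my_text.odt")` (a hypothetical file name), instead of the in-memory archive.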

===

There is a bug in LibreOffice's/OpenOffice's XML markup. If a word is written with diacritics, that word gets encapsulated in two parts by some additional XML markup. The new markup makes the word look like two words. So in the XML source view the word is split, while in the normal (non-source) view it still appears to be one word.

In TextSTAT, try importing only plain text *.txt, *.doc or *.docx files.
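
If a text only exists as *.odt, one possible workaround is to pull the plain text out of `content.xml` yourself and save it as *.txt before importing. The following Python sketch is only an assumption about how that could be done, not a TextSTAT feature: it joins all text pieces inside each paragraph or heading, so a word split by span markup comes out as one word again. The function name `odt_xml_to_plain_text` and the sample markup are made up, and special elements such as tabs or line breaks are simply ignored.

```python
import xml.etree.ElementTree as ET

# Namespace of the ODF "text:" elements.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

def odt_xml_to_plain_text(content_xml):
    """Turn the content.xml of an ODT file (bytes or str) into plain text."""
    root = ET.fromstring(content_xml)
    wanted = (f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h")  # paragraphs and headings
    lines = []
    for element in root.iter():
        if element.tag in wanted:
            # itertext() yields the text around and inside nested spans,
            # so joining without a separator restores split words.
            lines.append("".join(element.itertext()))
    return "\n".join(lines)

# Hypothetical sample markup, only for demonstration.
sample = (
    f'<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">'
    "<office:body><office:text>"
    "<text:p>Un caf<text:span>é</text:span> nigre</text:p>"
    "</office:text></office:body></office:document-content>"
).encode("utf-8")

print(odt_xml_to_plain_text(sample))
```

The result can then be written to a UTF-8 text file and imported into TextSTAT like any other *.txt file.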