Experiences Translating Botanical Texts

Ophrys lutea x speculum, F Montagne de la Clape 13.4.01

I never learned French, Spanish or Italian, but I sometimes need at least partial translations of articles from foreign orchid magazines. Most often I have needed translations from French-language magazines such as L'Orchidophile or Naturalistes Belges.

However, because the complicated structures of natural languages cannot be reduced to simple rules, perfect translation is still a big problem for computers and their programmers. My experiences can be described in five stages:

  1. My first idea was to use the free automatic 'Babel Fish' translation service of Altavista on the web. Contrary to my expectations, it did not solve all my translation problems. Translation from French to German is available, but Spanish and Italian can only be translated into English. Text entered directly is limited to pieces of up to 150 words; web pages of up to 5 KBytes can also be translated.
    In my experience, the software handles long sentences badly when they consist of many parts separated by commas or semicolons, as is especially common in species descriptions. Adjectives and adverbs are often shifted to remote parts of the sentence, which leads to translations with a completely distorted meaning.
  2. As a result of this experience, for some time afterwards I used only word-for-word translation (substitution), which leaves the construction of the original sentence unchanged. The results are often strange-sounding sentences, but their content is usually understandable. If a 'polished' result is required, it has to be reformulated manually, which is often time-consuming.
    As a tool for the substitution I started with TOLKEN99, a Swedish shareware program that can be tested free of charge for 30 days. (In recent months there have sometimes been problems with this link, but the program can still be downloaded from several distributors on the internet. They usually don't offer all the existing dictionaries, which I can send by e-mail if needed.) TOLKEN99 works in two passes: first it substitutes 'phrases', i.e. combinations of several words that sometimes have a meaning different from that of the individual words; after that it substitutes the remaining text word for word, as far as the words are in the dictionary.

Orchis mascula x pauciflora, I Assisi, Monte Subasio 17.5.05

  3. The next step was to re-program the substitution tool myself in order to adapt several things to my requirements. Work that previously had to be done manually is now automated, e.g. the correct handling of capitalized nouns, which are usual in German, unlike in many other languages, but are not supported by TOLKEN99. Before the translation, some preprocessing specific to French is carried out, which prevents the library from growing unnecessarily big (e.g. prefixes shortened by an apostrophe are separated from the following word, as in "l'origine"). I omitted many features of TOLKEN99 that I currently don't need and that would in part have been complex to program. But the functionality was also extended, e.g. the tool can now repair some of the weaknesses of my flatbed scanner and text recognition program.
    The main problem is still the same as with TOLKEN99: the vocabulary libraries. I have been busy extending the French-German library with many general and botanical words, but with the Spanish and Italian libraries I am still right at the beginning.
  4. The Google Translation Service seems to be built on the same base as the Altavista translation service mentioned above, i.e. it produces the same partly wrong basic sentence structures, but for several words it delivers better, context-dependent translations. Many botanical terms are still missing from its library. As with Altavista, translation from French to German is available, but Spanish and Italian can only be translated into English. Only a small amount of text can be entered directly, but the translation of web pages of up to 10 KBytes works well. For that reason I temporarily put longer texts on the web for translation, in the case of 'Naturalistes Belges' essays in pieces of about 130 lines.
  5. Currently I'm combining two results. The first comes from the Google Translation Service, which often already chooses the best context-dependent translation for words with more than one meaning and already builds correct sentences if they are short. The second comes from my own substitution tool, which delivers the original sequence of the words in the sentence and which I'm feeding with translations for more and more vocabulary. After the manual revision of the translated text, which can't be omitted, I have suitable translations without spending too much time.
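
The substitution approach described above (phrases first, then word for word, with French elided prefixes split off and German noun capitals preserved) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the actual tool: the dictionaries are tiny hypothetical samples, and the names PHRASES, WORDS, match_case and translate are made up for the example. The capitalization rule shown is one possible reading of "correct handling of capitalized nouns".

```python
import re

# Hypothetical sample dictionaries; a real library would hold thousands of entries.
PHRASES = {("pomme", "de", "terre"): "Kartoffel"}   # multi-word phrases
WORDS = {
    "l'": "der",        # elided prefix stored once, not once per following word
    "la": "die",
    "de": "von",
    "est": "ist",
    "fleur": "Blume",   # German nouns are stored capitalized
    "origine": "Ursprung",
    "jaune": "gelb",
}

# A token is an elided prefix such as "l'", a word, or a run of other characters.
TOKEN = re.compile(r"\w+['\u2019](?=\w)|\w+|\W+")


def match_case(source, target):
    """Capitalize the target if the source word was capitalized; otherwise keep
    the dictionary spelling, so German nouns keep their capital letter."""
    return target[:1].upper() + target[1:] if source[:1].isupper() else target


def translate(text):
    tokens = TOKEN.findall(text)
    phrases = sorted(PHRASES.items(), key=lambda kv: len(kv[0]), reverse=True)

    # Pass 1: replace known phrases and mark them so pass 2 leaves them alone.
    marked, i = [], 0
    while i < len(tokens):
        for phrase, target in phrases:
            span = 2 * len(phrase) - 1          # words plus the gaps between them
            window = tokens[i:i + span]
            words, gaps = window[::2], window[1::2]
            if (len(window) == span
                    and [w.lower() for w in words] == list(phrase)
                    and all(g.isspace() for g in gaps)):
                marked.append((match_case(tokens[i], target), True))
                i += span
                break
        else:
            marked.append((tokens[i], False))
            i += 1

    # Pass 2: substitute the remaining tokens word for word.
    out = []
    for token, done in marked:
        key = token.lower().replace("\u2019", "'")
        if done or key not in WORDS:
            out.append(token)                   # unknown words stay unchanged
        else:
            word = match_case(token, WORDS[key])
            # An elided prefix had no space after it, so add one.
            out.append(word + " " if key.endswith("'") else word)
    return "".join(out)


print(translate("La fleur est jaune."))             # Die Blume ist gelb.
print(translate("l'origine de la pomme de terre"))  # der Ursprung von die Kartoffel
```

The second output shows the typical character of substitution: the word order of the original is kept and the result sounds strange, but the content remains understandable.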

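Because the web services accept only limited amounts of text (e.g. pieces of up to 150 words for direct input), longer texts have to be cut into pieces first. A minimal Python sketch of such a splitter is shown here; the function name split_into_pieces and the simple sentence-end rule are assumptions made for the example. It cuts only at sentence ends, so that no sentence is torn apart before translation.

```python
import re


def split_into_pieces(text, max_words=150):
    """Pack whole sentences into pieces of at most max_words words.

    A single sentence longer than max_words is kept whole in its own piece.
    Sentence ends are detected naively (., ! or ? followed by whitespace),
    so abbreviations such as "e.g." may cause extra cuts.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pieces, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if n == 0:
            continue                      # skip empty fragments
        if current and count + n > max_words:
            pieces.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        pieces.append(" ".join(current))
    return pieces


sample = "Un deux trois. Quatre cinq. Six sept huit neuf."
for piece in split_into_pieces(sample, max_words=5):
    print(piece)
```

With the small limit of 5 words used here, the sample is cut into two pieces: the first two sentences together, then the third.
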
Or does anyone know a more convenient, but more or less free, method of getting error-free translations of botanical texts? If so, I would be glad to receive an e-mail explaining how it works ... (please type the address out by hand, because of the spam robots)

Contents Rev.: 9.Jan.2002
Copyright: Use of the images and texts only with author's written permission.