Pseudolocalisation with podebug (3): Interview with Rail Aliev

This is part of my series on podebug. Last time we looked at identifying strings from different locations. This time I decided to interview Rail Aliev, who has been a major user of and contributor to podebug. I was specifically interested in his use of podebug's hashing functionality, but hopefully you will also get to know this major contributor to the world of Free and Open Source Software.

Please tell us a little bit about yourself and your involvement in the world of Free Software.

I'm mostly involved in localization efforts (Russian and Turkish) in the OpenOffice.org and Mozilla projects. The Zemberek linguistic project (Turkish) is another interesting one. As a result, I maintain some linguistic packages in Debian.

And last but not least, the Translate project.

Please tell us about your preferred translation environment and how things work in your teams.

Currently I use the following tools:

  • translate-toolkit for format conversion (oo2po, oo2moz), testing translations (punctuation, spacing, expressions, etc), debugging (podebug) and other off-line helpers
  • Pootle for collaborative translation, quality assurance (web interface for translate-toolkit's features), translation repository
  • Virtaal for translation. Virtaal provides great features, such as online translation memory queries from the OpenTran project and automatic translation using libtranslate. And all these features are in version 0.4! Of course, there are still a lot of features to be implemented but, as I said, this is a good start.

Tell us a bit about how you use podebug, and specifically the --hash option. Why do you find it useful?

I use podebug only when I cannot locate a string that has been reported as a wrong translation.

As a generic example, if you have the English string “Right”, you can translate it as “right side” or as “not wrong” (in the real world, depending on the language, the list of translations may be longer). Even worse, sometimes the string is used more than once in a file (this is very common in the OpenOffice.org project, where files with thousands of words aren't rare). This is where podebug's power comes in.

As an example, here are the “magic” commands to get a debug version of the SDF file used in OpenOffice.org translation:

  1. Create normal PO:
    oo2po -l tr en-US_tr.sdf po
    (po folder will be created)
  2. Create debug PO:
    podebug -f "%h." po po.debug
  3. Create debug SDF file:
    po2oo -l tr -t en-US_tr.sdf po.debug tr-debug.sdf

The most painful steps are the first and the last: you need an OpenOffice.org build tree and you need to know how to build it.
The OpenOffice.org interface becomes something like this:
[Screenshot: OpenOffice.org built with podebug hashing markers]

Now you can reliably find the string you are looking for: first locate it in the po.debug folder, then fix the corresponding string in the po folder.

Keeping the po folder under version control saves you time when you are ready to merge your fixes upstream.
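
Once a marker has been read off the screen, the lookup in po.debug can be automated. The sketch below is a plain-text search, not part of translate-toolkit. It assumes the marker appears at the start of a single-line `msgstr` in the debug PO files; the function name `find_marker` and the sample marker are hypothetical.

```python
from pathlib import Path


def find_marker(debug_dir: str, marker: str) -> list[tuple[str, int]]:
    """Return (file, line number) pairs whose msgstr starts with `marker`.

    Hypothetical helper: a naive line-based search that does not
    handle multi-line PO entries.
    """
    needle = f'msgstr "{marker}'
    hits = []
    # Walk every .po file below the debug folder in a stable order.
    for po_file in sorted(Path(debug_dir).rglob("*.po")):
        with po_file.open(encoding="utf-8") as handle:
            for number, line in enumerate(handle, start=1):
                if line.startswith(needle):
                    hits.append((str(po_file), number))
    return hits
```

Calling `find_marker("po.debug", "a1b2.")` would list the files and line numbers to inspect. The same relative path in the po folder then holds the string to fix.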

Any other tools/options that you use to complement podebug --hash?

You can play around with the -f option of podebug to get a different prefix for messages. For example, -f "%2h: " will prefix the translated message with the first 2 characters of the hash, a colon and a space. The hash is an MD5 checksum computed from the string's comments and context, so it stays the same even if the English source text is changed.
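
To illustrate why such a prefix stays stable, here is a minimal Python sketch of the idea. It is not podebug's actual code: the function name `hash_prefix`, the sample context string and the choice to hash only that string are assumptions made for illustration, so the markers it produces will not match podebug's.

```python
import hashlib


def hash_prefix(context: str, translation: str, length: int = 2) -> str:
    """Prefix a translation with the first `length` hex digits of an MD5 hash.

    Hypothetical helper: the hash is taken over the context only, so
    changing the English source text does not change the marker.
    """
    digest = hashlib.md5(context.encode("utf-8")).hexdigest()
    return f"{digest[:length]}: {translation}"


# Hypothetical context string identifying one occurrence of "Right".
context = "example.src#RID_ALIGN_RIGHT.text"
marked = hash_prefix(context, "Sağ")
```

Two occurrences of “Right” with different contexts will usually get different markers, although a short prefix such as 2 characters can collide in large files.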