Aaron Kavlie

Comparing internationalization in web frameworks

I've spent the better part of the last six months doing front-end development on a Java project developed with the Spring framework. We're currently going through the process of internationalizing the application. This necessitated spending a fair amount of time combing through all the raw English strings in templates and replacing them with Spring directives, to be filled in by the .properties file for the user's locale.

While there is a lot more to internationalization than replacing static text strings, it's the first (and perhaps largest) task to tackle. Here's what that process looks like.

Let's start with a simple template like this:

<ul>
    <li><a href="${welcomeurl}">Welcome</a></li>
    <li><a href="${registerurl}">Register</a></li>
    <li><a href="${contacturl}">Contact Us</a></li>
    <li><a href="${abouturl}">About</a></li>
</ul>

To internationalize it, you replace all of your raw text with spring:message directives as follows:

<ul>
    <li><a href="${welcomeurl}"><spring:message code="common_site_welcome"/></a></li>
    <li><a href="${registerurl}"><spring:message code="common_site_register"/></a></li>
    <li><a href="${contacturl}"><spring:message code="common_site_contactUs"/></a></li>
    <li><a href="${abouturl}"><spring:message code="common_site_about"/></a></li>
</ul>

Ugh, all of my readable text replaced by XML. OK then.

We're not done yet; as mentioned, some .properties files need to be created to fill in the proper text strings. So for US English, you would create i18n/messages_en_US.properties as follows:

common_site_welcome=Welcome
common_site_register=Register
common_site_contactUs=Contact Us
common_site_about=About

and for Spanish, i18n/messages_es.properties as follows:

common_site_welcome=Bienvenido
common_site_register=Registro
common_site_contactUs=Contáctenos
common_site_about=Sobre

If everything is configured correctly, you now have Spanish localization for those strings. Congratulations.
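Because every message code has to be kept in sync with every .properties file by hand, a small consistency check can be worth having. What follows is a rough Python sketch of one, not anything Spring provides: the inline TEMPLATE and PROPERTIES strings and all function names are made up for illustration, the contact key is deliberately mismatched to show what the check catches, and the parser only understands plain key=value lines (real .properties files also allow other separators and line continuations).

```python
import re

# Hypothetical inline stand-ins for a JSP template and two .properties files.
# The "contact" code deliberately does not match the key in the files.
TEMPLATE = """
<li><a href="${welcomeurl}"><spring:message code="common_site_welcome"/></a></li>
<li><a href="${contacturl}"><spring:message code="common_site_contact"/></a></li>
"""

PROPERTIES = {
    "messages_en_US.properties": "common_site_welcome=Welcome\ncommon_site_contactUs=Contact Us\n",
    "messages_es.properties": "common_site_welcome=Bienvenido\ncommon_site_contactUs=Contáctenos\n",
}


def template_codes(template: str) -> set:
    """Collect every code used in a <spring:message code="..."/> tag."""
    return set(re.findall(r'<spring:message\s+code="([^"]+)"', template))


def property_keys(text: str) -> set:
    """Collect keys from simple key=value lines, skipping blanks and comments."""
    keys = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, _value = line.partition("=")
        if sep:
            keys.add(key.strip())
    return keys


def missing_keys(template: str, properties_by_file: dict) -> dict:
    """Map each properties file name to the template codes it does not define."""
    codes = template_codes(template)
    return {name: codes - property_keys(text) for name, text in properties_by_file.items()}


if __name__ == "__main__":
    for name, missing in missing_keys(TEMPLATE, PROPERTIES).items():
        print(name, sorted(missing))
```

Run against real files, a check along these lines would flag any code that a template uses but a locale file never defines.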

Enter Flask

That's not where the story ends. This is a public-facing application, and that means there's a marketing site to go with it. It's currently just static HTML, so I thought it would be best to use a simple web framework for proper i18n support (among other things). I turned to my go-to micro-framework, Flask, and set up a small sample app to test out the i18n support.

Flask (along with Django and many other web frameworks) uses the standard GNU gettext API for i18n. Here's how it works. We start out much the same as in the Spring example, except with Jinja2 templates and Flask's url_for() function:

<ul>
    <li><a href="{{ url_for('welcome') }}">Welcome</a></li>
    <li><a href="{{ url_for('register') }}">Register</a></li>
    <li><a href="{{ url_for('contact') }}">Contact Us</a></li>
    <li><a href="{{ url_for('about') }}">About</a></li>
</ul>

Now, to internationalize that template, we just do this:

<ul>
    <li><a href="{{ url_for('welcome') }}">{{ _('Welcome') }}</a></li>
    <li><a href="{{ url_for('register') }}">{{ _('Register') }}</a></li>
    <li><a href="{{ url_for('contact') }}">{{ _('Contact Us') }}</a></li>
    <li><a href="{{ url_for('about') }}">{{ _('About') }}</a></li>
</ul>

Very cool. I just wrap all strings with a short function (_() is the common alias for gettext()) and they're ready for localization.

Now what about the message files? For that, we'll turn to the Flask-Babel extension. With a short config file (babel.cfg) in place, we can run this to extract all i18n strings:

$ pybabel extract -F babel.cfg -o messages.pot .

That will give us a file with all the extracted strings from our Jinja2 template, along with some header info. What's interesting here is the extra metadata you get with each string:

#: templates/index.html:8
msgid "Welcome"
msgstr ""

#: templates/index.html:9
msgid "Register"
msgstr ""
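That location metadata is easy to get at programmatically, too. The following is a minimal sketch that pulls the file, line number, msgid and msgstr out of entries shaped like the ones above. The regex and the parse_entries name are my own, and it only handles one location comment and single-line strings per entry; for real catalogs, a proper parser such as Babel's read_po would be the safer choice.

```python
import re

# Inline copy of the two extracted entries, used as sample input.
POT_SAMPLE = '''\
#: templates/index.html:8
msgid "Welcome"
msgstr ""

#: templates/index.html:9
msgid "Register"
msgstr ""
'''

# One location comment followed by a single-line msgid and msgstr.
ENTRY = re.compile(
    r'^#: (?P<path>\S+):(?P<line>\d+)\n'
    r'msgid "(?P<msgid>[^"]*)"\n'
    r'msgstr "(?P<msgstr>[^"]*)"',
    re.MULTILINE,
)


def parse_entries(text):
    """Return one dict per entry with its source location and strings."""
    return [
        {
            "path": m["path"],
            "line": int(m["line"]),
            "msgid": m["msgid"],
            "msgstr": m["msgstr"],
        }
        for m in ENTRY.finditer(text)
    ]


entries = parse_entries(POT_SAMPLE)
untranslated = [e["msgid"] for e in entries if not e["msgstr"]]
print(untranslated)  # ['Welcome', 'Register']
```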

To generate a message file from messages.pot for a given language (Spanish again in this example), follow with this:

$ pybabel init -i messages.pot -d translations -l es

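The resulting file, translations/es/LC_MESSAGES/messages.po, looks much the same as messages.pot. This is the file to send off to the translator. Once the msgstr entries are filled in, running pybabel compile -d translations turns it into the binary .mo file that is loaded at runtime.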

Taking stock of the two approaches

I see a number of benefits to the approach employed by Flask (and by extension, other frameworks that use gettext tools):

  1. Templates preserve the original English -- that makes them easier to convert, more readable, and more searchable.
  2. The Flask approach adds 11 characters in total to internationalize a string; Spring adds 25, and that assumes the message code is the same length as the original English. In a larger project the tendency will be to make the code longer for pseudo-namespacing, as in the example.
  3. Spring requires the programmer to define variables for every piece of static text, and manually add them to a properties file. Flask extracts strings with a single command.
  4. Flask's messages.po files include the file & line number the string came from, as well as the original (English) string. The .properties files in Spring just have a variable name.
  5. In Flask, added and changed strings are picked up by re-running pybabel extract and then running a pybabel update command. This will add the new strings to all translations, and mark changed text as "fuzzy". In Spring, you have to manage additions and changes across all files by hand.
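The character counts in point 2 are easy to verify. Here's a quick sketch that measures the markup each approach wraps around a string; the overhead helper is just something I made up for this check.

```python
def overhead(wrapped: str, text: str) -> int:
    """Number of characters the i18n markup adds on top of the visible text."""
    return len(wrapped) - len(text)


text = "Welcome"

# Jinja2/gettext style: the English string stays in the template.
flask = overhead("{{ _('Welcome') }}", text)

# Spring style, pretending the message code is as short as the text itself.
spring_same_length = overhead('<spring:message code="Welcome"/>', text)

# Spring style with the namespaced code used in the example template.
spring_namespaced = overhead('<spring:message code="common_site_welcome"/>', text)

print(flask, spring_same_length, spring_namespaced)  # 11 25 37
```

With the namespaced code from the example, the Spring markup comes to 37 extra characters for a 7-character word.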

Honestly, I'm not able to come up with much in favor of the Spring approach. The gettext method seems a lot easier and more maintainable, especially as the size of the project and the number of supported languages increase.
