Checking missing translations automatically

By Martin Burchell on 26 July 2011

For our open source openconsent project, which uses the Django framework, we have recently added internationalisation support. Here's how we're testing it.

Before any translations are in place, it's difficult to ensure that all text is appropriately tagged for translation, either with {% trans %} tags in templates or using gettext() and its friends in the code. Checking missing translations by eye is time-consuming and prone to error.

Inspired by Rory McCann's article "Mocking gettext with Django Translations to test that your code is translating", we wrote an automated test to do this:

# coding: utf-8

from publicweb.tests.open_consent_test_case import OpenConsentTestCase
from django.core.urlresolvers import reverse
from django.utils import translation
from lxml.html.soupparser import fromstring
from lxml.cssselect import CSSSelector

class InternationalisationTest(OpenConsentTestCase):

    def setUp(self):
        self.login()

    def test_all_text_translated_when_viewing_decision_list(self):
        self.check_all_text_translated('decision_list')

    def test_all_text_translated_when_adding_decision(self):
        self.check_all_text_translated('decision_add')

    def check_all_text_translated(self, view):
        self.mock_get_text_functions_for_french()

        translation.activate("fr")

        response = self.client.get(reverse(view), follow=True)
        html = response.content

        root = fromstring(html)
        sel = CSSSelector('*')

        for element in sel(root):
            if self.has_translatable_text(element):
                # Line continuations are implicit inside the parentheses.
                self.assertTrue(self.contains(element.text, "XXX "),
                                "No translation for element " +
                                str(element) + " with text '" +
                                element.text +
                                "' from view '" + view + "'")

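    def has_translatable_text(self, element):
        # Skip elements with no text or only whitespace.
        if element.text is None or element.text.strip() == "":
            return False
        # Skip elements explicitly marked as not needing translation.
        if "not_translated" in element.attrib.get('class', '').split():
            return False
        # Skip JavaScript and bare numbers.
        if element.tag == 'script' or element.text.isdigit():
            return False
        return True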

    def contains(self, string_to_search, sub_string):
        return sub_string in string_to_search

    def mock_get_text_functions_for_french(self):
        # A decorator function that just adds 'XXX ' to the front of all
        # strings
        def wrap_with_xxx(func):
            def new_func(*args, **kwargs):
                output = func(*args, **kwargs)
                return "XXX "+output
            return new_func

        old_lang = translation.get_language()
        # Activate french, so that if the fr files haven't
        # been loaded, they will be loaded now.
        translation.activate("fr")

        french_translation = translation.trans_real._active.value

        # wrap the ugettext and ungettext functions so that 'XXX '
        # will prefix each translation
        french_translation.ugettext = \
            wrap_with_xxx(french_translation.ugettext)
        french_translation.ungettext = \
            wrap_with_xxx(french_translation.ungettext)

        # Turn back on our old translations
        translation.activate(old_lang)
        del old_lang

We wrap the French translation object's ugettext() and ungettext() so that every string passing through them comes back prefixed with "XXX ". Our automated tests then only need to check that each piece of text on the page contains that prefix.
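The wrapping idea can be tried in isolation, without Django. The sketch below is an illustration only: FakeTranslation is a hypothetical stand-in for Django's translation object, and its gettext() simply looks strings up in a dictionary and falls back to the original string.

```python
# Standalone illustration of the prefixing trick (no Django required).
# FakeTranslation is a hypothetical stand-in for a real translation object.

def wrap_with_xxx(func):
    """Return a function that prefixes func's result with 'XXX '."""
    def new_func(*args, **kwargs):
        return "XXX " + func(*args, **kwargs)
    return new_func


class FakeTranslation:
    """Looks strings up in a dict; falls back to the original string."""

    def __init__(self, catalog):
        self.catalog = catalog

    def gettext(self, message):
        return self.catalog.get(message, message)


french = FakeTranslation({"Hello": "Bonjour"})
# Replace the method on this one instance only.
french.gettext = wrap_with_xxx(french.gettext)

translated = french.gettext("Hello")               # has a catalogue entry
untranslated_fallback = french.gettext("Goodbye")  # no catalogue entry
hard_coded = "Goodbye"                             # never passed through gettext
```

Here translated is "XXX Bonjour" and untranslated_fallback is "XXX Goodbye": a string gets the prefix as soon as it goes through gettext(), whether or not a French translation exists yet. Only hard_coded, which bypasses gettext() entirely, lacks the prefix, and that is the case the test is designed to catch.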

There are two tests in this class, one for each page that we want to check. Both call check_all_text_translated(), which sends a GET request for the given view and parses the response with lxml. The CSS selector '*' matches every element in the parsed document.
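To show what walking every element looks like in practice, here is a small sketch that visits each element of a page and collects its leading text. It uses the standard library's xml.etree.ElementTree and root.iter() as a stand-in for lxml and CSSSelector('*'), so it only copes with well-formed markup; the sample page is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed sample page standing in for response.content.
html = """<html>
  <body>
    <h1>XXX Decisions</h1>
    <p>Add a decision</p>
    <a href="/add/">XXX Add</a>
  </body>
</html>"""

root = ET.fromstring(html)

# root.iter() yields the root and every descendant in document order,
# which is the role CSSSelector('*') plays in the test above.
texts = [(el.tag, el.text.strip()) for el in root.iter()
         if el.text and el.text.strip()]

# Any text without the prefix did not pass through the wrapped functions.
missing = [(tag, text) for tag, text in texts if "XXX " not in text]
```

With this sample, missing holds a single entry for the p element, whose text "Add a decision" was never tagged for translation.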

Because our database is empty when these tests run, almost every text node on the page comes from our templates or code and should therefore be translated. The method has_translatable_text() filters out the exceptions:

White space
JavaScript
Numbers
Anything in a tag with class "not_translated"

The last category is a bit of a hack, as the class isn't used anywhere other than in our tests, but we couldn't think of a way around it. We only need it in a couple of places, for example when displaying the user name of the logged-in user.
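The filtering rules are easy to exercise on their own. The following sketch restates them as a plain function over xml.etree.ElementTree elements (a standard-library stand-in for the lxml elements used in the test); the sample markup, including the user name "alice", is made up for illustration.

```python
import xml.etree.ElementTree as ET


def has_translatable_text(element):
    """Return True if the element's leading text should carry a translation."""
    text = element.text
    if text is None or text.strip() == "":
        return False                      # no text, or whitespace only
    if "not_translated" in element.get("class", "").split():
        return False                      # explicitly opted out
    if element.tag == "script":
        return False                      # JavaScript
    return not text.isdigit()             # bare numbers need no translation


fragment = ET.fromstring(
    "<div>"
    "<p>   </p>"
    "<script>var x = 1;</script>"
    "<span>42</span>"
    '<span class="user not_translated">alice</span>'
    "<h2>Decisions</h2>"
    "</div>"
)

# Tags of the elements that the test would go on to check for the prefix.
checked = [el.tag for el in fragment.iter() if has_translatable_text(el)]
```

Only the h2 element survives the filter here; the whitespace, script, number and "not_translated" elements are all skipped.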

If none of these exceptions applies and the text does not contain the XXX prefix, the test fails with a message naming the element, its text and the view, which is enough to track down the missing translation.
