How to internationalize a PHP third-party library

Consider writing a PHP library that will be published through Packagist or PEAR. It is aimed at fellow developers who will use it in arbitrary settings.

This library will contain some status messages intended for the client. How do I internationalize this code so that the developers using the library have the greatest possible freedom to plug in their own localization method? I don't want to assume anything, and I especially don't want to force the developer to use gettext.

To work on an example, let's take this class:

class Example {

    protected $message = "I'd like to be translated in your client's language.";

    public function callMe() {
        return $this->message;
    }

    public function callMeToo($user) {
        return sprintf('Hi %s, nice to meet you!', $user);
    }

}

There are two problems here: How do I mark the protected $message for translation, and how do I allow the developer to localize the string inside callMeToo()?

One (highly inconvenient) option would be to ask for some i18n method in the constructor, like so:

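public function __construct(callable $i18n) {
    // $i18n is any callable that maps a source string to its translation.
    $this->i18n = $i18n;
    $this->message = call_user_func($this->i18n, $this->message);
}

public function callMeToo($user) {
    // $this->i18n(...) would look for a method named i18n, so the
    // callable stored in the property is invoked explicitly.
    return sprintf(call_user_func($this->i18n, 'Hi %s, nice to meet you!'), $user);
}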

but I dearly hope for a more elegant solution.

Edit 1: Apart from simple string substitution, i18n is a wide field. The premise is that I don't want to bundle any i18n solution with my library or force the user to pick a specific one just to accommodate my code.

How, then, can I structure my code to allow the best and most flexible localization of the different aspects: string translation, number and currency formatting, dates and times, and so on? Assume any of these may appear in my library's output. At which point or interface can the consuming developer plug in her localization solution?

There are 3 answers

8
ckruse On

The most commonly used solution is a strings file, for example:

# library
class Foo {
  protected $strings;
  protected $message;

  public function __construct($lang = 'en') {
    // The language file returns an array of key => string pairs.
    $this->strings = require 'path/to/langfile.' . $lang . '.php';
    $this->message = $this->strings['message'];
  }

  public function callMeToo($user) {
    return sprintf($this->strings['callMeToo'], $user);
  }
}

# strings file
return array(
  'message'   => "I'd like to be translated in your client's language.",
  'callMeToo' => 'Hi %s, nice to meet you!'
);

To avoid the $this->message assignment, you can also work with a magic getter:

# library again
class Foo {
  # … code from above

  function __get($name) {
    if(!empty($this->strings[$name])) {
      return $this->strings[$name];
    }

    return null;
  }
}

You can even add a loadStrings method that takes an array of strings from the user and merges it with your internal strings table.
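A loadStrings method along those lines might look like the following minimal sketch. The class name Foo and the 'callMeToo' key come from the example above; the inlined default table (in place of a language file) and the Spanish override are assumptions made purely for illustration.

```php
<?php

class Foo {
    // Built-in defaults, inlined here instead of being loaded
    // from a language file (assumption for this sketch).
    protected $strings = array(
        'callMeToo' => 'Hi %s, nice to meet you!',
    );

    // Merge user-supplied strings over the internal table.
    // Entries supplied by the user win over the defaults.
    public function loadStrings(array $strings) {
        $this->strings = array_merge($this->strings, $strings);
    }

    public function callMeToo($user) {
        return sprintf($this->strings['callMeToo'], $user);
    }
}

$foo = new Foo();
$foo->loadStrings(array('callMeToo' => 'Hola %s, encantado de conocerte.'));
echo $foo->callMeToo('Ana'), "\n";
```

Keys the user does not supply keep their built-in text, so a partial translation still works.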

Edit 1: To achieve more flexibility, I would change the above approach a little. I would add a translation function as an object attribute and always call it when I want to localize a string. The default function just looks up the string in the strings table and, like gettext, returns the string itself if it can't find a localized version. The developer using your library could then replace the function with their own to implement a completely different localization approach.
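As a rough sketch of that idea, assuming the translation function is stored as a callable: the setter name setTranslator, the helper translate(), and the German sample text are hypothetical and not part of the answer.

```php
<?php

class Foo {
    // Strings table: source string => localized string (empty by default).
    protected $strings = array();

    // Callable that receives a source string and returns its translation.
    protected $translator;

    public function __construct() {
        // Default: look the string up in the table and, like gettext,
        // return the string itself when no translation exists.
        $this->translator = function ($message) {
            return isset($this->strings[$message]) ? $this->strings[$message] : $message;
        };
    }

    // Hypothetical setter (name assumed) for plugging in another backend.
    public function setTranslator(callable $translator) {
        $this->translator = $translator;
    }

    protected function translate($message) {
        return call_user_func($this->translator, $message);
    }

    public function callMeToo($user) {
        return sprintf($this->translate('Hi %s, nice to meet you!'), $user);
    }
}

$foo = new Foo();
echo $foo->callMeToo('Ana'), "\n";

// The consumer swaps in their own localization, here a plain array lookup.
$german = array('Hi %s, nice to meet you!' => 'Hallo %s, schön, dich kennenzulernen!');
$foo->setTranslator(function ($message) use ($german) {
    return isset($german[$message]) ? $german[$message] : $message;
});
echo $foo->callMeToo('Ana'), "\n";
```

Because the hook is just a callable, a consumer with the gettext extension loaded could presumably pass 'gettext' instead of a closure.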

Date localization is not a problem. Setting the locale is a matter of the software your library is used in. The format itself is a localized string, e.g. $this->translate('%Y-%m-%d') would return a localized version of the date format string.

Number localization is done by setting the right locale and using functions like sprintf().

Currency localization is a problem, though. I think the best approach would be to add a currency translation function (and, maybe for better flexibility, a separate number formatting function, too) which a developer could override to change the currency format. Alternatively, you could implement format strings for currencies, too, for example %CUR %.02f, where %CUR is replaced with the currency symbol. Currency symbols themselves are localized strings, too.
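Both ideas can be combined in one sketch: a default formatter driven by a %CUR format string, plus a hook the consumer can override. The class PriceLabel, its method names, and the sample amounts are hypothetical; only the %CUR placeholder comes from the paragraph above.

```php
<?php

// Hypothetical class; the names are illustrative only.
class PriceLabel {
    // Format string in the style described above: %CUR marks the symbol.
    protected $format = '%CUR %.2f';

    // Callable taking (amount, symbol) and returning the formatted price.
    protected $currencyFormatter;

    public function __construct() {
        $this->currencyFormatter = function ($amount, $symbol) {
            // Escape literal % in the symbol so sprintf() does not misread it.
            $symbol = str_replace('%', '%%', $symbol);
            return sprintf(str_replace('%CUR', $symbol, $this->format), $amount);
        };
    }

    public function setCurrencyFormatter(callable $formatter) {
        $this->currencyFormatter = $formatter;
    }

    public function price($amount, $symbol) {
        return call_user_func($this->currencyFormatter, $amount, $symbol);
    }
}

$label = new PriceLabel();
echo $label->price(12.5, '$'), "\n";

// A consumer who wants the symbol after the amount swaps in their own formatter.
$label->setCurrencyFormatter(function ($amount, $symbol) {
    return number_format($amount, 2, ',', '.') . ' ' . $symbol;
});
echo $label->price(1234.5, 'EUR'), "\n";
```

The default path relies on sprintf() and therefore on the active locale for the decimal separator, while the override uses number_format() with explicit separators and ignores the locale.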

Edit 2: If you don't want to use setlocale, you have to do a lot of work: basically, you have to rewrite strftime() and sprintf() to get localized dates and numbers. That is possible, of course, but laborious.

4
Francisco Presencia On

There's a main problem here. You don't want to internationalize the code as it stands in your question.

Let me explain. The first translator is probably a programmer. The second and third might be too, but eventually you will want translations into any language, including ones contributed by non-programmers. This ought to be easy for them; making non-programmers hunt through classes, functions, and so on is definitely not okay.

So I propose this: keep your source sentences (English) in an agnostic format that is easy for everyone to understand. This might be an XML file, a database, or any other form you see fit. Then use your translations where you need them. You can do it like this:

class Example {
  // Fetch them as you prefer and store them in $messages.
  protected $messages = array(
    'en' => array(
      "message"  => "I'd like to be translated in your client's language.",
      "greeting" => "Hi %s, nice to meet you!"
    )
  );

  protected $lang;

  public function __construct($lang = 'en') {
    $this->lang = $lang;
  }

  // Placeholder hook: returns the source sentence unchanged.
  // Replace it with a lookup in your own translation store.
  protected function translator($message) {
    return $message;
  }

  protected function get($key, $args = null) {
    // Translate the source sentence
    $string = $this->translator($this->messages[$this->lang][$key]);
    if ($args === null) {
      return $string;
    }
    // Accept a single value or an array of values for the placeholders
    return vsprintf($string, (array) $args);
  }

  public function callMe() {
    return $this->get("message");
  }

  public function callMeToo($user) {
    return $this->get("greeting", $user);
  }
}

Furthermore, if you want to use a small translation script I wrote, you can simplify it further. It uses a database, so it might not be as flexible as you're looking for. You need to inject it, and the language is set at initialization. Note that the text is automatically added to the database if not present.

class Example {
  protected $translator;

  // Translator already knows the language to translate the text to
  public function __construct($Translator) {
    $this->translator = $Translator;
    }

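  public function callMe() {
    // Assumes the injected translator is callable (a closure or an object with __invoke)
    return call_user_func($this->translator, "I'd like to be translated in your client's language.");
  }

  public function callMeToo($user) {
    return sprintf(call_user_func($this->translator, "Hi %s, nice to meet you!"), $user);
  }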
}

It could easily be modified to use an XML file or any other source for translated strings.

Notes for the second method:

  • This differs from your proposed solution in that the work is done at output time rather than at initialization, so there is no need to keep track of every string.

  • You only need to write your sentences once, in English. The class I wrote will put it in the database provided it's correctly initialized, making your code extremely DRY. That's exactly why I started it, instead of just using gettext (and the ridiculous size of gettext for my simple requirements).

  • Con: it's an old class, written when I knew a lot less. Today I'd change a couple of things: making a language field rather than en, es, etc., throwing some exceptions here and there, and uploading some of the tests I did.

1
RandomSeed On

The basic approach is to provide the consumer with some method to define a mapping. It can take any form, as long as the user can define a bijective mapping.

For example, Mantis Bug Tracker uses a simple globals file:

<?php
    require_once "strings_$language.txt";
    echo $s_actiongroup_menu_move;

Their method is basic but works just fine. Wrap it in a class if you prefer:

<?php
    $translator = new Translator(Translator::ENGLISH); // or make it a singleton
    echo $translator->translate('actiongroup_menu_move');

Use an XML file instead, or an INI file, or a CSV file... whatever format you like, in fact.
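A Translator like the one used above could be sketched roughly as follows. This is a hypothetical implementation: the inline catalogs stand in for whichever file format is chosen, the FRENCH constant is an assumption, and the "Move" and "Déplacer" texts are invented placeholders, not Mantis's actual strings.

```php
<?php

// Hypothetical implementation of the Translator class used above.
class Translator {
    const ENGLISH = 'en';
    const FRENCH  = 'fr';

    // Inline catalogs standing in for XML, INI, CSV or any other source.
    protected static $catalogs = array(
        'en' => array('actiongroup_menu_move' => 'Move'),
        'fr' => array('actiongroup_menu_move' => 'Déplacer'),
    );

    protected $language;

    public function __construct($language) {
        $this->language = $language;
    }

    public function translate($key) {
        $catalog = isset(self::$catalogs[$this->language])
            ? self::$catalogs[$this->language]
            : array();
        // Fall back to the key so a missing entry stays visible instead of fatal.
        return isset($catalog[$key]) ? $catalog[$key] : $key;
    }
}

$translator = new Translator(Translator::FRENCH);
echo $translator->translate('actiongroup_menu_move'), "\n";
```

Only the catalog loading would change when switching file formats; callers keep using translate() with the same keys.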


Answering your later edits/comments

Yes, the above does not differ much from other solutions. But I believe there is little else to be said:

  • translation can only be achieved through string substitution (though the mapping may take any number of forms)
  • formatting numbers and dates is none of your concern. It is the presentation layer's responsibility, and you should just return raw numbers (or DateTimes or timestamps), unless your library's very purpose is localisation ;)
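The second point might look like this in practice. It is a sketch only: the Order class, its methods, and the sample values are hypothetical, and the consumer formats with plain number_format() and DateTime formatting, though any localization library could take their place.

```php
<?php

// Hypothetical library class: it returns raw values and formats nothing.
class Order {
    public function total() {
        return 1234.5;
    }

    public function placedAt() {
        return new DateTimeImmutable('2024-03-01 14:30:00', new DateTimeZone('UTC'));
    }
}

// Consumer code (presentation layer) decides how the values are displayed.
$order = new Order();
echo number_format($order->total(), 2, ',', '.'), "\n";
echo $order->placedAt()->format('d.m.Y H:i'), "\n";
```

The library never needs to know the user's locale; it only has to return values precise enough for any presentation layer to format.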