Mark Overmeer
Perl modules
Log::Report
Papers

Exceptions, Logging, and Translation

 
This article was published in the Summer 2013 edition of $foo, Perl Magazin, where it got translated into German.

In many ways, Perl is an impressive language. Is there any language with more syntax? Probably not: there are always many ways to do it, where other programming languages force their programmers into some selected concept. Although very powerful on the basic syntax, Perl lacks higher level language features to develop larger applications. For instance, we have no evolved exception or logging infrastructure in the core language.

Not in core, but we do have various alternatives on CPAN for these features. And that's a problem. When an application wishes to use different libraries, those may (they will) use incompatible exception and logging concepts. This limits the operational capability of the code. One of the reasons why there are so many implementation for the same task on CPAN, is that the frameworks used are incompatible.

This article tries to convince the reader that exception, logging and translation infrastructures of an application are closely related. Even that much, that they can best be implemented as one framework. The Log::Report module does that. This article attempts to convince you that it results in a convenient framework.

What's the problem?

When you develop a new application, your focus is on understanding the purpose of the program, the target to achieve. At start of the project, there is little understanding on the support libraries you will need and the environments where the application will run during its whole lifetime. Too late, you discover that the program needs to scale-up; that other people will have to run it (foreigners have to understand the error messages), that it has to run on regular basis (errors directed to syslog), that it needs to get integrated in a larger application (match other frameworks).

Even when your initial small program pulls-in sophisticated exception, logging or translation frameworks, then still you need luck that all of these three play well with the unknown future of the software. If not, you are in for a major rewrite. Rewrites are expensive.

To die or not to die?

What can be wrong with this common line of code?

  open IN, '<:encoding(utf8)', $fn
      or die "cannot read $fn: $!\n";

Nothing much to worry about when you have a small and interactively run program. But reality may show this code inadequate. For instance:

  1. The subroutine which produces this error was called by an other routine which may anticipate a problem with this file. So, you see code like this in the wild:

       my $data = eval { open_files @files };
       if($@ =~ m/cannot read (.*?): (.*)/)
       {   my ($errfn, $errtxt) = ($1, $2);
           ... error recovery ...
       }

    The eval/die pair is a very poor exception mechanism. When the open failed, a lot is known about the circumstances which are simply ignored by die(). For instance,

    • source file name and line-number
    • timestamp of the event
    • filename object (if used)
    • error code unstringified
    • untranslated error string (with gettext)
    • call stack, and
    • other parameters of the failing open().

    The main program may have an unexpected use for some of these facts. But they are lost.

  2. As already known by ancient systems, errors may have different implications. UNIX's syslog requires to state a severity of the error: emergency, alert, critical, error, etc... Useful for separating mild from serious problems. Needed to escalate the event to best handler.

    Perl's three levels (print, warn, die) is rather poor. Everyone invents his own debug level, like

      print "saving session\n" if $verbose;
      print "logout user $user\n" if DEBUG;

    The module Carp offers some additional functions: carp, confess, croak, and cluck. They do not add to the number of supported exception levels. It is not always clear whether carp is the better choice or warn.

    Is die() expressing the situation clearly enough? `Alert' may be a better choice, when we expect it to be reparable. `Critical' may be correct for obligatory configuration files, which keep the webserver from starting.

  3. Did you know there is a character-set problem with "die $fn"? Filenames are sequences of bytes, received from the Operating System. On some Operating Systems, you may know which character-set is used for filenames. On Windows it is UTF-16. On UNIX/Linux, the character-set may be different for each file-system, for each component of the path of the filename. In any case, no-one encodes or decodes the filenames into Perls internal UTF8!

    Now, when you use a filename inside an error string, it gets concatened with the other text. That may cause your filename bytes (an utf8 sequence received from the OS, but not labeled as Perl's internal utf8) to be encoded as if it is latin1 into utf8. Oops!

  4. If you need these error messages translated, for instance because your application has international users, you wil have to create translation tables, usually with the (GNU) gettext library.

    You end-up with code in this style:

      die sprintf(gettext("cannot read file %s: %s\n"), $fn, $!);

    The translation tables need strings which are static, do not have the variables filled-in yet. So do not write this:

      die gettext("cannot read file $fn: $!\n");  # No!

    Ugly, and hard to process in the eval/die way of handling exceptions. Perl is one of the few Open Source applications very hostile to translations.

    And now, try to get some of the error messages to go in Chinese to the webpage while other error messages go in German to syslog. Or maybe the same message needs to go in both directions at the same time!

Recuperating: error reporting for small, interactively run programs is simple. With larger programs, international applications, and daemons we need more. Modules you find on CPAN are either totally ignoring the complications of these applications use a framework. Those frameworks are often incompatible.

The Log::Report concept

The Log::Report distribution is the only framework to combine

  • translations
  • exceptions, and
  • logging

features into one.

For example,

   # tranditional use of translations
   die sprintf(gettext("cannot read file %s: %s\n"), $fn, $!);

   # in Log::Report
   fault __x"cannot read {file}", file => $fn;

Ok, lets dissect this line. The fault() function is one of a whole bunch: debug, trace, info, notice, warning, error, fault, alert, and panic. Some of these are more like warn(), and other more like die(). Info and notice are closest to print().

Most of these function names are directly taken from syslog(2), with small additions. New is fault(), which is an error originating from the Operating System, where error() is triggered internally. As you can see, the $! does not need to be specified explictly, because it gets appended by fault() automatically.

translations

Copied from the ideas of Locale::TextDomain, is way to use the translator. In stead of "%s", we use "{file}". The "%s" is required for printf(). But here, the "sprintf(gettext())" is integrated into a new function shortly named "__x()" --two underscores followed by an "x".

Yes, "__x()" is a weird name for a function. It has weird friends, like "__" (two underscores) The latter will cause translation as well, but not attempt to interpolate values (the names between curly braces). And there are a few more functions.

The standard gettext is designed for C, there is no need to stay in Perl on that same low level. You may encounter this kind of translations with gettext, with parameter order changes:

  # in the code
  sprintf(gettext("in %d minutes, read %s"), $min, $file);

  # the Dutch translation table
  msgid "in %d minutes, read %s"
  msgstr "lees %2$s na %1$d minuten"

The parameters may need to change order in the translations. The translators do not automatically get an explanation on the meaning of the first and second parameter. This leads to additional documentation lines in the program to help the translators understanding the error message.

With the curly braces syntax of Log::Report it gets simpler. The need for additional documentation for translators is much smaller, it's self documenting:

  # in the code
  __x"in {minutes} minutes, read {filename}", minutes => $min, file => $file

  # the Dutch translation table
  msgid "in {minutes} minutes, read {filename}"
  msgstr "lees {filename} na {minutes} minuten"

Log::Report goes one step beyond Locale::TextDomain: "__x()" will delay the translation of the text until it is being used. In Log::Report, __x() creates a Log::Report::Message object which contains the string to be translated and the parameters to be filled in.

  my $msg = __x"Hello, {who}!", who => 'World';
  print $msg;                  # "Hello, World!"

  # it's a lazy object, so;
  print ref $msg;              # Log::Report::Message
  print $msg->toString;        # "Hello, World!"
  print $msg->msgid;           # "Hello, {who}!"
  print $msg->valueOf('who');  # "World"

The string to be translated, is called the msgid. It is just a label in the translation tables, to find the counterpart. However, it is also used as default translation: if the tables cannot be found or the translation is empty.

exceptions

In the example, the fault() function wraps this message object up into a Log::Report::Exception object. That object captures the caller stack trace, the error codes $! and $?, and more. Those values may be useful in displaying the message later.

The exception is thrown up in the program hierarchy, back to caller of the subroutine. It gets caught by some (log) dispatcher where it will be handled. Fault (and error and panic) will die with the exception: they are fatal. For example info and warning, won't die after throwing their exception object up in the hierarchy. They also end-up at the dispatcher to be processed.

Structurally, this is built:

   Log::Report::Exception object, contains
      - exception level (reason for message)
      - caller stack
      - $!, $?
      - Log::Report::Message object, contains
         - string to be translated (msgid)
         - parameters to be filled in
         - textdomain
         - $"

logging

Who knows where the output and errors have to go to? Where is the overview on the whole application? In package main of course, not in the modules! So, in the Log::Report concept, the main component of your application starts dispatchers which process exceptions (containing the messages). The module stay ignorant.

This instruction in the main package of your program defines the way exceptions are send to the logs:

 dispatcher SYSLOG => 'my-log' # send exceptions to syslog
   , charset => 'iso-8859-1'   # explicit character conversions
   , locale  => 'nl_NL'        # overrule user's locale
   , accept  => 'INFO-';       # take priority INFO and higher

Above example registers an additional dispatcher to process exceptions. When there is more than one dispatcher, they will all receive all exceptions. This dispatcher will ignore exception with levels below INFO. Then, it (attempts to) translate all messages into Dutch (nl_NL) with character-set latin1. The 'reason' of the exception gets translated into syslog levels, which are configurable. To give a `shocking' example: the ERROR reason gets mapped to syslog's LOG_ERR level.

When the program starts, there is one default dispatcher, which is already instantiated like this:

  dispatcher PERL => 'default'
    , accept => 'NOTICE-';

So, when you add a SYSLOG dispatcher, you have both. The PERL dispatcher shows the message in the native language and charset of the user, and the SYSLOG shows the same message translated to Dutch.

In a daemon or website, you usually do not want the default output. After your daemon is fully initialized, you close the PERL dispatcher:

  dispatcher close => 'default';   # close by dispatcher name

It is useful to have errors to go to the screen and to syslog while the deamon gets initialized. You see that this framework easily switches between stand-alone daemon needs and command-line applications.

catching exceptions

Exception frameworks have ways to catch exceptions thrown by a called block of code. As we have eval() to catch die() in standard Perl, the exception frameworks use try() which is internally using eval.

There are many modules on CPAN to implement try(), like Try::Tiny. I will not explain those here. Log::Report has (of course) its own version of try(), which follows eval() closely:

  try { error __x"Help!" };
  if($@) { ... }  # there is an error

(Do not forget the semi-colon at the end of the try-block!) Tricky. __x() produces a message object, which is wrapped-up in an exception object via error(). The exception gets thrown and error() dies. The thrown exception is caught by try().

Actually, the try is implemented as a normal dispatcher: all existing dispatchers are set aside during the block, try() takes all incoming exceptions.

To puzzle you a bit more: the $@ is set to be the try dispatcher object (see Log::Report::Dispatcher::Try) which returns false in boolean context! Example of some things you can do:

  try { error };
  if(my $e = $@->wasFatal)
  {   # $e is a Log::Report::Exception object
      # recast exception, now not fatal
      $e->throw(reason => 'NOTICE');

      # select specific dispatcher
      $e->throw(to => 'syslog');
  }
  else # no fatal exception, when !$@
  {   # but still may have caught non-fatal exceptions
      $@->reportAll;

      # print warnings, info etc
      print $@->exceptions;
  }

exception classes

Try may catch different errors, which may need to be treated differently. There are various ways to separate them out. For instance,

  if($e->reason ne 'DEBUG') ...
  if($e->message->msgid =~ /open/) ...

Other exception modules (also in many languages) use the class hierarchy for exceptions. In those frameworks, you can create exception objects like 'Exception::Grammar::NoMatch'. In this case, your caught exception can be filtered with

  # not Log::Report
  if($e->isa('Exception::Grammar')) ...

The disadvantage of this strategy is that many classes need to be created, usually with long names. In Log::Report:

  try { error __x"Auch!", _class => 'test,pain' };
  if(my $e = $@->wasFatal)
  {   if($e->inClass('pain')) ...
      ...

Loading Log::Report

Each module needs to require Log::Report. When you use translations, you will have to specify a textdomain, which is the name of the translation tables. If you do not have translation tables, then you still can use the __x() and friends. You can add translations later.

   use Log::Report 'my-project';

On a per dispatcher basis, but also globally, you can add a run /mode/. When changed from NORMAL to DEBUG, you will get much more information for each of the errors. For instance, you will see caller stacks, like carp and confess. When your application is ready, you switch the mode back to NORMAL.

   use Log::Report 'my-project', mode => 'DEBUG';

or

  dispatcher mode => VERBOSE => 'ALL';

take what you need

You may not need all tree components of the Log::Report concept.

   # only interpolation (without translation tables found)
   print __x"opening files: {files}\n", files => \@files;
  
   # translation and interpolation (tables exist)
   print __x"\t\tPlease pay: {price}\n", price => 3.14;

   # exception without translation
   error "cannot read file $file";

   # logging without exceptions or translations
   my $syslog = dispatcher SYSLOG => 'syslog', ...;
   $syslog->log({}, ERROR => 'Hello, World!');

   # logging without exceptions
   $syslog->log({locale => 'de'}, ERROR => __x"Hello, World!");

Tips and tricks

Just some assorted ideas to complete the picture.

concatenate translatables

The message object is smart in delaying its serialization. Even in the following example, the $msg is still an object without being translated.

  my $msg = __x("Hello") . ", " . __x("World!");

In the translation table, "Hello" and "World!" are listed separately to be translated.

whitespace handling

The '__x' is also smart with respect to whitespace, shown is twice the same:

  print __x"\t\tPlease Login:\n";
  print "\t\t" . __x("Please Login:") . "\n";

In this case, print() will serialize the message object. In either case, the translation table will show "Please Login:".

translate formatting

As extension to the Locale::TextDomain features, you may include some formatting in the variables:

  print __x"to pay: {price%.2f} USD", price => $p;

The translation table may list for some language a rounding to whole currencies.

  "te betalen: {price%d} EUR"

showing arrays

Also invented, you can interpolate arrays. The __xn() can be used if the translation depends on one of the values:

  print __xn"opening file {file}"
          , "opening {nr} files: {files}"
          , scalar @files
          , file  => $files[0]
          , files => \@files, nr => scalar @files;

The third parameter decides whether the first or second translated string will be used. Here, files is an ARRAY which will be joined with $" between the elements (you may provide a _join parameter)

The same, but simpler:

  print __xn"opening file {files}"
          , "opening {_count} files: {files}"
          , @files             # <-- scalar context, becomes _count
          , files => \@files;  # join of 1 == 1

Template::Toolkit

See Log::Report::Extract::Template section DETAILS on how to use Log::Report to translate for TT2.

   [% loc("hi {n}", n => name) %]
   [% loc("msgid|plural", count, key => value, ...) %]
   [% INCLUDE
        title = loc("search results")
    %]

You need to add four lines of code to your program to get this to work. The 'loc' name is configurable.

Oops

This breaks (single quotes)

   print __x'hello';   # crash

Do you know why? The single quote is the old name-space separator, later renamed to '::'

Encapsulating die/warn/carp

When you try() code which does not use Log::Report's exceptions, then the messages will automatically be converted into exception objects. So in:

   try { confess "help!" }
   if(my $e = $@->wasFatal) { ... }

The exception object will contain "help!" as message (not object), but also has decoded the stack-trace and $!, cleanly into a Log::Report::Exception object.

Conclusion

Log::Report has much more to offer than discussed here, for instance to help you build translation tables and managing lexicons. More than promoting the use of the module, I hope the case is made for the integration of translations, logging, and exceptions into one framework.

This module is used in a few large projects, for instance the XML::Compile suite. Good programs contain an huge number texts which need to reach users, so a syntactically light abstract framework is very welcome.