Snap Logger

The Snap! Logger Project

Snap Logger Logo Originally, this project was a break up of the log.cpp and log.h from the libsnapwebsites to its own project. However, that implementation had a dependency on log4cplus which has all sorts of drawbacks for our Snap! environment. Don't get me wrong, up to here, it served us well and log4cplus version 1.2.x works very well.

That being said, our way of handling configuration files, the fact that we like to use fork() and a few other things, made us think that having our own implementation of a logger would improve our general environment quite a bit. This newer version of the logger is the result.

WARNING: snaplog vs snaplogger

The snaplog was an attempt to send logs to one machine which can then congregates all the logs in one place using MySQL.

This page is about the snaplogger which is a lower level system which actually generates the logs and send them to appenders.

Download

You can download the source code from Github. However, it makes use of our snapcmakemodules, advgetopt, libutf8, snapdev, libexcept, cppthread, and snapcatch2 projects. All of which can be downloaded from Github.com if you checkout the snapcpp project with:

git clone --recursive git@github.com:m2osw/snapcpp.git snapcpp

Then run:

snapcpp/bin/build-snap --processor=8

(Set the number of processors to something that makes sense on your machine.)

One other benefit you'll get: you'll have access to many more libraries and tools in one go, even if you do not use the snapwebsites/* environment.

Note that we also offer binaries on launchpad.

So far our code compiles on Ubuntu 16.04 and 19.04, although we do not constantly verify 19.04 at the moment (we're slowly working on a way to build binaries on launchpad for all versions.)

As we move forward, though, we will quickly drop older versions because at this point we do not have a huge following and it takes too much time to maintain code to work with older compilers. We want to be able to use C++17 and possibly even C++20 in most of our code.

Reference

The documentation is available after a successful compilation. It gets extracted using Doxygen.

We at times load a copy in our References page here.

Features

Appenders

Just like log4cplus, we have appenders. These classes offer ways to write the logs to various destination. The base library supports the following:

Null Appender — the base appender is a null appender, if you directly use that one, nothing happens to the logs (it just gets dropped.)
File Appender — send the logs to a file; multiple such appenders can be added so you can send the same logs to any number of files
Syslog Appender — send the logs to the syslog facility; only one can be created (so is the sylog interface)
Console Appender — print to the console (stdout)
Buffer Appender — collect the logs in a buffer, quite useful to run our unit tests

The snaplog project will add a Network Appender which will allow for the logs to be sent over to a log service which in turn will log the data. Service which can be on the same computer or any other computer. One advantage of using a separate service is that the service can run as root and thus create logs for any user (i.e. www-data, snapwebsites, etc.)

Components

In the old days, you used to have two types of logs: Normal and Secure. Here we offer a feature called Components. This allows you to assign a log to groups (a.k.a. components). The logger then can verify which of the appender accepts which component and send the log to the appropriate destination.

We offer the NORMAL and SECURE components by default. You can add others at your discretion. We may add one or two more by the time we're considering the project as complete.

Severity Levels

The library offers 18 different levels by default. In most cases, we only use TRACE, DEBUG, INFORMATION, WARNING, ERROR, and FATAL. The severity mechanism offers up to 253 severity levels. 0 is reserved for ALL and 255 for OFF. The system levels cannot be changed. They all are separated by about 10 levels so you can easily add your own severity levels to the library. Dynamically adding severities can be done in code by creating severity objects or by editing the severify.conf file.

The severity is useful for two reasons: (1) you can let the user know how bad that issue was and (2) it is used by appenders to know whether a message makes it to the output or not. For example, if you have an ALL appender where logs from all your applications is sent, you may want to limit those messages to at least a WARNING level. So that appender is assigned a severity of WARNING. You can have another appender for that specific tool which accepts a lower level, possibly even uses ALL while your are working on your development.

In most cases, the existing levels are much more than enough.

Format & Variables

The format is build with characters and variables. Variables accept parameters.

The format references a variable using the dollar ($) character and the curly brackets ({}). Inside the curly bracket, you first enter the variable name, then you can follow that with paramaeters separated by colons. For example, the default format looks like this:

"${date} ${time} ${hostname} ${progname}[${pid}]: ${severity}:"
" ${message:escape} (in function \"${function}()\") (${basename}:${line})"

The ${date}, ${time}, and ${locale} are used to show the timestamp of the log.

The ${hostname}, ${hostbyname}, and ${domainname} are used to show the computer host and domain names or IP address.

Etc.

The ${message:escape} is an example where we use a parameter. In this case, the parameter is a function we apply to the message data. The "escape" function makes sure that any special character gets transformed to something that is much less likely to disturbe your logs. For example, it can convert the newline character to "\n" instead.

The ${message} is also recursive (only one which is at this point). If it itself include variables, they get converted. This means you messages should otherwise escape the dollar sign for safety. You can also prevent recursivity by adding the no_recursivity() flag to the message.

The ${progname} is a special diagnostic information. Contrary to the other parameters, this one is not directly available in the snaplogger. You are instead expected to enter that parameter using the map_diagnostic facility. Here is how this is done:

snaplogger::set_diagnostic(snaplogger::DIAG_KEY_PROGNAME, argv[0]);

Any number of such parameters can be added using the map diagnostic facility. For example, our libcppthread project adds the name of each thread it creates so that way we can easily see the name of the thread that generated the log.

Variable Extensibility

The system defines many variables already. However, your application may have special features that would allow for additional variables to be made available to your logs. If that's the case, then you can use DEFINE_VARIABLE(<name>) and implement the process_variable() virtual function. Voilà! You have added a variable.

We already support a pretty large number of system variables:

${date[:<part>]} — output date from message timestamp
${time[:<part>]} — output time from message timestamp
${locale[:<part>]} — output date and/or time from message timestamp using your locale
${env:name=<name>} — output the named environment variable
${severity[:format=alpha|numeric]} — output the message severity
${message} — output the message, note that we recursively process variables from this message (one level only)
${progname} — same as ${diagnostic:map=progname}
${version} — same as ${diagnostic:map=version}
${filename} — the name of the file where the log happened, including path
${basename} — the name of the file where the log happened, no path
${path} — the path where the file which generated the log is
${function} — the name of the function that generated the log
${line} — the line at which the log was generated
${diagnostic[:map=[<key>]|:nested[=<depth>]|:trace[=<depth>]] — print out a map, trace, or nested diagnostic (or a combo of those three); if map is specified without a key, all the keys are shown, otherwise that specific key is shown; when trace is used without a <depth>, all the trace message available are shown, otherwise at most <deph> messages are shown; when nested is used without a depth, all the nested diagnostics are shown, otherwise at most <depth> entries are shown
${hostname} — the name of the host that generated the log
${hostbyname=<name>} — the name of a host as defined by <name>
${domainname} — the name of the domain where the log was generated
${pid} — the process that generated the log
${tid} — the thread that generated the log
${threadname} — same as ${diagnostic:map=threadname#<tid>}
${uid} — the user identifier (numeric)
${username} — the user name, if available
${gid} — the group identifier (numeric)
${groupname} — the group name, if available
${direct:msg=<text>} — used internally to add text between variables, although you can always use it in your own formats

Note that with the functions, you get a really powerful and extensible format mechanism.

Functions Extensibility

Variables are followed by parameters. For example, you could use the running parameter along the pid: ${pid:running}, this will output the PID of the currently running process (opposed to the PID of the process that generated the log when just using ${pid}.)

Yet, you can use some parameters that are called functions. These parameters are used to format the output of the variable. For example, you can have a maximum width for your messages as in:

${message:escape:max_width=1000}

That way, messages of more than 1,000 characters get truncated. Such functions are defined using a C++ object with a virtual function, just like the variable extensions. We use the DECLARE_FUNCTION() in this case (see snaplogger/system_functions.cpp).

You can also define function parameters. For example, the padding parameter expects a character that is going to be used to pad fields that are too short. This padding parameter looks just like a function, but on its own it doesn't do anything. Instead, that character is saved in a map which later functions such as min_width reuse to pad the string as required.

Existing System Functions:

max_width=<size> — adjust the string so it is at most <size> characters.
min_width=<size> — adjust the string so it is at least <size> characters.
exact_width=<size> — adjust the string so it is exactly <size> characters.
append=<string> — append <string> to the value.
prepend=<string> — prepend <string> to the value.
escape[=<chars>] — list of characters to escape by adding a backslash in front of them.
caps — transform the first letter of each word to uppercase and the rest to lowercase
lower — force all the characters to lowercase
upper — force all the characters to uppercase

Existing Function Parameters:

padding=<char> — one character used to pad strings.
align=left|center|right — define the alignment when using the padding.

The padding and align values are used whenever the size is worked on with max_width, min_width, and exact_width. This is where the append and prepend can be useful. Here is an example with the PID:

${pid:prepend="+":padding=" ":align=right:exact_with=6}

This will transform PIDs with a + sign at the start and add padding spaces to the left of that. Note that the default padding character is the space so in this case we would not have had to change it.

The alignment can be left, right, or center. Be careful when using the exact_width or max_width since longer strings get silently truncated.

Note: the strings are viewed as Unicode. The input is expected to be valid UTF-8. When we talk about characters, these are Unicode characters (the library properly handles multibyte characters.)

Nested Diagnostics

In your blocks, you can add a nested_diagnotic object. This allows you to add keywords to your messages and have a better idea where you are in your code. Think of it as the function name with a stack trace capability. This is a very similar feature for logs.

Map Diagnostics

The Map Diagnostics are keyed messages which you can add in your logs. By default the system expects a few such entries to be created, such as the progname, threadname, and version.

It is possible to print out the entire map although in most cases you only want to show one entry as in: ${diagnostic:map=prognam} (${progname} is an equivalent).

Trace Diagnostics

We have a trace like feature to allow you to know where you're at by keeping a few of the last trace diagnostic calls. This is somewhat similar to the nested diagnostics, except it doesn't need to be stack based. It's just a continuation and it gives you information about the last N trace calls at the time a log happens.

Multithread Support

The library is multi-thread aware. As long as your do not attempt to create a thread until your main() function gets called, it will work as expected in a thread environment.

The library also makes use of our C++ Thread library to allow us to process logging messages asynchronously. Contrary to log4cplus, we can stop that thread at will so that way we can run fork() and not run into multithread issues.

Here are the very simple steps to use to make sure the thread does not break your environment when you fork:

logger::instance()->set_asynchronous(false);
fork();
logger::instance()->set_asynchronous(true);

Note: The variables are not currently recursive (this may be something we may want to implement later, but it did not seem useful at the moment.)

General Discussion About What to Log

Note that our snaplogger offers at least as many features as we've been using from log4cplus. However, in many cases, you do not really need very many features at all.

I think that the two main features you want to use are the log level and the formatting of your messages. Most everything else is just fluff.

The settings, the fact that you can send the logs to one or more files, a server, etc. is also useful for the administrator to organize where the logs go. However, in itself the important part about logs is what happens in the code.

First of all, many times, you do not want to leave a large amount of logs in your programs. This is because adding too many logs is rarely useful and it can also slow down your application. Whether the log is going to be saved to file or not, the message is going to at least be preprocessed. One, two, three, you'll be fine. Hundreds, thousands... now you're going to notice a slow down.

Further, what are logs? We offer many levels right out of the box: TRACE, DEBUG, INFO, NOTICE, WARNING, ERROR, FATAL... The truth is that in most cases you do not need that many.

The TRACE and DEBUG should probably be deleted once you are done debugging. I tend to leave debugs only in parts that are complex and still have some problems. But not so much anywhere else.

The INFO severity level is the one that makes sense.

The NOTICE and WARNING are probably not so useful. Was your log message an informative message or was it an error? Why not say exactly what it was instead of using a warning?

The ERROR sounds like a sensible choice. However, quite often, we make the mistake to log an error at that severity level because we just got an error. Ah but look at that, you have a plan B which bypasses the error case just fine. So that error should have been logged as an informational message instead. Here is an example of such a mistake:

f = fopen("/dev/hwrng", "r");
if(f == NULL)
{
    SNAP_LOG_ERROR << "could not open file..." << SNAP_LOG_SEND;
    f = fopen("/dev/random", "r");
    if(f == NULL)
    {
        SNAP_LOG_ERROR << "still could not open file..." << SNAP_LOG_SEND;
        return;
    }
}

In that example, the first message is not an error. It's informative. If both fopen() fail, though, now we have an error case.

The point here is that in many cases you really only need one main level: INFO. As the developper, you may still like to also have the DEBUG and ERROR levels. Using more than those three error levels is most certainly likely to give you headaches.

Changes

1.0.9 — added the ${diagnostic:trace=<depth>} variable
1.0.8 — added the trace diagnostic & tests pass on 18.04
1.0.7 — support the cppthread logger & new advgetopt exceptions
1.0.6 — fixed th caps function & added tests for better coverage
1.0.5 — changed "shredlog -R ..." to "shredlog -r ..."
- added a page about shredlog
1.0.4 — added a shredlog tool, fixed several bugs (installation, code)
1.0.3 — fixed the message destructor with an explicit call; many other fixes
- added many tests, increasing the usage closer to 100% (now at about 50%)
1.0.2 — severity handling improvements, added a fatal error
1.0.1 — removed log4cplus dependency, it's not used anymore!
1.0.0 — first version if our new logger.

Coverage Test Results

Access full page here.

The library has a test suite that covers 100% of the code, making it a little more certain that it does not include too many bugs. We try to run the tests each time we create a new version to ensure that it works as expected.

Snap! A C++ Open Source CMS