Using threads in a server that uses fork() -- they don't mix well.

Today I started switching from using log4cplus with files directly to using the loggingserver. Not only that, I am using the newest version (on the edge!), which is 1.2.0-rc3. Up to here, no major problem.

However, the newer version forces you to use a multi-threaded build of the log4cplus library. This causes a major problem because the server uses fork() to create a child process each time a connection is made. There are several reasons to do so, but a couple of them are certainly the most important:

  1. A child process can allocate memory and other resources without ever leaking them into the server, which is a real risk if we used threads instead. The Unix way is to create a process, allocate what you need, then exit the process. That way you 100% avoid the step of releasing all the allocated resources. The major ones in our case are plugins.
  2. Another important aspect: although a thread could load all the needed resources once and thus respond a lot faster, it would also have one enormous drawback. We would have to make 100% sure that each time we reuse a thread to manage another request, we get brand new objects with the proper paths. Not only that, any shared resource needs to be guarded, and we use tons of resources.

Although both points can be worked around, doing so would add an enormous burden which we think is not required.
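
To make the first point concrete, here is a minimal sketch of the process-per-request idea. It is not the snapserver code: `run_in_child()` and `leaky_request()` are hypothetical names, and the parent waits for the child only to keep the example short (a real server would go back to accepting connections).

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Run `work` in a forked child and return the child's exit code,
// or -1 if fork() or waitpid() failed (hypothetical helper).
int run_in_child(std::function<int()> const & work)
{
    assert(work);   // an empty std::function cannot be called

    pid_t const pid(fork());
    if(pid < 0)
    {
        return -1;
    }
    if(pid == 0)
    {
        // child: do the work, then exit; the kernel reclaims everything
        // the child allocated, whether it was released or not
        _exit(work());
    }

    // parent: nothing the child allocated ever shows up here
    int status(0);
    if(waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
    {
        return -1;
    }
    return WEXITSTATUS(status);
}

// Hypothetical request handler which "forgets" to release 1Mb.
int leaky_request()
{
    std::size_t const size(1024 * 1024);
    auto * leaked(new std::vector<char>(size, 'x'));
    return leaked->size() == size ? 0 : 1;
}
```

Calling `run_in_child(leaky_request)` as many times as you like does not grow the parent: the leak dies with each child.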

All of that being said, we do make use of threads. At this point, I can think of the following places that use them:

  1. Listener thread, used to listen for the UDP messages (STOP, PING, etc.) -- used by the server and backend processes;
  2. Process thread, used when we want to send and receive data from a process (a form of popen() which gives you a "rw" capability);
  3. Snap website initializer, which runs a thread to send requests to the server process;
  4. The log4cplus library.

Numbers (1) and (4) are the ones that we are the most concerned with because the snapserver, which also uses fork(), is in charge of those two or more threads.

Number (3) is also involved with the snapserver, but it won't directly interact with the main thread, so we're fine there.

At this point, we use number (2) in plugins, which is fine: plugins do not fork(), except right before an exec(), and that combination works fine with regard to threads.
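
As a rough illustration of why the fork() plus exec() combination is harmless, the sketch below forks and immediately replaces the child with a new program, so whatever lock or thread state the child inherited is thrown away. The helper name `exec_in_child()` is made up for this example, and it assumes a POSIX shell is available at /bin/sh.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>

// Run a shell command in a child process and return its exit code,
// or -1 on error (hypothetical helper, assumes /bin/sh exists).
int exec_in_child(char const * command)
{
    assert(command != nullptr);

    pid_t const pid(fork());
    if(pid < 0)
    {
        return -1;
    }
    if(pid == 0)
    {
        // child: replace the process image right away; nothing from the
        // parent's threads or locks survives a successful exec
        execl("/bin/sh", "sh", "-c", command, static_cast<char *>(nullptr));
        _exit(127);     // only reached if execl() failed
    }

    int status(0);
    if(waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
    {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

The important detail is that the child does nothing between fork() and execl() other than that one call.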

At this point, number (1) seems to never have been a problem. It is most certainly because once we fork() a child, we never check anything related to that thread. However, if (1) allocates memory just at the time we call fork(), it could very well hold a lock which prevents the child from running properly. It really depends on how smart fork() really is.
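
The lock scenario can be reproduced with a plain mutex. The sketch below is only an illustration, not our listener code: a helper thread holds a mutex while the main thread calls fork(), and the child then checks whether its copy of the mutex is still locked. It uses pthread_mutex_trylock() so the child never hangs. Strictly speaking, only async-signal-safe functions are guaranteed to work in the child of a multi-threaded process, so take this as a demonstration that happens to work on Linux rather than as something to rely on.

```cpp
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <atomic>
#include <cassert>
#include <cerrno>
#include <chrono>
#include <thread>

// Returns true if the child found the mutex locked right after fork(),
// even though the thread holding it does not exist in the child.
bool child_sees_locked_mutex()
{
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    std::atomic<bool> locked(false);
    std::atomic<bool> release(false);

    // helper thread: take the lock and keep it until told otherwise
    std::thread holder([&]()
    {
        pthread_mutex_lock(&mutex);
        locked = true;
        while(!release)
        {
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
        pthread_mutex_unlock(&mutex);
    });

    // make sure the lock is held at the time we call fork()
    while(!locked)
    {
        std::this_thread::yield();
    }

    pid_t const pid(fork());
    if(pid == 0)
    {
        // child: only the forking thread was duplicated, yet the mutex
        // was copied in its locked state and nobody is left to unlock it
        _exit(pthread_mutex_trylock(&mutex) == EBUSY ? 1 : 0);
    }

    bool still_locked(false);
    int status(0);
    if(pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
    {
        still_locked = WEXITSTATUS(status) == 1;
    }

    // parent: let the helper thread release the lock and finish
    release = true;
    holder.join();

    // the parent's copy is free again; the child's copy never would be
    bool const parent_can_lock(pthread_mutex_trylock(&mutex) == 0);
    assert(parent_can_lock);
    if(parent_can_lock)
    {
        pthread_mutex_unlock(&mutex);
    }

    return still_locked;
}
```

Had the child called pthread_mutex_lock() instead of the try version, it would have waited forever.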

However, number (4) blocks child processes if we do not first take down that thread in the server, then call fork(), and finally re-establish the logger. That process is not required when we are not using the loggingserver, although it is dangerous to think that only that appender will make use of a thread. What happens is that threads do not get duplicated when you call fork(). Yet in the child the log4cplus library still thinks that it has a running thread, so it tries to stop it and waits on it forever. The only way to bypass that problem is to turn off the logger before calling fork() and then reinstantiate it once the fork is done, and that in both the child and the server.
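
The stop, fork(), restart sequence can be sketched as follows. This does not use the log4cplus API at all: `background_logger` is a hypothetical stand-in for any logger that owns a helper thread, and `fork_with_logger()` is an invented name. As before, the parent waits for the child only to keep the example self-contained.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-in for a logger which owns a background thread.
class background_logger
{
public:
    ~background_logger()
    {
        stop();
    }

    void start()
    {
        if(f_thread.joinable())
        {
            return;     // already running
        }
        f_running = true;
        f_thread = std::thread([this]()
        {
            // a real logger would send messages to the logging server here
            while(f_running)
            {
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            }
        });
    }

    void stop()
    {
        f_running = false;
        if(f_thread.joinable())
        {
            f_thread.join();
        }
    }

    bool is_running() const
    {
        return f_thread.joinable();
    }

private:
    std::atomic<bool>   f_running{false};
    std::thread         f_thread;
};

// Stop the logger, fork(), then restart the logger on both sides.
// Returns the child's exit code, or -1 on error.
int fork_with_logger(background_logger & logger, std::function<int()> const & child_work)
{
    logger.stop();
    assert(!logger.is_running());   // no helper thread may exist at fork() time

    pid_t const pid(fork());

    // runs in the server and, when fork() succeeded, in the child too
    logger.start();

    if(pid < 0)
    {
        return -1;
    }
    if(pid == 0)
    {
        int const r(child_work());
        logger.stop();              // this join works: the thread is the child's own
        _exit(r);
    }

    int status(0);
    if(waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
    {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

Without the stop() before fork(), the child would inherit an object claiming to own a thread that does not exist on its side, which is the situation described above. Note that this only covers threads we know about; any other thread still running at fork() time has the same problem.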

Overall, however, it looks like having even one thread in a server that forks is totally unsafe. So we will probably look into removing thread (1) from the main server.

Source: Threads and fork(): think twice before mixing them.
