A GCJ-based servlet engine for Apache Mod_GCJ

Threading

A big issue to address when embedding libgcj code into Apache is how to reconcile the thread-based model of Java with the multi-processing module (MPM) architecture of Apache. The default MPM for Apache 2.0 on Unix platforms is the traditional prefork model. Even the threaded worker MPM is actually a hybrid multi-process, multi-threaded approach, so purely threaded execution is not available with any of the major Apache MPMs on Unix.

The current mod_gcj code just ignores these threading issues, with the following implications:

In prefork mode, the Java VM is created when the module is initialized. When a process is forked, the Java VM is forked with it, resulting in one VM per process. Since the libgcj VM is comparatively slim, and Linux and most modern Unixes copy process memory lazily (copy-on-write), this works surprisingly well. It may be an option where performance and memory usage are not the paramount concern and having multiple VMs is not a problem. However, it's not the way to go if we want a solution that "feels right" for Java.
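
As a rough illustration of the "one VM per process" effect, here is a small C++ sketch using plain POSIX fork(). The names FakeVm, g_vm and run_forked_workers are hypothetical and invented for this example; they are not part of mod_gcj, libgcj or Apache, and a simple counter stands in for real VM state.

```cpp
#include <cassert>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for per-VM state (heap, loaded classes, ...).
struct FakeVm {
    int requests_served = 0;
};

// Created once in the parent before any fork, like a VM set up at module init.
static FakeVm g_vm;

// Forks up to `workers` children. Each child serves `requests` fake requests
// on its own copy-on-write copy of g_vm and reports its private counter
// through its exit status (so `requests` must stay below 256).
// Returns the counters reported by the children that were started.
std::vector<int> run_forked_workers(int workers, int requests) {
    assert(requests >= 0 && requests < 256);
    std::vector<pid_t> pids;
    for (int i = 0; i < workers; ++i) {
        pid_t pid = fork();
        if (pid < 0) break;  // fork failed: stop starting new workers
        if (pid == 0) {
            // Child: this modifies the child's copy only.
            for (int r = 0; r < requests; ++r) ++g_vm.requests_served;
            _exit(g_vm.requests_served);
        }
        pids.push_back(pid);
    }
    std::vector<int> reported;
    for (pid_t pid : pids) {
        int status = 0;
        if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            reported.push_back(WEXITSTATUS(status));
    }
    return reported;
}
```

Calling run_forked_workers(3, 5) from a main function should yield three children that each report their own count, while the parent's g_vm.requests_served stays untouched. Nothing a request does in one process is visible in another, which is why this model does not "feel right" for Java code that expects shared state.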

In worker/threaded mode, we run into problems because libgcj requires the pthread functions to be overridden so that its garbage collector can do its work. Thus, the current code does not work at all with the Apache worker MPM unless one changes the APR threading functions to use the threading functions provided by libgcj (see this posting for possible solutions). Although getting a fix for this into Apache might be possible, I don't feel this would be the best solution either.

Dynamic Loading

Another big problem one faces when linking Apache with libgcj code is the way classes are loaded from dynamically linked libraries. libgcj does not like to see classes it already knows about in libraries. In fact it hates it so much that it throws an error that will cause the loading program to exit. Unfortunately, Apache has the habit of loading modules multiple times during startup and when reloading the configuration.
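
To make the failure mode concrete, here is a minimal C++ sketch of a registry that refuses to see the same class twice. It is only a model of the behaviour described above: ClassRegistry, load_module, survives_loads and the class name org.example.HelloServlet are hypothetical, this is not libgcj's actual data structure or API, and a C++ exception stands in for the fatal error.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical registry; NOT libgcj's real implementation.
class ClassRegistry {
public:
    // Registers a class name; throws if the name is already known.
    void register_class(const std::string& name) {
        if (!known_.insert(name).second)
            throw std::runtime_error("duplicate class: " + name);
    }
    std::size_t size() const { return known_.size(); }

private:
    std::set<std::string> known_;
};

// Hypothetical module initializer: runs every time the library is loaded
// and registers the classes compiled into it.
void load_module(ClassRegistry& registry) {
    registry.register_class("org.example.HelloServlet");
}

// Returns true if loading the same module `times` times in a row succeeds.
bool survives_loads(int times) {
    ClassRegistry registry;
    try {
        for (int i = 0; i < times; ++i) load_module(registry);
    } catch (const std::runtime_error&) {
        return false;  // the second registration of the same class fails
    }
    return true;
}
```

In this model a single load succeeds, while loading the module a second time into the same registry fails. That mirrors what happens when Apache loads a module more than once during startup or a configuration reload.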

To overcome this, I first tried to split the code into two separate libraries: one containing the C++ code, placed directly in the Apache modules directory, and one containing the Java code, placed in /usr/lib. Unfortunately, this hack doesn't work 100% reliably either. The next thing I tried was to link the module statically into Apache. With that, the loading troubles are gone, but requiring static linking, and thus recompiling Apache for mod_gcj, is not a viable option for most real-world applications.

Solution

Of course, problems are there to be solved. I'm currently working on a design that will overcome these problems while sticking to the original goal of making embedding GCJ/Java into Apache easy, efficient and trouble-free.