The Log4J AsyncAppender is not always asynchronous.
I discovered some interesting things about the Log4J AsyncAppender. If you've used Log4J, you've probably used it before - you usually configure all of your other appenders (that write to the console, to a log file, etc...) to attach to it, so that logging calls in your code don't have to wait for the appenders to do their stuff.
That's the ideal, but it doesn't always work that way. In short, there are two scenarios where the default configuration of the AsyncAppender will cause all logging to become slow. This is too bad: the default configuration of Log4J should not assume that logging is more important than performance. In every application I've ever written, given the choice between the app becoming extremely slow and logging events getting thrown out, I would choose the latter.
Here are the details:
For starters, the AsyncAppender has a max buffer size, set by default to only 128 events. That means that if your app logs events more quickly than an attached appender can keep up, the buffer will fill up quickly. This could happen easily if, for example, you configure an appender to write to a file on a shared disk, and that shared disk is slow. Or, if you attach an appender that does something more involved, like write messages to a web service.
By default, if this buffer fills up, then any logging call will block until the buffer's size goes below the max number of events. You can increase this number of events in your Log4J configuration. But more importantly, you can configure the AsyncAppender to throw out messages rather than block when its buffer fills up (by setting "blocking=false"). Why this isn't the default, I'm not sure. Wouldn't you rather have your application discard log messages rather than crawl to a halt when it can't handle the volume? In how many applications is logging more important than actually responding to the user?
You would think that setting blocking to false would prevent logging from bringing your application to its knees. But there's something else to look out for. If the thread spawned by the AsyncAppender dies, the AsyncAppender suddenly decides to become syncrhonous. Better might be trying to spawn a new thread, or just throwing out messages. What's worse, this behavior is not configurable (without modifying Log4j code). Why might that thread die? I'm not sure, but it was happening to me frequently recently, and I never tracked it down. I suspect one of the appenders was throwing an exception that wasn't caught by the AsyncAppender, but I'm not sure. The point is that the thread can die, and all of a sudden your app's performance will become dependent on whether your appenders can keep up with logging.
The solution? Spawn a thread in your appender, and only write events from that thread? Then you've got the problem that your appender's buffer may grow indefinitely, causing memory issues. Maybe the best solution is to write your own AsyncAppender and stop using the one provided by Log4J.
That's the ideal, but it doesn't always work that way. In short, there are two scenarios where the default configuration of the AsyncAppender will cause all logging to become slow. This is too bad: the default configuration of Log4J should not assume that logging is more important than performance. In every application I've ever written, given the choice between the app becoming extremely slow and logging events getting thrown out, I would choose the latter.
Here are the details:
For starters, the AsyncAppender has a max buffer size, set by default to only 128 events. That means that if your app logs events more quickly than an attached appender can keep up, the buffer will fill up quickly. This could happen easily if, for example, you configure an appender to write to a file on a shared disk, and that shared disk is slow. Or, if you attach an appender that does something more involved, like write messages to a web service.
By default, if this buffer fills up, then any logging call will block until the buffer's size goes below the max number of events. You can increase this number of events in your Log4J configuration. But more importantly, you can configure the AsyncAppender to throw out messages rather than block when its buffer fills up (by setting "blocking=false"). Why this isn't the default, I'm not sure. Wouldn't you rather have your application discard log messages rather than crawl to a halt when it can't handle the volume? In how many applications is logging more important than actually responding to the user?
You would think that setting blocking to false would prevent logging from bringing your application to its knees. But there's something else to look out for. If the thread spawned by the AsyncAppender dies, the AsyncAppender suddenly decides to become syncrhonous. Better might be trying to spawn a new thread, or just throwing out messages. What's worse, this behavior is not configurable (without modifying Log4j code). Why might that thread die? I'm not sure, but it was happening to me frequently recently, and I never tracked it down. I suspect one of the appenders was throwing an exception that wasn't caught by the AsyncAppender, but I'm not sure. The point is that the thread can die, and all of a sudden your app's performance will become dependent on whether your appenders can keep up with logging.
The solution? Spawn a thread in your appender, and only write events from that thread? Then you've got the problem that your appender's buffer may grow indefinitely, causing memory issues. Maybe the best solution is to write your own AsyncAppender and stop using the one provided by Log4J.

11 Comments:
Seems like a good case for using JavaSpaces!! See yesterday's article at TheServerSide Using JavaSpaces by Joseph Ottinger.
Kevin, your blog is the only reference I found on the Web to log4j AsyncAppender thread deaths. Were you able to resolve why this is happening?
I take it back, here's the Bugzilla entry which seems to explain the behavior you're seeing: http://issues.apache.org/bugzilla/show_bug.cgi?id=23021.
So the key is to ensure that downstream appenders catch every RuntimeException. Were you able to figure out where it was happening in your application?
Mark, sorry just noticed your comment. I ended up writing my own async appender. If I have time later on, I'll comment on how it differs from log4j's.
The resolution for the bug entry you reference does not fix the entire problem. It's still broken, IMHO, in 1.2.14.
Hello,
Regarding your interrogation:
"In how many applications is logging more important than actually responding to the user?"
This is true for non critical application (i.e many web site).
I work for Bank. In multiple cases, you need the assertion, in the logfile, that a certain event did actually occurred, in a synchronous way. This assertion is often preferred over the response time of the application.
Thank you for the information on the AsyncAppender
If the problem still exists in your tests, would you briefly raise an appropriate bug against log4j (based on this one)? Then we all stand a chance of seeing a fix.
Thanks.
Thanks for the clear explanation. This is the only place I've seen the issue detailed. Definitely saved me a bunch of time.
Any chance of submitting your appender to the community or posting it here as a starting point for those of us also in the same boat?
-Kam
This comment has been removed by a blog administrator.
This comment has been removed by a blog administrator.
This comment has been removed by a blog administrator.
This comment has been removed by a blog administrator.
<< Home