As promised, I’ll discuss the causes of this problem.
So, what made the combination of my growing error log and low disk space so insidious? At least one of my applications was actively spamming
~/.xsession-errors while I frantically tried to free up some disk space. The other issue was that each time any process ran into an error associated with the low disk space, it would send a report to ~/.xsession-errors, making the file grow even faster.
The root cause of the problem: UI applications spamming the log via
STDERR. Yes, when UI applications write to
STDERR, the data are appended to
~/.xsession-errors. All applications should handle their errors! Now I am certainly not implying that my programming projects in school had no unhandled exceptions/errors, but these aren’t class assignments; these are applications like Firefox, Evince, and Thunar. Another problem is that X does not seem to filter duplicate messages, as evidenced by this recent snippet from my ~/.xsession-errors:
Gkr-Message: secret service operation failed: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Gkr-Message: secret service operation failed: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Gkr-Message: secret service operation failed: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
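To see why the log collects all of this, it helps to look at where the file comes from. The session startup script redirects its own STDOUT/STDERR into the log, and every app it launches inherits that redirection. Here is a minimal sketch of the mechanism — the exact script varies by distro, and I'm using a temp file in place of ~/.xsession-errors and a shell function standing in for the Xsession script:

```shell
#!/bin/sh
# Sketch of the session plumbing: one shared, append-mode log file.
# A temp file stands in for $HOME/.xsession-errors.
ERRFILE="$(mktemp)"

# The startup script effectively does `exec >>"$ERRFILE" 2>&1`, so every
# app launched from the session inherits descriptors aimed at the log.
run_in_session() {
    "$@" >>"$ERRFILE" 2>&1
}

# Two "UI apps" emitting the same complaint on STDERR:
run_in_session sh -c 'echo "Gkr-Message: demo failure" >&2'
run_in_session sh -c 'echo "Gkr-Message: demo failure" >&2'

grep -c 'Gkr-Message' "$ERRFILE"   # prints 2: nothing dedupes the log
```

Both copies land in the file verbatim — there is no layer in between that could rate-limit or deduplicate them.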
So what is the state of this bug today? Well, it was a bug known to Red Hat at least as early as 2009, and it seems that they decided to mark it as WONT FIX upon the Fedora 16 EOL. The funny thing is that size limits were enforced on these log files about a decade ago! The reasoning back then for removing the limits was pretty solid: the process writing to
STDOUT/STDERR would receive a signal (SIGXFSZ) indicating that the file size had been exceeded, but most processes didn’t trap that signal, so they would die with no way to recover state. A few options were discussed, including the creation of an app to monitor the data being sent to
~/.xsession-errors, and the use of a pseudoterminal to redirect
~/.xsession-errors. In the end, they decided to pipe the data straight to ~/.xsession-errors without monitoring it, eliminating the size limit so that no SIGXFSZ would ever be sent. Their rationale was that UI apps don’t usually write much to STDERR, so ~/.xsession-errors is orders of magnitude smaller than other files in
/home. Clearly this is no longer the case.
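That old failure mode is easy to reproduce today with a shell resource limit. A quick sketch — the limit value and temp file are arbitrary, and the point is only that a writer blowing past the cap dies on the spot because almost nothing traps SIGXFSZ:

```shell
#!/bin/sh
# Reproduce the pre-removal behavior: cap the size of files a process
# may write, then let a writer exceed the cap.
OUT="$(mktemp)"
(
    ulimit -f 8          # cap file writes at a few KB (units vary by shell)
    yes "spamming the log" > "$OUT"   # killed by SIGXFSZ at the cap
)
STATUS=$?
echo "writer exit status: $STATUS"    # >128 means death by signal
wc -c < "$OUT"                        # the file stops right at the limit
```

The writer is gone, with no chance to flush buffers or save state — exactly the behavior that motivated removing the limit.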
I think the reasoning behind not using some app to monitor the data going to
~/.xsession-errors is still valid. I also think it very imprudent to just point ~/.xsession-errors at /dev/null, as many users have. Straight nuking ~/.xsession-errors with a cron job, or clobbering it by changing the write operator from >> to >, is almost as bad. My solution of using
logrotate to manage
~/.xsession-errors is good – if I don’t say so myself, but I do – but it has an obvious flaw:
logrotate runs at a scheduled interval. In my case, it rotates ~/.xsession-errors daily. What if some application fills
~/.xsession-errors at a rate of 1 MBps, I have only 1 GB of free space, and
logrotate isn’t scheduled to run for another hour? In less than 17 minutes (1 GB ÷ 1 MBps ≈ 1000 s) I’ll have no more free space. An obvious solution is to just increase the frequency with which
logrotate runs. For my example, though, moving the job to
/etc/cron.hourly would not work. On my system, no
/etc/cron.minutely exists, and while my system would not mind checking a few log files every minute, systems with many log files could slow down significantly. Also, making
logrotate run every minute would increase the likelihood of interfering with real-time applications by a factor of 1440 compared with daily execution!
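For reference, the kind of logrotate rule I'm relying on looks something like this — a sketch from memory, where the user name, paths, and rotation count are illustrative rather than my exact configuration:

```text
# hypothetical user-level config, run by a daily cron job:
#   logrotate --state ~/.logrotate.state ~/.logrotate.conf
/home/myuser/.xsession-errors {
    daily
    rotate 3
    compress
    missingok
    notifempty
    copytruncate    # truncate in place so X clients keep a valid fd
}
```

The copytruncate directive matters here: the session and its apps hold the file open, so rotating by rename alone would leave them writing to the rotated copy.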
Tomorrow I’ll try to formulate a solution which checks the size of
~/.xsession-errors each time it’s written to.