.xsession-errors Consumed my Hard Drive! (cont.)

As promised, I’ll discuss the causes of this problem.

So, what made the combination of my growing error log and low disk space so insidious? At least one of my applications was actively spamming ~/.xsession-errors while I frantically tried to free up some disk space. The other issue was that each time any process ran into an error associated with low disk space, it would send a report to ~/.xsession-errors!

The root cause of the problem: UI applications spamming the log via STDOUT or STDERR. Yes, when UI applications write to STDIN or STDERR, the data are appended to ~/.xsession-errors. All applications should handle their errors! Now I am certainly not implying that my programming projects in school had no unhandled exceptions/errors, but these aren’t class assignments, these are applications like Firefox, Evince, and Thunar. Another problem is that X seems to not filter duplicate messages as evidenced by this recent snippet from my ~/.xsession-errors:

Gkr-Message: secret service operation failed: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Gkr-Message: secret service operation failed: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.
Gkr-Message: secret service operation failed: Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.

So what is the state of this bug today? Well, it was a bug known to Red Hat at least as early as 2009, and it seems that they decided to mark it as WONT FIX upon Fedora 16 EOL. The funny thing is that size limits were enforced on log files about a decade ago! The reasoning back then was pretty solid for removing the limits. The process writing to STDOUT/STDERR would receive a signal indicating that the file size has been exceeded, but most processes didn’t trap that signal so they would die with no way to recover state. A few options were discussed, including the creation of an app to monitor the data being sent to ~/.xsession-errors, and the use of a pseudoterminal to redirect STDOUT/STDERR to ~/.xsession-errors. In the end, they decided to pipe STDOUT/STDERR to ~/.xsession-errors without monitoring the data while eliminating the size limit to avoid the SIGXFSZ being sent. Their rationale was that UI apps don’t usually write to STDOUT/STDERR, and ~/.xsession-errors is orders of magnitude smaller than other files in /home. Clearly this is no longer the case.

I think the reasoning behind not using some app to monitor the data going to ~/.xsession-errors is still valid. I also think it very imprudent to just point ~/.xsession-errors to /dev/null as many users have. Straight nuking ~/.xsession-errors with a cron or clobbering it by changing the write operator from >> to > are almost as bad. My solution of using logrotate to manage ~/.xsession-errors is good – if I don’t say so myself, but I do – but it has an obvious flaw: logrotate runs at a scheduled interval.

Consider logrotate checks ~/.xsession-errors daily. What if some application fills ~/.xsession-errors at a rate of 1 MBps, I have only 1 GB of free space, and logrotate isn’t scheduled to run for another hour? In less than 17 minutes I’ll have no more free space. An obvious solution is to just increase the frequency which logrotate runs. For my example, moving logrotate from /etc/cron.daily to /etc/cron.hourly would not work. On my system, no /etc/cron.minutely exists, and while my system would not mind checking a few log files every minute, systems with many log files could slow down significantly. Also, making logrotate run every minute would increase the likelihood of interfering with real-time applications by a factor of 1440 compared with daily execution!

Tomorrow I’ll try to formulate a solution which checks the size of ~/.xsession-errors each time it’s written to.

Red Hat bug report
Unix signals
logrotate stuff

.xsession-errors Consumed my Hard Drive!

If you’re a Windows or Mac user please ignore. Maybe Mac users should at least read this since they use XServer, but I have it on good authority that “Macs just work” so chances are your error logs won’t chew up all available space on your hard drive.

I recently installed Android Studio, QTCreator, Sublime Text 2, and probably some other developer tools so the increased disk usage didn’t raise any red flags. In hindsight, I should have noticed that there was no way I just installed 7 GB of new software. The proverbial poop hit the fan yesterday. I attempted to use my terminal’s auto-complete feature and received an error message informing me that a new file could not be created due to low disk space. I started deleting things in order of lowest priority + easiest to find. Unfortunately, my disk kept filling up! The ~/.xsession-error file was 7.01 GB!

I’ll skip to my solution and follow it with a detailed walk through where I address some specific errors that showed up in the log file upon startup.
note: <user> should be replaced with your username on your host.

Quick Solution

$ rm ~/.xsession-errors
# nano /etc/logrotate.d/xsession

Insert:

/home/<user>/.xsession-errors
{
rotate 2
size 1M
missingok
not if empty
compress
}

Full Solution
I first checked .xsession-error to make sure there wasn’t anything in there that might help me sort out the problem. First, I attempted to open it in nano.

$ nano ~/.xsession-error

This was a mistake. I already had no disk space, now my kernel was thrashing trying to swap stuff. Even if my disk had gobs of space, I think it would take a long time to open a 7 GB file. So, I took a quick look at the first few lines of the file.

$ more ~/.xsession-error

The timestamps indicated that the first few messages were months old so I tried to look at the last few lines of the file where the useful stuff usually resides.

$ tail ~/.xsession-error

I gave up after two minutes of no response. I don’t know the implementation details of tail, but I suspect it walks through the file to find the end. In fact, I can’t think of any other way it would do it. Anyways, if you know the line numbers you want to view, you can use sed, or if you know you want the last X bytes you can use dd to copy just the last X bytes of the file. I made the assumption that any errors unrelated to disk space would continue to be generated once I solved my disk space issue, and deleted the error file.

$ rm ~/.xsession-errors

Note that rm does not delete the file, it merely deletes the inode mapping, fortunately a quick reboot solves this problem since Linux will see no mapping exists, and reclaim the space. If you’re working on a machine that shouldn’t be rebooted, you will need to delete the file by inode using find. Here is a nice article with instructions for deleting by inode.

My disk space issues resolved, at least temporarily, I checked to see if the error file had been created on boot and then took a look at its contents.

$ ls -al
$ cat ~/.xsession-errors

After the basic, non-error stuff:

Xsession: X session started for [redacted] at [redacted]
...

I noticed something about gnome-keyring-daemon: insufficient process capabilities, unsecure memory might get used. It turns out this is a result of the user (me) not having the proper privileges to perform a CAP_IPC_LOCK on memory. I used a solution by iMBeCil on crunchbang.org:

$ nano ~/.profile

and then inserted at the end:

eval `/usr/bin/gnome-keyring-daemon --start --components=pkcs11,secrets,ssh,gpg`
export SSH_AUTH_SOCK
export GPG_AGENT_INFO

Any time you change something, check the effects of that change before you move on. I deleted .xsession-errors and rebooted. I opened a terminal and saw .xsession-errors had been created, so I opened it up and saw that the gkd errors went away, and as a bonus the

(clipit:2299): Gdk-CRITICAL **: IA__gdk_window_thaw_toplevel_updates_libgtk_only: assertion `private->update_and_descendants_freeze_count > 0' failed

error did not return either. I now had only non-error type stuff in .xsession-errors on startup, and it was only 445 B to boot! Here’s what it looked like:

Xsession: X session started for [redacted] at [redacted]
localuser:[redacted] being added to access control list
tint2 : nb monitor 1, nb monitor used 1, nb desktop 2
** Message: applet now removed from the notification area
** Message: applet now embedded in the notification area

Now the last thing I wanted to fix was the growth of .xsession-errors. Granted, it would take a lot of reboots to generate another 7 GB error file appending only 445 B each time, but startup is not the only operation that appends stuff to .xsession-errors. I was thinking “Why not use a cron to monitor .xsession-errors?” It turns out Linux already provides something to do this: logrotate. I created a configuration file for .xsession-errors:

# nano /etc/logrotate.d/xsession
/home/<user>/.xsession-errors
{
rotate 2
size 1M
missingok
not if empty
compress
}

This rotates two .xsession-errors files. When the current error file grows to 1 MB, logrotate compresses it, deletes the old compressed error file, and starts a new error file. Notice that I needed to be root to modify stuff in /etc/logrotate.d/.

This has worked well so far, though I’ve hardly tested the robustness of this solution. A discussion on the details of this problem and possible root-causes will follow.