I have a small virtual server in Linode that I use for various public-facing things such as serving web pages, a small Debian repository, email etc. Unfortunately it has a bad habit of filling up /var/log – I did say it was small.
I noticed today a lot of space being used up in `/var/log/ksymoops`, every few seconds and all of the form:
xen:~# tail -f /var/log/ksymoops/20180502.log
20180502 183837 start /sbin/modprobe -q -- net-pf-0 safemode=1
20180502 183837 probe ended
20180502 183842 start /sbin/modprobe -q -- net-pf-0 safemode=1
20180502 183842 probe ended
20180502 183842 start /sbin/modprobe -q -- net-pf-0 safemode=1
20180502 183842 probe ended
20180502 183842 start /sbin/modprobe -q -- net-pf-0 safemode=1
20180502 183842 probe ended
20180502 183842 start /sbin/modprobe -q -- net-pf-0 safemode=1
20180502 183842 probe ended
20180502 183847 start /sbin/modprobe -q -- net-pf-0 safemode=1
20180502 183847 probe ended
20180502 183847 start /sbin/modprobe -q -- net-pf-0 safemode=1
20180502 183847 probe ended
20180502 183847 start /sbin/modprobe -q -- net-pf-0 safemode=1
20180502 183847 probe ended
20180502 183847 start /sbin/modprobe -q -- net-pf-0 safemode=1
20180502 183847 probe ended
What’s more, these log files are not subject to logrotate by default, so I have them going back years, for as long as the VM has existed. While this is somewhat concerning, it does allow me to work out what happened.
The ksymoops logfiles are relatively sparse (a few lines every month or so) up until the date last year that I dist-upgraded the server from Debian 8 to 9. Then the dam broke and it has been throwing a batch of four errors like the above every five seconds since, day and night. Something somewhere is calling modprobe in a very tight loop, and it appears to be related to a package upgrade.
I managed to track down the offending process by replacing /sbin/modprobe with a script that called `sleep 100000` and running `ps axf`. It turns out that it had been called directly from one of the kworkers. If all four kworkers (2xHT) are doing the same thing, that would explain the pattern of errors.
Remember that this is in Linode – their VMs run under Xen paravirtualisation with non-modular kernels. Which means that modprobe, lsmod etc. have no effect – they crash out with the following error if you try to use them:
xen:~# lsmod
Module Size Used by Not tainted
lsmod: QM_MODULES: Function not implemented
This is only to be expected, as the VM’s modutils won’t talk to the Xen para-v kernel. But why is a para-v kernel worker triggering modprobe, when failure is guaranteed?
Luckily, I don’t have to care.
ksymoops is non-optional, but it can only write its logfiles if `/var/log/ksymoops` exists. The solution is simple.
rm -rf /var/log/ksymoops
Peace and quiet.