In an attempt to get myself cheap remote backups over the internet, I bought a Raspberry Pi kit and set it up as a hackintosh Time Capsule by attaching my USB backup disk to the Pi. I however wanted to keep my existing backup history, so instead of using a fresh Linux-formatted partition (like a clever boy) I tried to get the Pi to use my existing HFS+ filesystem. Anyone interested in trying this should probably read about Linux’s flaky HFS+ user mapping and lack of journaling support first, and then back away very slowly. I blame this for all my subsequent problems.
After some effort I did get my aging Macbook to write a new backup on the Pi, but I couldn’t get it to see the existing backups on the drive. Apple uses hard links for deduplication of backups, and because remote filesystems can’t be guaranteed to support them it uses a trick. Remote backups are written not directly on the remote drive, but into a sparse disk image inside it. Thinking that it would be a relatively simple matter to move the old backups from the outer filesystem into the sparsebundle, I remounted the USB drive on the Mac (as Linux doesn’t understand sparsebundles, fair enough).
The Macbook first denied me the move, saying that the case sensitivity of the target filesystem was not correct for a backup – strange, because it had just created the sparsebundle itself moments before. Remembering the journaling hack I performed “repair disk” on both the sparsebundle and then the physical disk itself. At this point disk utility complained that the filesystem was unrecoverable (“invalid key length”) and the physical disk would no longer mount. In an attempt to get better debug information from the repair, I ran
fsck_hfs -drfy on the filesystem in a terminal. This didn’t help much with the source of the error, but I did notice that at the end it said “filesystem modified =1”. Running it again produced slightly different output, but again “filesystem modified =1”. It was doing something, so I kept going.
In the meantime, I had been looking into ways of improving the backup transfer speed over the internet. I originally planned to use a tunnel over openvpn, but this would involve channeling all backup traffic through my rented virtual server, which might not be so good for my bank account. I did some research into NAT traversal, and although the technology exists to allow direct connections between two NATed clients (libnice), I would have to write my own application around it and at this point I was getting nervous about having no backups for an extended period. I had also been working from home and getting frustrated with the bulk transfer speed between home and work, and came to the conclusion that my domestic internet connection couldn’t satisfy Time Machine’s aggressive and inflexible hourly backup schedule.
Six iterations of
fsck_hfs -drfy later, the disk repair finally succeeded and the backup disk mounted cleanly. At this point, I decided a strategic retreat was in order. I went to set up Time Machine on the old disk, but it insisted that there were no existing backups, saying “last backup: none”. Alt-clicking on the TM icon in the tray and choosing “Browse Other Backup Disks” showed however that the backups were intact. While I could make new backups and browse old ones, they would not deduplicate. As I have a large number of RAW photographs to back up, this was far from ideal. There is a way to get a Mac to recognise another computer’s backups as its own (after upgrading your hardware, for example) . However, it threw “unexpectedly found no machine directories” when attempting the first step. It appeared that not only did it not recognise its own backup, it didn’t recognise it as a backup at all.
After a lot of googling at 2am, it emerged that local Time Machine backups use extended attributes on the backup folders to store information relating to (amongst other things) the identity of the computer that had made the backup. In my earlier orgy of fscking, the extended attributes on my Mac’s top backup folder had been erased. Luckily, I still had the abandoned sparsebundle backup in the trash. Inside a sparsebundle backup, the equivalent metadata is stored not as extended attributes, but in a plist file. In my case, this was in
/Volumes/Backups3TB/.Trashes/501/galactica.sparsebundle/com.apple.TimeMachine.MachineID.plist, and contained amongst other bits and bobs the following nuggets:
These key names were a similar format to the extended attributes on the daily subdirectories in the backup, so I applied them directly to the containing folder:
$ sudo xattr -w com.apple.backupd.HostUUID XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX /Volumes/Backups3TB/Backups.backupdb/galactica
$ sudo xattr -w com.apple.backupd.ModelID MacBookPro5,1 /Volumes/Backups3TB/Backups.backupdb/galactica
After that was fixed, I could inherit the old backups and reassociate each of the backed up volumes to their master copies:
$ sudo tmutil inheritbackup /Volumes/Backups3TB/Backups.backupdb/galactica/
$ sudo tmutil associatedisk -a / /Volumes/Backups3TB/Backups.backupdb/galactica/Latest/Macintosh\ HD/
$ sudo tmutil associatedisk -a /Volumes/WD\ 1 /Volumes/Backups3TB/Backups.backupdb/galactica/Latest/WD\ 1/
The only problem arose when I tried to reassociate the volume containing my photographs. Turns out they had never been backed up at all. They bloody well are now.
So what happened to my plan to run offsite backups? I bought a second Time Machine drive and will keep one plugged in at home and one asleep in my drawer in work, swapping them once a week. This is known as the bandwidth of FedEx.
(H/T Josh McIntyre)