The Importance of Metadata on the Mac

Your backup product/service should back up your data completely and correctly. This of course includes each file’s contents, but on the Mac it also includes many types of “metadata”. What are “metadata”? The general definition is “data about data”. In the case of Mac files, metadata include the file’s size, its modification date, its owner, who’s allowed to read and/or write it, who’s allowed to run it (if it’s a program), and many others.

Unfortunately, many online backup products, such as Mozy, Carbonite, Backblaze, CrashPlan, and Dropbox, don’t back up and restore all the metadata correctly. Fortunately, Nathan Gray, an independent researcher at Caltech, has created a tool called Backup Bouncer that tests every type of Mac metadata. I’ve run the Backup Bouncer tests through Mozy, Carbonite, Backblaze, CrashPlan and Dropbox, and the results range from perfect to really really bad; a table with links to Backup Bouncer results for those products is on the Arq product page.

Backup Scenarios

But are all the metadata types really important for your situation? Well, it depends. Let’s analyze a few scenarios:

Backing Up Photos and Videos

Let’s say you just want to back up your JPEG photo files and your AVI movie files from your home directory. These files typically won’t have any more than the bare minimum of metadata. They’ll be owned by you with default permissions. Any backup product should be able to restore those files well enough; even if it restores every file as owned by you with default permissions, the restored files will end up (by coincidence) the same as the originals.

One caveat even in this simplest of scenarios is the modification dates of files. If the backup program you’re using can’t even restore modification and creation dates correctly (Backblaze, Mozy and Carbonite can’t), when you restore, it’ll look like all your files were just created. This can be confusing and/or disconcerting in the case of text files and word processing documents. For some files like photos and videos, the “date taken” is stored within the file itself so iPhoto shows the photo with the correct date, but in the Finder the dates will still be off.
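
To see what a lost modification date looks like, here is a rough sketch in Terminal. It does not use any backup product; it only contrasts a plain `cp` (which stamps the copy with the current time) with `cp -p` (which keeps the original date). The file names are made up for the illustration.

```shell
#!/bin/sh
# Sketch: compare a copy that keeps the modification date with one that does not.
# All file names below are hypothetical.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Create a "document" and backdate it to 1 Jan 2010, 12:00.
echo "draft" > "$workdir/report.txt"
touch -t 201001011200 "$workdir/report.txt"

# A plain copy gets a fresh modification date; cp -p preserves the original one.
cp    "$workdir/report.txt" "$workdir/restored-plain.txt"
cp -p "$workdir/report.txt" "$workdir/restored-preserved.txt"

# -nt ("newer than") compares modification dates.
if [ "$workdir/restored-plain.txt" -nt "$workdir/report.txt" ]; then
    plain_result="date lost"
else
    plain_result="date kept"
fi
if [ "$workdir/restored-preserved.txt" -nt "$workdir/report.txt" ]; then
    preserved_result="date lost"
else
    preserved_result="date kept"
fi

echo "plain copy: $plain_result"
echo "cp -p copy: $preserved_result"
```

A backup program that behaves like the plain copy leaves every restored file looking as if it was written at restore time.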

Backing Up Applications

How about backing up applications? On the Mac, apps are actually “bundles”, which are folders with the “bundle bit” set. The folders have a fixed structure and contain the actual executable file as well as lots of resources (icons, configuration files, etc). The executable file must have “executable permission”; otherwise OS X silently refuses to run it.

So if you use Dropbox for backing up your applications, for example, and you restore a deleted app (or you use it for sync and sync an app to another Mac), Dropbox will restore the app’s files but won’t set the executable permission. When you double-click on the restored app, nothing happens!

If you’re sufficiently technical you might check the system log and see an error like this:

com.apple.launchd.peruser.501[144] ([0x0-0x36b36b].com.haystacksoftware.iphotosync[18955]): posix_spawn("/Users/stefan/Dropbox/iPhotoSync.app/Contents/MacOS/iPhotoSync", …): Permission denied

You might then look at the binary’s permissions using Terminal and see it’s not marked as executable:

$ ls -l iPhotoSync.app/Contents/MacOS/
total 2272
-rw-r--r--  1 stefan  staff  1161472 May 28 15:10 iPhotoSync

You’ll then set the binary’s permissions correctly to be able to start the app:

$ chmod +x iPhotoSync.app/Contents/MacOS/iPhotoSync
$ ls -l iPhotoSync.app/Contents/MacOS/
total 2272
-rwxr-xr-x  1 stefan  staff  1161472 May 28 15:10 iPhotoSync*

But that’s pretty geeky!
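
If a restore leaves a whole folder of apps in this state, the same check and fix can be scripted. This is only a sketch: `Example.app` is a made-up bundle built in a temporary folder, and the loop assumes that every file in `Contents/MacOS` is supposed to be executable.

```shell
#!/bin/sh
# Sketch: find restored app bundles whose main executable lost its execute bit.
# "Example.app" is a stand-in bundle created just for this demonstration.
restored=$(mktemp -d)
trap 'rm -rf "$restored"' EXIT

# Build a fake bundle and strip the executable permission, as a lossy restore would.
mkdir -p "$restored/Example.app/Contents/MacOS"
printf '#!/bin/sh\necho hello\n' > "$restored/Example.app/Contents/MacOS/Example"
chmod 644 "$restored/Example.app/Contents/MacOS/Example"

# Check every file under Contents/MacOS in every .app bundle and repair it.
fixed=0
for exe in "$restored"/*.app/Contents/MacOS/*; do
    [ -f "$exe" ] || continue
    if [ ! -x "$exe" ]; then
        echo "not executable: $exe"
        chmod +x "$exe"
        fixed=$((fixed + 1))
    fi
done
echo "repaired $fixed file(s)"
```

To try it on real data, point `restored` at the folder holding the restored apps and drop the lines that build the fake bundle.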

Backing Up Downloaded Files

When you download a file using Safari, by default it goes to your Downloads folder. When Safari puts it there, it adds an “extended attribute” called “com.apple.quarantine” to the file indicating it’s potentially unsafe because you downloaded it from the internet (more on extended attributes later). When you attempt to open it, you get a dialog like this:

[Screenshot: the OS X warning dialog shown when opening a file downloaded from the internet]

When you click “Open”, OS X removes the “com.apple.quarantine” extended attribute from the file.

If you restore these files from a backup using software that doesn’t restore extended attributes, you won’t get this security warning. For more details on extended attributes, see below.

Backing Up Time Machine Backups

Time Machine uses extended attributes heavily. Here’s a list of the extended attributes on one of my Time Machine backup folders:

$ xattr 2010-05-11-080152/
com.apple.backup.SnapshotNumber
com.apple.backup.SnapshotVersion
com.apple.backupd.SnapshotCompletionDate
com.apple.backupd.SnapshotStartDate
com.apple.backupd.SnapshotState
com.apple.backupd.SnapshotType

This is the list of extended attributes on the “Macintosh HD” folder within that backup folder:

$ xattr 2010-05-11-080152/Macintosh\ HD
com.apple.backupd.SnapshotVolumeFSEventStoreUUID
com.apple.backupd.SnapshotVolumeLastFSEventID
com.apple.backupd.SnapshotVolumeUUID
com.apple.backupd.VolumeBytesUsed
com.apple.backupd.VolumeIsCaseSensitive
com.apple.metadata:_kTimeMachineNewestSnapshot
com.apple.metadata:_kTimeMachineOldestSnapshot

Restoring these files from backup using a backup program that doesn’t restore extended attributes isn’t going to work too well; Time Machine wouldn’t find its expected metadata.

Backing Up Alias Files

Alias files are a somewhat esoteric feature of OS X. If you control-click a file in Finder and choose “Make Alias” from the pop-up menu (or Command+Option+drag the file to another location), you’ll get an alias file that points to the file you had clicked on. The alias acts as a stand-in for the original, and works even if you move the original to another location (on the same filesystem).

Alias files depend on “resource forks” (a type of extended attribute) to function properly. If you restore an alias file from backup using a backup program that doesn’t restore extended attributes, the restored file will be incomplete and unusable. For example, I made an alias to a PDF file in my Dropbox, waited for it to back up, deleted it, and then restored it via Dropbox. When I double-clicked on the restored PDF alias, it opened in TextEdit and looked like this:

[Screenshot: the restored PDF alias opened in TextEdit]

That’s not usable.

Backing Up Symbolic Links

A symbolic link is a special type of file that refers to another file. Some backup programs are unable to differentiate between a regular file and a symbolic link; they “follow” the link instead of backing up the link itself. So if you restore a folder tree containing a large directory as well as a symlink to that directory, you’ll get two copies of the large directory instead of the directory and a link to it.

Some sync programs like Dropbox consider this a feature: even though Dropbox will only backup/sync the “Dropbox” folder, you can place symbolic links within it that point to other outside folders, and those will be backed up as well.

In my opinion a real backup program should accurately back up and restore the contents of your disk, not its interpretation of those contents.

If you’re backing up data from an application that depends on having symbolic links to a file instead of multiple copies of a file, failing to recreate the symbolic links could be a big problem. CrossOver is one program that creates many symbolic links and some complex ownership settings as well.
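
The difference between backing up a link and following it can be reproduced with plain `cp`. This is a sketch, not a test of any backup product: the folder names are invented, `cp -RP` stands in for a program that copies links as links, and `cp -RL` stands in for one that follows them.

```shell
#!/bin/sh
# Sketch: copy a folder tree twice, once preserving symlinks and once following them.
# Folder and file names are hypothetical.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# A directory plus a symbolic link that points at it.
mkdir -p "$workdir/source/big-folder"
echo "data" > "$workdir/source/big-folder/file.txt"
ln -s big-folder "$workdir/source/shortcut"

# -P copies symbolic links as links; -L follows them and copies the target.
cp -RP "$workdir/source" "$workdir/copy-preserved"
cp -RL "$workdir/source" "$workdir/copy-followed"

# [ -L path ] is true only when the path itself is a symbolic link.
if [ -L "$workdir/copy-preserved/shortcut" ]; then
    preserved_kind="symlink"
else
    preserved_kind="real directory"
fi
if [ -L "$workdir/copy-followed/shortcut" ]; then
    followed_kind="symlink"
else
    followed_kind="real directory"
fi

echo "cp -RP: shortcut is a $preserved_kind"
echo "cp -RL: shortcut is a $followed_kind"
```

In the followed copy, `shortcut` is a second full copy of `big-folder`, which is the duplication described above.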

Backing Up Files with Finder Flags

OS X can store many “flags” with a file. The following flags can be read and written through the “Get Info” window in the Finder (control-click a file and choose “Get Info” from the pop-up menu):

Additional Finder flags include:

  • alias file
  • custom icon
  • has bundle
  • invisible
  • busy

“Has bundle” is a bit that makes your iPhoto Library look like a file instead of a folder. So if your backup program can’t correctly restore Finder flags, when you restore your iPhoto Library it’ll look like a regular folder. You can’t double-click on that to open it in iPhoto. You’ll have to launch iPhoto with the Option key held down and navigate to that folder; after you do this once, iPhoto resets the “has bundle” bit on the folder. You’ll find the same thing with a Pages document or a Numbers document, or any bundle that’s meant to look like a single file in the Finder.

“Invisible” is a bit that makes a folder invisible to the Finder. I’m not sure which apps (if any) use it.

“Custom icon” is a bit that Safari sets on a file while it’s being downloaded (the Finder shows a progress bar on the file icon).

“Busy” is a bit that indicates a file is busy or incomplete.

If your backup program can’t restore these flags, things may look a bit different when/if you have to restore from backup.

Backing Up Files With Creator Codes and Type Codes

As described in this TidBITS article, creator codes and type codes are 4-letter codes attached to a file that specify what type of file it is and which application created it. Before OS X this was the way MacOS figured out which application to open when you double-clicked on a file. When OS X arrived (with its Unix base), a conflict arose between using creator codes and using filename extensions.

As of Snow Leopard (OS X 10.6) creator codes are completely ignored for opening files. So whether you should care that your backup program correctly restores creator codes and type codes depends on whether you use 10.6, among other things.

OS X still uses creator codes and type codes when copying files, however, even in 10.6. If you copy a large file from one place to another, the incomplete copy receives a type code of “brok” and a creator code of “MACS”; Finder resets these to empty values when the copy is completed.

More on Extended Attributes

Extended attributes are small pieces of metadata associated with files and folders. In the “old days” (before OS X) MacOS used “resource forks” heavily; in OS X, a resource fork is just one type of extended attribute.

Using Terminal you can see which files have extended attributes (they get a ‘@’ next to the permissions):

$ ls -l Knox-2.0.1.zip
-rw-r--r--@ 1 stefan  staff  5210452 May 25 15:37 Knox-2.0.1.zip

The ‘xattr’ utility lists the extended attributes of a zip file I downloaded from the internet:

$ xattr Knox-2.0.1.zip
com.apple.metadata:kMDItemWhereFroms
com.apple.quarantine

In this example, the file was downloaded with Safari. The “com.apple.metadata:kMDItemWhereFroms” extended attribute records where the file came from (in a binary format readable by OS X):

$ xattr -pl com.apple.metadata:kMDItemWhereFroms Knox-2.0.1.zip
com.apple.metadata:kMDItemWhereFroms:
00000000  62 70 6C 69 73 74 30 30 A2 01 02 5F 10 3B 68 74  |bplist00..._.;ht|
00000010  74 70 3A 2F 2F 61 77 73 2E 63 61 63 68 65 66 6C  |tp://aws.cachefl|
00000020  79 2E 6E 65 74 2F 61 77 73 2F 64 6D 67 2F 4B 4E  |y.net/aws/dmg/KN|
00000030  4F 58 2F 45 6E 67 6C 69 73 68 2F 4B 6E 6F 78 2D  |OX/English/Knox-|
00000040  32 2E 30 2E 31 2E 7A 69 70 5F 10 26 68 74 74 70  |2.0.1.zip_.&http|
00000050  3A 2F 2F 61 67 69 6C 65 77 65 62 73 6F 6C 75 74  |://agilewebsolut|
00000060  69 6F 6E 73 2E 63 6F 6D 2F 64 6F 77 6E 6C 6F 61  |ions.com/downloa|
00000070  64 73 08 0B 49 00 00 00 00 00 00 01 01 00 00 00  |ds..I...........|
00000080  00 00 00 00 03 00 00 00 00 00 00 00 00 00 00 00  |................|
00000090  00 00 00 00 72                                   |....r|

The “com.apple.quarantine” attribute is an indicator that the file was downloaded by Safari:

$ xattr -pl com.apple.quarantine Knox-2.0.1.zip
com.apple.quarantine: 0000;4bfcfa48;Safari;A17E9A1C-F662-4DF0-95AA-18F44791DAFC|com.apple.Safari
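
The quarantine value is a semicolon-separated string, so it can be pulled apart in the shell. The sketch below works on the value printed above, copied into a variable. The field names (`flags`, `hex_time`, `agent`, `rest`) are my own guesses: the only part the output makes obvious is the name of the downloading app, and treating the second field as a hexadecimal timestamp is an assumption.

```shell
#!/bin/sh
# Sketch: split the com.apple.quarantine value shown above into its fields.
# Field names are assumptions for illustration, not documented names.
value='0000;4bfcfa48;Safari;A17E9A1C-F662-4DF0-95AA-18F44791DAFC|com.apple.Safari'

# Split on ';' into four variables; anything after the third ';' lands in "rest".
IFS=';' read -r flags hex_time agent rest <<EOF
$value
EOF

# Convert the second field from hexadecimal to decimal.
seconds=$((0x$hex_time))

echo "flags:     $flags"
echo "hex field: $hex_time (decimal $seconds)"
echo "agent:     $agent"
echo "remainder: $rest"
```

On a Mac, the same variable could be filled from `xattr -p com.apple.quarantine Knox-2.0.1.zip` instead of the hard-coded string.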

What To Do

If possible, choose a backup program for your Mac that correctly backs up everything. Then you don’t have to worry. Arq is one backup program that backs up and restores everything correctly.

6 Comments »

  1. The single most important piece of metadata from my point of view is the OpenMeta tag. As of 9 March 2010 only JungleDisk was reported as preserving OpenMeta tags, albeit incompletely: they are said to work with backup but not with sync. Not one of the alternatives (DropBox and Backblaze, of those I use) was of any use in this respect. How, specifically, does Arq fare? I otherwise much prefer Arq but will commit to the solution which is most OpenMeta-friendly. Thanks in advance for a response.

    Comment by ashkenaz — June 24, 2010 @ 11:07 pm

  2. Nice article! I also recommend the backup program SuperDuper!, which is what I use to make a bootable backup of my boot drive.

    Comment by Anonymous — August 10, 2010 @ 5:07 pm

  3. SuperDuper is a different product. It clones your hard drive and it doesn’t work online. Arq saves versioned backups (like Time Machine) and backs up to Amazon’s cloud storage solution. If somebody steals your computer and the drive you’ve been backing up to with SuperDuper, all your data are gone.

    Comment by Stefan Reitshamer — August 10, 2010 @ 5:10 pm

  4. Hi ashkenaz,
    I’m really sorry it took me so long to get around to moderating the comments on this post — I must have missed the email notification on my end.

    Arq will back up all extended attributes (as well as every single other piece of file data and metadata on the Mac) so it will back up and restore OpenMeta tags since they’re stored as extended attributes (as far as I can tell).

    - Stefan

    Comment by Stefan Reitshamer — August 10, 2010 @ 5:17 pm

  5. If somebody steals your computer, starts Arq and deletes all your backups, all your data are gone?!

    - Siegfried

    Comment by Siegfried — August 10, 2010 @ 5:50 pm

  6. Well, they’d have to be able to log in as you in order to start Arq as you.
    In the meantime, if your computer is stolen you can go to http://aws.amazon.com/s3/ and inactivate the S3 credentials that your computer has.

    Comment by Stefan Reitshamer — August 10, 2010 @ 8:21 pm
