Archive for the ‘backup’ Category

It’s not about the encryption. It’s about the encryption keys

October 16th, 2014

There’s a lot of talk on the interwebs about encryption. Encryption is a necessary but not sufficient condition for maintaining control of your data. Controlling access to the encryption key is just as important.

Lots of articles that reference encryption fail to mention this, and that’s confusing for people who are not crypto experts. For example, a recent TechCrunch article about Edward Snowden and Dropbox paraphrases Snowden recommending SpiderOak because Dropbox “doesn’t support encryption.” In the very next paragraph it quotes Dropbox saying all files “are encrypted while traveling and at rest on Dropbox’s servers.” Then it says the difference between SpiderOak and Dropbox is that SpiderOak “encrypts the data while it’s on your computer, as opposed to only encrypting it ‘in transit’ and on the company’s servers.” It circles around the key issue but never says it explicitly.

When you read things like, “All files sent and retrieved from Dropbox are encrypted while traveling between you and our servers”, that’s good and it guards against eavesdropping in transit, but it misses the point. “Encrypted” means little if someone other than you holds the key to decrypt it.

It’s about who controls the keys. It’s about giving keys/control to a third party who can then be compelled to give control to a government agency. If you give your unencrypted content to a third party, you’ve lost control of the content, and that’s irreversible. But if you give your content to a third party in encrypted form and also give the third party the keys (as in the case of Dropbox), you’ve still irreversibly lost control of the content.

Keep the Keys to Yourself

Encrypt your data with a key that only you know. Then send your encrypted bits to the third party. The third party only has unreadable random noise (the encrypted data) and no way to turn it into files (decrypt it).

Arq (our backup app) has been designed from day one to make sure you keep the encryption key. It asks you at setup time for an encryption key, stores it securely in your computer’s keychain, and never transmits it anywhere. To restore files to a new computer, you’ll need to use the Arq app to decrypt and you’ll need to supply it with that encryption key, or else it can’t decrypt the data.

When you read about products that include encryption, always ask yourself who has the keys.

Amazon Glacier Pricing Explained

May 29th, 2014

Arq is our Mac backup app that backs up your files to your own Amazon Glacier account.

Backing up to Glacier is very popular because the storage cost is only $.01/GB per month! If you have, say, 100GB of files, it would only cost $1/month to store them in Glacier.

But people sometimes find Glacier pricing confusing, especially when it comes to restoring (downloading) your files. In this essay I hope to make the restore costs much clearer.

Glacier Retrieval Fees

For this example we’ll use the “US East” region Glacier pricing (other regions cost slightly more).

Restoring a file from Glacier is a 3-step process. First you issue a Glacier restore request for an object. Then you wait approximately 4 hours for the object to become available for download. Then you download it.

Requesting that an object be made available for download costs $.05/1000 requests, and data transfer out of Glacier is $.12/GB with the first 1GB free each month.

You can restore up to 5% of your Glacier data for free each month, prorated daily. For example, if you’ve backed up 100GB of files to Glacier, you can restore 5GB for free each month — 160MB each day.
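As a rough sketch, the free allowance can be computed like this. The function name and the 30-day month are assumptions for illustration; the exact daily figure depends on the length of the month and on whether a GB is counted as 1000MB or 1024MB, which is why it only lands near the 160MB quoted above.

```python
def free_restore_allowance_gb(stored_gb, days_in_month=30, monthly_fraction=0.05):
    """Return (monthly, daily) free restore allowance in GB.

    monthly_fraction is the 5% free tier; days_in_month is an assumed
    value used to prorate the monthly allowance into a daily one.
    """
    monthly = stored_gb * monthly_fraction
    daily = monthly / days_in_month
    return monthly, daily


monthly, daily = free_restore_allowance_gb(100)
print(f"{monthly:.1f} GB free per month, about {daily * 1000:.0f} MB per day")
```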

Data Restore Fee

If you exceed the 5% in a month, Amazon charges a data restore fee. This is where it gets confusing. Amazon’s description of the data restore fee is complex. Put more simply, the data restore fee is equal to the total size (in GB) of the object(s) requested, multiplied by $7.20, divided by 4 hours, minus the prorated 5% free tier.
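Here is one way to read that formula, as a sketch rather than Amazon’s official calculation. It assumes the free allowance is subtracted from the hourly rate before multiplying by $7.20 (which, as far as I can tell, is $0.01/GB times 720 hours), and that the daily allowance is spread over the same 4-hour window. The function name and the 30-day month are also assumptions.

```python
HOURS_TO_RETRIEVE = 4     # objects become available roughly 4 hours after the request
FEE_PER_GB_HOUR = 7.20    # the multiplier quoted above

def data_restore_fee(requested_gb, stored_gb, days_in_month=30):
    """Estimate the data restore fee (in dollars) for one burst of requests.

    Assumption: the prorated 5% free tier is removed from the hourly
    rate before the $7.20 multiplier is applied.
    """
    peak_gb_per_hour = requested_gb / HOURS_TO_RETRIEVE
    free_gb_per_day = stored_gb * 0.05 / days_in_month
    free_gb_per_hour = free_gb_per_day / HOURS_TO_RETRIEVE
    billable_gb_per_hour = max(0.0, peak_gb_per_hour - free_gb_per_hour)
    return billable_gb_per_hour * FEE_PER_GB_HOUR


for requested in (10, 100):
    fee = data_restore_fee(requested, stored_gb=100)
    print(f"request {requested}GB at once: ${fee:.2f}")
```

Under these assumptions, requesting 10GB at once out of 100GB stored comes to roughly $18, and requesting all 100GB at once comes to roughly $180.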

There’s one other quirk about this data restore fee: It’s only incurred once for the entire billing month. If you request objects from Glacier and incur a data restore fee of, say $5, you could continue requesting objects from Glacier for the rest of the month at that rate (or slower), and the charge on your bill for that month will be $5. But if, during some other hour in that month, you request objects at a rate that equals a data restore fee of, say, $6, the charge at the end of the month will be $6.
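In other words, the monthly charge behaves like a maximum, not a sum. A tiny sketch (the function name is made up for illustration, and the inputs are the fees that each hour’s request rate would produce):

```python
def monthly_restore_charge(fees_by_hour):
    """Return the month's data restore charge: the largest fee seen, not the total."""
    return max(fees_by_hour, default=0.0)


print(monthly_restore_charge([5.0, 5.0, 3.0]))  # 5.0
print(monthly_restore_charge([5.0, 6.0, 2.0]))  # 6.0
```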

Cost vs Speed

If you requested all 100GB of your Glacier data all at once, the data restore fee would be substantial! But in practice that isn’t realistic. For one thing, you probably can’t download 100GB of data very quickly. A 10 megabit/second ISP connection would allow downloading of about 3.6GB/hour in practice (the theoretical maximum is 4.5GB/hour); at that rate 100GB would take nearly 28 hours to download, so there’s no need to ask Glacier to make all 100GB available in 4 hours.
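The arithmetic behind those numbers can be sketched as follows. The 0.8 efficiency factor is my assumption, chosen because it reproduces the 3.6GB/hour figure (about 1MB/second of real throughput); it is not a number from Amazon or Arq, and the function names are illustrative.

```python
def gb_per_hour(megabits_per_second, efficiency=1.0):
    """Convert a link speed to GB/hour (1GB = 1000MB), scaled by an efficiency factor."""
    megabytes_per_second = megabits_per_second / 8 * efficiency
    return megabytes_per_second * 3600 / 1000


def hours_to_download(total_gb, megabits_per_second, efficiency=1.0):
    """Hours needed to download total_gb at the given link speed."""
    return total_gb / gb_per_hour(megabits_per_second, efficiency)


print(f"theoretical: {gb_per_hour(10):.1f} GB/hour")
print(f"at 80% efficiency: {gb_per_hour(10, 0.8):.1f} GB/hour")
print(f"100GB takes about {hours_to_download(100, 10, 0.8):.0f} hours")
```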

The best way to download from Glacier is to request only what you can download in the next 4 hours, and repeat every 4 hours as necessary while you download objects as they become available for download.

Arq is an Amazon Glacier client that manages the Glacier retrieval cost for you. When you restore your files from Arq Glacier backups, you first select a transfer rate, and Arq’s Amazon Glacier calculator calculates the data restore fee (labeled “peak hourly request fee”) for you:

[Screenshot: Amazon Glacier pricing example]

If you change the download rate, Arq updates the cost estimates. Here we’ve changed from 686KB/sec to 330KB/sec, and the data restore fee is cut roughly in half:

[Screenshot: Glacier retrieval cost]

When you click “Restore”, Arq begins requesting objects, and continues until it has requested the amount of data that would take 4 hours to download at the rate you’ve chosen. After 4 hours have elapsed, it begins requesting another 4 hours’ worth of objects, and simultaneously begins downloading objects that are becoming available. It continues this pattern until all the files have been downloaded.
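The batching pattern described above can be sketched roughly like this. It is not Arq’s actual code: the function name, the greedy grouping, and the choice to give an oversized object its own batch are all assumptions made for illustration.

```python
WINDOW_HOURS = 4  # time Glacier takes to make requested objects available

def plan_restore_batches(object_sizes, bytes_per_second, window_hours=WINDOW_HOURS):
    """Group object sizes (in bytes) into batches of about one window's worth of download.

    Each batch would be requested at the start of a window and downloaded
    during the next one. An object larger than the budget gets its own batch.
    """
    budget = bytes_per_second * window_hours * 3600
    batches, current, used = [], [], 0
    for size in object_sizes:
        if current and used + size > budget:
            batches.append(current)
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        batches.append(current)
    return batches


# At 330KB/sec, one 4-hour window covers about 4.75GB of objects.
sizes = [2_000_000_000, 2_000_000_000, 1_500_000_000, 3_000_000_000]
for number, batch in enumerate(plan_restore_batches(sizes, 330_000), start=1):
    print(f"window {number}: request {sum(batch) / 1e9:.1f}GB")
```

Picking a lower rate shrinks each batch, which lowers the peak hourly request rate and therefore the fee, at the cost of a longer restore.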

You’re In Control

The great thing about Amazon Glacier is that you’re in control of your data. Your data are in your own Amazon account.

If/when it comes time to restore, you can choose the rate at which you want to restore your files, choosing a balance between speed and cost.

Arq 4 is out!

March 3rd, 2014

I’m really excited about this release! It’s got features that many people have been asking for, and it opens Arq up to a whole new range of options for storing backup data. It has undergone extensive testing and all identified issues have been fixed.

PLEASE NOTE: Every Arq 3 license purchased since December 1, 2013 has been upgraded for free to Arq 4.

ALSO: Arq 4 is for OS X 10.7 and later.

New Storage Options

For the first time, Arq can back up to more than just Amazon Web Services. You can choose GreenQloud, DreamObjects, Google Cloud Storage, or any other S3-compatible target:

[Screenshot: Target types]

You can even choose to back up to an SFTP server! If you have a NAS in your home or office that allows SSH/SFTP access, you can back up to that and pay $0 in monthly storage charges. Or back up to a VPS (virtual private server) like Dreamhost for cheap offsite backup.

Multiple Backup Targets

Not only can you back up to different types of “targets”; you can back up to more than one! You can choose to back up your files to multiple locations for redundancy, or back up some files to one location and others to another location. On different schedules. With different budgets. Mix and match to your heart’s content.

[Screenshot: Multiple targets]

More Control

Several new features are aimed at providing you with more control over your backups:

  • Back up only on selected wireless networks, for instance to prevent uploads when tethered to your phone
    [Screenshot: Networks]
  • Email notifications — great for monitoring headless/remote Macs and customer Macs 
    [Screenshot: Email prefs]
  • A unified budget across S3 and S3/Glacier backups (see “A New Approach to Glacier” below)
  • Optionally specify a “window” of time during the day/night when Arq pauses — useful for networks that are underutilized at night, and for ISPs that charge less at certain times of day
    [Screenshot: Backup window]

Other Features

Arq 4 includes several other features and improvements, including a “Stop Backup” function, the display of the last backup date in the agent’s menu, display of progress/total in the agent’s menu (so you don’t have to launch the app to check the backup’s progress), and less prompting for administrator privileges when restoring. Also, the process for setting up and restoring to a new computer is more straightforward.

A New Approach to Glacier

When Amazon announced their Glacier offering in the fall of 2012, we built Arq 3 to take advantage of it. Some time after Arq 3 was released, Amazon announced an S3 Glacier Lifecycle feature through which Amazon would automatically store certain S3 objects in Glacier. Arq 4 uses this new S3 Glacier Lifecycle feature for Glacier backups (existing Glacier backups made with Arq 3 will continue to use the old Glacier API as Amazon offers no way to move Glacier objects to S3). There are several benefits to using the S3 Glacier Lifecycle feature:

  • S3 objects with Glacier storage class have all the benefits of regular S3 objects — a known location/name, S3 object query abilities
  • No more creating Glacier vaults, which are hard to use and even harder to delete.
  • Restored objects are at known/expected locations, unlike restored Glacier objects which receive a random name.
  • Restored objects can be persisted for much longer, unlike restored Glacier objects which have a fixed 24-hour expiration.
  • Restarting a restore with Arq means only requesting restore of objects which haven’t been requested yet because previously-requested objects are at known locations.
  • Restoring is less complex and much faster — no more creating SQS queues and SNS topics; no more taking the time to read all messages from the SQS queue before beginning to download.
  • Budgeting is possible using the same logic as the S3 budget feature.

Changing to the New S3/Glacier Format

You can leave your Glacier backups configured as is, and Arq will continue to back up those folders the same way. You’re not required to change.

Unfortunately Amazon doesn’t provide a mechanism for creating “pointers” in S3 to existing Glacier objects, so if you want budgeting and easier restoring with Glacier, you’ll have to re-upload your files. I know this is far from ideal, but in the long term I believe it pays dividends in cost and ease of use. 

Arq no longer restricts you from adding the same folder twice, or adding a folder contained by a folder that Arq is already backing up. So, if you’ve been backing up a folder to Glacier with Arq 3, just add the folder to Arq again, and choose “Glacier storage class”:

[Screenshot: Adding folder]

Arq will back up the folder to S3 in a subdirectory of your S3 bucket called “glacier” which Amazon will automatically archive to Glacier storage class (Arq creates the Glacier lifecycle policy automatically). When it’s done uploading, delete the old folder (the old-style Glacier backups created with Arq 3).
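For reference, a lifecycle rule that archives everything under a “glacier/” prefix has roughly the following shape. This is a sketch only: the rule ID, the 0-day transition, and the helper name are assumptions, the dictionary follows the layout that boto3’s put_bucket_lifecycle_configuration accepts as far as I know, and the policy Arq actually creates may differ.

```python
import json

def glacier_lifecycle_config(prefix="glacier/", days=0):
    """Build a lifecycle configuration that moves objects under prefix to Glacier.

    days is the object age at which the transition happens (an assumed value).
    """
    return {
        "Rules": [
            {
                "ID": "archive-" + prefix.rstrip("/"),
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


print(json.dumps(glacier_lifecycle_config(), indent=2))
```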

Upgrade to Arq 4

For Arq 3 or Arq 2 users, Arq 4 is a $19.99 upgrade.

To upgrade, just delete your Arq app (you won’t lose any settings), then download and launch Arq 4. You’ll be prompted to upgrade your Arq 2 or Arq 3 license for $19.99.

(If you don’t want to upgrade right now, that’s no problem. We’ll continue supporting Arq 3 if you need support.)

To buy a new license, click here: Buy Now

- Stefan

Arq 4 Beta

February 4th, 2014

UPDATE March 3: Arq 4 is officially released.

I’m really excited about this release! It’s got features that many people have been asking for, and it opens Arq up to a whole new range of options for storing backup data.

PLEASE NOTE: Arq 4 should officially ship sometime before the end of February. Every Arq 3 license purchased on or after December 1, 2013 is eligible for free upgrade to Arq 4.

ALSO: Arq 4 is for OS X 10.7 and later.

New Storage Options

For the first time, Arq can back up to more than just Amazon Web Services. You can choose GreenQloud, DreamObjects, Google Cloud Storage, or any other S3-compatible target:

[Screenshot: Target types]

You can even choose to back up to an SFTP server! If you have a NAS in your home or office that allows SSH/SFTP access, you can back up to that and pay $0 in monthly storage charges. Or back up to a VPS (virtual private server) like Dreamhost for cheap offsite backup.

More Control

Several new features are aimed at providing you with more control over your backups:

  • Back up only on selected wireless networks, for instance to prevent uploads when tethered to your phone
    [Screenshot: Networks]
  • Email notifications — great for monitoring headless/remote Macs and customer Macs 
    [Screenshot: Email prefs]
  • A unified budget across S3 and S3/Glacier backups (see “A New Approach to Glacier” below)
  • Optionally specify a “window” of time during the day/night when Arq pauses — useful for networks that are underutilized at night, and for ISPs that charge less at certain times of day
    [Screenshot: Backup window]

Other Features

Arq 4 includes several other features and improvements, including a “Stop Backup” function, the display of the last backup date in the agent’s menu, and less prompting for administrator privileges when restoring. Also, the process for setting up and restoring to a new computer is more straightforward.

A New Approach to Glacier

When Amazon announced their Glacier offering in the fall of 2012, we built Arq 3 to take advantage of it. Some time after Arq 3 was released, Amazon announced an S3 Glacier Lifecycle feature through which Amazon would automatically store certain S3 objects in Glacier. Arq 4 uses this new S3 Glacier Lifecycle feature for Glacier backups (existing Glacier backups made with Arq 3 will continue to use the old Glacier API as Amazon offers no way to move Glacier objects to S3). There are several benefits to using the S3 Glacier Lifecycle feature:

  • S3 objects with Glacier storage class have all the benefits of regular S3 objects — a known location/name, S3 object query abilities
  • No more creating Glacier vaults, which are hard to use and even harder to delete.
  • Restored objects are at known/expected locations, unlike restored Glacier objects which receive a random name.
  • Restored objects can be persisted for much longer, unlike restored Glacier objects which have a fixed 24-hour expiration.
  • Restarting a restore with Arq means only requesting restore of objects which haven’t been requested yet because previously-requested objects are at known locations.
  • Restoring is less complex and much faster — no more creating SQS queues and SNS topics; no more taking the time to read all messages from the SQS queue before beginning to download.
  • Budgeting is possible using the same logic as the S3 budget feature.

Changing to the New S3/Glacier Format

Unfortunately Amazon doesn’t provide a mechanism for creating “pointers” in S3 to existing Glacier objects, so if you want budgeting and easier restoring with Glacier, you’ll have to re-upload your files. I know this is far from ideal, but in the long term I believe it pays dividends in cost and ease of use. 

Arq no longer restricts you from adding the same folder twice, or adding a folder contained by a folder that Arq is already backing up. So, if you’ve been backing up a folder to Glacier with Arq 3, just add the folder to Arq again, and choose “Glacier storage class”:

[Screenshot: Adding folder]

Arq will back up the folder to S3 in a subdirectory of your S3 bucket called “glacier” which Amazon will automatically archive to Glacier storage class (Arq creates the Glacier lifecycle policy automatically). When it’s done uploading, delete the old folder (the old-style Glacier backups created with Arq 3).

Beta Testing

The Arq 4 beta is available here: http://www.haystacksoftware.com/arq/Arq4beta.zip

Your Arq 3 license will work with it just fine.

The beta will expire March 17, but Arq 4 should be shipping by then. If not, a new beta version will be available.

Please submit your feedback, questions, and bug reports via Twitter @arqbackup or via email to support@haystacksoftware.com.

I look forward to getting your feedback!

- Stefan

Why Arq uses Amazon Web Services

October 21st, 2013

This is the story of how Arq came to use Amazon Web Services (AWS) for backing up files.

Time Capsule

Back in 2008 Apple announced their Time Capsule. I was really excited because I wanted a backup solution that didn’t require me to remember anything (like periodically plugging in an external drive or making sure a NAS box was available). I also wanted a solution that I could control. Time Capsule seemed perfect. I could set up Time Machine, it would back up to my Time Capsule whenever I was at home, and I’d never have to think about it again.

The reality of Time Machine and Time Capsule wasn’t so wonderful in my case. Time Machine struggled to finish backing up most of the time, and had a lot of trouble with my habit of frequently closing my laptop lid and putting my Mac to sleep. I also couldn’t figure out how to make it “just work” for both my wife’s Mac and mine on the same Time Capsule. There were a number of other, smaller issues as well, like its inability to back up just the changed parts of my enormous mail file. I learned this was due to its design based on Unix hard links. The Time Capsule solution also didn’t provide any off-site protection in case all my computer equipment was stolen.

Online Backup in 2008

I looked at the commercial online backup solutions available at the time, and none of them really “felt” like backup because I didn’t have any control over the backup data. I wanted to be able to verify the backup data were really there and safe. And if the backups are on someone else’s hardware, they should be encrypted with a key that is controlled by me, not the hardware owner. None of the available options had client-side encryption.

To me, backups need to feel “solid” and trustworthy. I couldn’t find a solution that felt solid and trustworthy enough for me.

So, I set about building Arq. 

Amazon S3

I chose Amazon S3 because it had a really nice API, was purported to be very stable and reliable, and was delivered by Amazon, a stable company that seemed to be in cloud computing for the long haul, so I could be reasonably sure my data would be around in the future. 

Back in 2009 when I had just started working on the first version of Arq, I would tell people about it at meetups around town, including the Amazon S3 costs. People usually reacted with something like, “So, that sounds like Mozy but much more expensive. Doesn’t sound like a great idea.” But it was something I really wanted, so I kept at it. (I had no idea it would become so popular. Apparently a lot of other people want a high-quality backup app with reliable storage options.)

Anyway, backup to Amazon S3 turned out to be a great solution, I think. Whenever you have an internet connection you get backed up automatically. You never have to worry about running out of backup space because S3 is like an infinitely large disk drive in the sky. And, unlike EC2, S3 has had almost zero downtime since I started using it in 2009 (the only downtime incident I could find reference to on the interwebs was 6 hours of downtime in 2008).

S3 can get expensive compared to the “unlimited” offerings like Carbonite, but what you get is a very simple, stable storage service with a simple API; Arq provides the backup function on top of it. It’s like a power company — they provide 120 volts of AC, all the time, and your appliances provide functionality on top of it. To me this model feels more solid, and backup needs to be solid.

Amazon Glacier

A year ago Amazon announced a new storage option called Glacier. It’s 1/10th the storage cost of S3, but it incurs fees if you retrieve your data, especially if you retrieve it rapidly. If you retrieve less than 5% of your stored data per month (pro-rated daily) there’s no fee; but if you retrieve lots of data all at once, the retrieval fees can add up. Arq supports backing up to Glacier, and when you restore using Arq it first asks at what rate you’d like to download and shows you the estimated fee for that rate. It’s especially suited to second-tier backup; if you’ve already got a local backup then the Glacier backup is just in case both your computer and your local backup fail. I back up my photos and music to Glacier (because it’s a lot of data) and everything else to S3 (because restore is faster/cheaper).

File Sync on S3

The other thing I’ve wanted for a while is a file sync solution that I control. Dropbox is an excellent solution, but I wanted control the same way I have with Arq. I wanted client-side encryption; I wanted the cloud data to “feel” solid and trustworthy; and I wanted total flexibility — no limits on file sizes, number of files, or total storage space. I basically wanted my own Dropbox system, running in my own AWS account. So I built that. It’s called Filosync, and I really love it. Give it a try if you’re looking for that sort of thing.

Arq 3.2 is out!

June 12th, 2013

Arq 3.2 is a free update for all Arq 3 users.

The big new feature in 3.2 is restoring onto an existing folder. While it looks like no big deal on the surface (it just asks whether you’d like to overwrite the existing folder or not), under the covers it was a big change in the restore process. Arq now asks you for permission to launch a helper program as “root”, the super user. The helper program needs to be “root” so that it can write files into folders that your regular user account may not have permission for. Arq compares the existing file contents to the backup record and only downloads files that are different or missing. Then it applies all the metadata correctly as before.
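The “only download what’s different or missing” step might look something like the sketch below. The post doesn’t say how Arq compares files, so the SHA-1 comparison, the shape of the backup record (relative path mapped to a hex digest), and both function names are assumptions for illustration.

```python
import hashlib
from pathlib import Path

def sha1_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files don't have to fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_download(backup_record, restore_root):
    """Return the relative paths that are missing or differ from the backup record.

    backup_record maps a relative path to the expected SHA-1 hex digest
    (a hypothetical format, not Arq's real one).
    """
    needed = []
    for relative_path, expected_digest in sorted(backup_record.items()):
        target = Path(restore_root) / relative_path
        if not target.is_file() or sha1_of_file(target) != expected_digest:
            needed.append(relative_path)
    return needed
```

Files whose contents already match are skipped entirely, so restoring onto a mostly intact folder transfers very little data.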

Full list of new features:

  • Restore into an existing folder of files, only downloading the files that are different or missing. Restore runs using administrator privileges to avoid permission issues.
  • Added a “pause on battery power” feature — set it in Arq’s preferences.
  • Added a “setthrottle” command-line option to change the transfer rate setting. This is useful for those who wish to change Arq’s throttle setting via a script.

Fixed Bugs:

  • Fixed an issue during very large Glacier restores where Arq would request too many items, which would then expire from AWS before Arq got a chance to download them.
  • Improved memory usage and performance during Glacier restore and S3 restore.
  • Fixed 10.6-related crashes.
  • Fixed an issue where throttling was broken.
  • Removed AddressBook features (pre-populating name and email in crash report forms and in-app purchase) so that users aren’t asked to give Arq permission to access contacts.
  • Added explanation of command-line options to Help documents.

To get Arq 3.2, pick “Check for Updates” from Arq’s menu.

Arq (Cloud Backup for Mac) Adds Support for Amazon Glacier

November 6th, 2012

Back Up to Amazon Glacier

Arq now backs up to Amazon’s new Glacier service, and I’m really excited about it! Glacier storage is super-cheap — just $.01/GB per month!

With Glacier you can store 100GB for just $1/month! Or store a terabyte for just $10/month!

I got hundreds of emails and tweets asking for Glacier support. Turns out it’s a good option for some scenarios (even with the slow restore time and possible extra Amazon charges). People want to use it for big stuff like iPhoto libraries, videos, etc. that get too expensive in S3. They use it as a secondary backup, so they don’t expect to actually restore unless their whole house burns down, taking their primary backup with it.

Arq’s been getting pretty popular with independent folks as well as corporate employees. One user described it recently as “the backup utility of choice for the geekier segment of the Mac community.” Glacier support makes Arq a good fit for even more people!

Retrieval Costs

Amazon has designed Glacier for “infrequent retrievals”, according to their FAQ. In the event you need to restore a significant amount of data from Glacier, Amazon may charge you an additional retrieval fee on top of the standard data-transfer-out charges. The formula for calculating this retrieval fee is complicated, but Arq helps figure it out for you. When you select an item to restore, Arq shows you the expected retrieval fee given the detected download rate:

[Screenshot: expected retrieval fee at the detected download rate]

You can adjust the download rate to change the retrieval fee:

[Screenshot: retrieval fee after adjusting the download rate]

For more details on Arq and Glacier, see the Arq product page.

Pricing

Arq 3 is still just $29 per computer. Upgrade from Arq 2 for just $15.

How to Upgrade from Arq 2 to Arq 3

To upgrade from Arq 2 to Arq 3, just pick “Check for Updates” from the Arq menu.

When Arq 3 launches, it’ll prompt you to upgrade your license.

Using Arq with IAM

August 23rd, 2012

This post is for system administrators who support Arq on multiple computers. If that’s you, please read on!

IAM and Arq

If you need to install Arq on many computers using the same S3 account but you don’t want Arq to see the other computers’ backup data, use Amazon’s IAM (Identity and Access Management) to restrict what Arq sees.

The easiest way to do this is as follows:

  1. Use your main keys to install and configure Arq on a computer.
  2. Quit Arq and quit Arq Agent.
  3. Create an IAM user and capture its access key ID and secret access key.
  4. Look in (home)/Library/Arq/config/app_config.plist for the localS3BucketName and localComputerUUID values.
  5. Attach a policy to that IAM user that allows full access only to /<localComputerUUID> in the localS3BucketName, as well as “ListBucket” access (see example IAM policy below).
  6. Open the Keychain Access app and change the “Arq S3” entry’s Account and Password fields to the access key ID and secret access key of that IAM user.
  7. Launch Arq.

Example IAM Policy

For a computer with the following values:

  • localS3BucketName = akiaiyuk3n3tme6l4hfacomhaystacksoftwarearq
  • localComputerUUID = 32D9D7A2-3B3E-4BE7-B85B-0605AF24F570

the IAM policy would look like this:

{
 "Statement": [
   {
     "Sid": "Stmt1344522941209",
     "Action": [
       "s3:ListBucket"
     ],
     "Effect": "Allow",
     "Resource": [
       "arn:aws:s3:::akiaiyuk3n3tme6l4hfacomhaystacksoftwarearq"
     ],
     "Condition": {
       "StringLike": {
         "s3:prefix": "32D9D7A2-3B3E-4BE7-B85B-0605AF24F570/*"
       }
     }
   },
   {
     "Sid": "Stmt1344522997713",
     "Action": [
       "s3:*"
     ],
     "Effect": "Allow",
     "Resource": [
       "arn:aws:s3:::akiaiyuk3n3tme6l4hfacomhaystacksoftwarearq/32D9D7A2-3B3E-4BE7-B85B-0605AF24F570/*"
     ]
   }
 ]
}

The first part gives “s3:ListBucket” permission for the user’s bucket, but only for keys whose prefix matches 32D9D7A2-3B3E-4BE7-B85B-0605AF24F570/* (the computer’s UUID).

The second part gives permission for all actions for resources starting with akiaiyuk3n3tme6l4hfacomhaystacksoftwarearq/32D9D7A2-3B3E-4BE7-B85B-0605AF24F570/*.
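If you’re setting this up for many computers, generating the policy from the two values is less error-prone than editing it by hand. A minimal sketch follows; the function name is hypothetical, and it omits the optional “Sid” fields that appear in the example above.

```python
import json

def arq_iam_policy(bucket_name, computer_uuid):
    """Build a policy limiting an IAM user to one computer's prefix in the bucket."""
    prefix = f"{computer_uuid}/*"
    return {
        "Statement": [
            {
                # Allow listing the bucket, but only under this computer's prefix.
                "Action": ["s3:ListBucket"],
                "Effect": "Allow",
                "Resource": [f"arn:aws:s3:::{bucket_name}"],
                "Condition": {"StringLike": {"s3:prefix": prefix}},
            },
            {
                # Allow every S3 action on objects under this computer's prefix.
                "Action": ["s3:*"],
                "Effect": "Allow",
                "Resource": [f"arn:aws:s3:::{bucket_name}/{prefix}"],
            },
        ]
    }


policy = arq_iam_policy(
    "akiaiyuk3n3tme6l4hfacomhaystacksoftwarearq",
    "32D9D7A2-3B3E-4BE7-B85B-0605AF24F570",
)
print(json.dumps(policy, indent=1))
```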

Answer Files and IAM

For information on automating Arq configuration using answer files and IAM, please read the Arq manual’s Configuring Arq Using an Answer File section.

Arq plugin for Sidekick

April 28th, 2012

Arq Forum member jmah did some reverse-engineering of Arq and posted a message about a plugin he wrote for Sidekick which tells Arq to back up whenever he returns home.

The source code is on GitHub.

Really clever! I love it.

Arq 2.6.9 is out

April 28th, 2012

Arq version 2.6.9 is now available!

This update fixes several minor issues, including the issue where some backup sets weren’t appearing under “Other Backup Sets”.

It’s a free update for all Arq users. Pick “Check for Updates” from the Arq menu to get the update.

As always, full release notes for all Arq versions are on the release notes page.