Archive for the ‘amazon’ Category

Amazon Glacier Pricing Explained

May 29th, 2014

Arq is our Mac backup app that backs up your files to your own Amazon Glacier account.

Backing up to Glacier is very popular because the storage cost is only $.01/GB per month! If you have, say, 100GB of files, it would only cost $1/month to store them in Glacier.

But people sometimes find Glacier pricing confusing, especially when it comes to restoring (downloading) your files. In this essay I hope to make the restore costs much clearer.

Glacier Retrieval Fees

For this example we’ll use the “US East” region Glacier pricing (other regions cost slightly more).

Restoring a file from Glacier is a 3-step process. First you issue a Glacier restore request for an object. Then you wait approximately 4 hours for the object to become available for download. Then you download it.

Requesting that an object be made available for download costs $.05/1000 requests, and data transfer out of Glacier is $.12/GB with the first 1GB free each month.

You can restore up to 5% of your Glacier data for free each month, prorated daily. For example, if you’ve backed up 100GB of files to Glacier, you can restore 5GB for free each month, or roughly 160MB each day.
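The free allowance arithmetic can be sketched in a few lines of Python. This is only an illustration of the 5% rule described above; the function name, the decimal units (1GB = 1000MB), and the 31-day month are my assumptions.

```python
def free_restore_allowance_gb(stored_gb, days_in_month=31, free_fraction=0.05):
    """Return (monthly, daily) free restore allowance in GB.

    free_fraction is the 5% free tier from the article; days_in_month is an
    assumption used to prorate the monthly allowance into a daily one.
    """
    monthly = stored_gb * free_fraction
    daily = monthly / days_in_month
    return monthly, daily


monthly, daily = free_restore_allowance_gb(100)
print(f"{monthly:.1f} GB/month, about {daily * 1000:.0f} MB/day")
```

For 100GB stored, this gives the 5GB monthly allowance and a daily allowance a little over 160MB.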

Data Restore Fee

If you exceed the 5% in a month, Amazon charges a data restore fee. This is where it gets confusing, because Amazon’s description of the data restore fee is complex. Put more simply: take the total size (in GB) of the object(s) requested and divide by 4 hours to get an hourly request rate, subtract the prorated 5% free tier, and multiply the result by $7.20.
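Here is a minimal Python sketch of one reading of that formula: the request is spread over 4 hours to get a GB/hour rate, the free tier is subtracted as a GB/hour allowance, and the remainder is multiplied by $7.20. How the free tier is prorated down to an hourly figure is not spelled out here, so `free_gb_per_hour` is a hypothetical parameter you would have to supply; this is not Amazon’s official calculation.

```python
HOURS_PER_REQUEST = 4          # from the article: requests take about 4 hours
FEE_PER_GB_PER_HOUR = 7.20     # from the article: the $7.20 multiplier


def data_restore_fee(requested_gb, free_gb_per_hour=0.0):
    """Estimate the data restore fee in dollars for a single request.

    free_gb_per_hour is an assumed hourly share of the 5% free tier.
    """
    hourly_rate = requested_gb / HOURS_PER_REQUEST
    billable_rate = max(0.0, hourly_rate - free_gb_per_hour)
    return billable_rate * FEE_PER_GB_PER_HOUR


# Requesting 100GB all at once, ignoring the free tier:
print(f"${data_restore_fee(100):.2f}")
```

Requesting a quarter as much data in the same 4 hours gives a quarter of the fee, which is why the request rate matters so much.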

There’s one other quirk about this data restore fee: it’s only incurred once for the entire billing month. If you request objects from Glacier and incur a data restore fee of, say, $5, you could continue requesting objects from Glacier for the rest of the month at that rate (or slower), and the charge on your bill for that month will still be $5. But if, during some other hour in that month, you request objects at a rate that equals a data restore fee of, say, $6, the charge at the end of the month will be $6.
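In other words, the monthly charge behaves like a maximum rather than a sum. A tiny Python sketch, where the list of per-hour fees is made-up illustration data:

```python
def monthly_data_restore_charge(hourly_fees):
    """The month's charge is the single highest fee incurred in any hour."""
    return max(hourly_fees, default=0.0)


# Several hours at a $5 rate, some slower hours, and one hour at a $6 rate:
fees = [5.0, 5.0, 3.0, 6.0, 5.0]
print(monthly_data_restore_charge(fees))
```

The slower hours add nothing; only the single fastest hour sets the bill.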

Cost vs Speed

If you requested all 100GB of your Glacier data at once, the data restore fee would be substantial! But in practice that isn’t realistic. For one thing, you probably can’t download 100GB of data very quickly. A 10 megabit/second ISP connection typically delivers about 1MB/second, or about 3.6GB/hour; at that rate 100GB would take about 28 hours to download, so there’s no need to ask Glacier to make all 100GB available in 4 hours.

The best way to download from Glacier is to request only what you can download in the next 4 hours, and repeat every 4 hours as necessary while you download objects as they become available for download.
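To size each request, you only need your download rate. The sketch below assumes decimal units (1KB = 1000 bytes, 1GB = 10^9 bytes) and a steady transfer rate; the function names are mine.

```python
SECONDS_PER_HOUR = 3600


def gb_per_window(rate_kb_per_sec, window_hours=4):
    """GB you can download in one 4-hour window at the given rate."""
    bytes_per_window = rate_kb_per_sec * 1000 * window_hours * SECONDS_PER_HOUR
    return bytes_per_window / 1e9


def hours_to_download(total_gb, rate_kb_per_sec):
    """Total hours needed to download total_gb at the given rate."""
    bytes_per_hour = rate_kb_per_sec * 1000 * SECONDS_PER_HOUR
    return total_gb * 1e9 / bytes_per_hour


rate = 1000  # about 1MB/second
print(f"{gb_per_window(rate):.1f} GB per 4-hour window")
print(f"{hours_to_download(100, rate):.1f} hours for 100GB")
```

At about 1MB/second, there is no point requesting more than roughly 14GB per 4-hour window.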

Arq is an Amazon Glacier client that manages the Glacier retrieval cost for you. When you restore your files from Arq Glacier backups, you first select a transfer rate, and Arq’s Amazon Glacier calculator calculates the data restore fee (labeled “peak hourly request fee”) for you:

amazon glacier pricing example

If you change the download rate, Arq updates the cost estimates. Here we’ve changed from 686 KB/sec to 330 KB/sec, and the data restore fee is cut in half:

glacier retrieval cost

When you click “Restore”, Arq begins requesting objects, and continues until it has requested the amount of data that would take 4 hours to download at the rate you’ve chosen. After 4 hours have elapsed, it begins requesting another 4 hours’ worth of objects, and simultaneously begins downloading objects that are becoming available. It continues this pattern until all the files have been downloaded.
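The request/download pattern can be sketched as a simple schedule. This is a simplified Python model of the behavior described above, not Arq’s actual code; it assumes every request becomes available exactly one window after it is made.

```python
def restore_schedule(total_gb, gb_per_window, window_hours=4):
    """Return a list of (hour, gb_requested, gb_downloading) steps.

    Each step requests one window's worth of data and downloads whatever
    was requested in the previous step.
    """
    schedule = []
    remaining = total_gb
    available = 0.0  # data requested in the previous window
    hour = 0
    while remaining > 0 or available > 0:
        request = min(gb_per_window, remaining)
        schedule.append((hour, request, available))
        remaining -= request
        available = request
        hour += window_hours
    return schedule


for hour, requested, downloading in restore_schedule(10, 4):
    print(f"hour {hour}: request {requested}GB, download {downloading}GB")
```

The final step requests nothing and only downloads the last batch.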

You’re In Control

The great thing about Amazon Glacier is that you’re in control of your data. Your data are in your own Amazon account.

If/when it comes time to restore, you can choose the rate at which you want to restore your files, choosing a balance between speed and cost.

Why Arq uses Amazon Web Services

October 21st, 2013

This is the story of how Arq came to use Amazon Web Services (AWS) for backing up files.

Time Capsule

Back in 2008 Apple announced their Time Capsule. I was excited because I really wanted a backup solution that didn’t require me to remember anything (like periodically plugging in an external drive or making sure a NAS box was available). I also wanted a solution that I could control. Time Capsule seemed perfect. I could set up Time Machine, it would back up to my Time Capsule whenever I was at home, and I’d never have to think about it again.

The reality of Time Machine and Time Capsule wasn’t so wonderful in my case. Time Machine struggled to finish backing up most of the time, and had a lot of trouble with my habit of frequently closing my laptop lid and putting my Mac to sleep. I also couldn’t figure out how to make it “just work” for both my wife’s Mac and mine on the same Time Capsule. There were a number of other, smaller issues as well, like its inability to back up just the changed parts of my enormous mail file. I learned this was due to its design based on Unix hard links. The Time Capsule solution also didn’t provide any off-site protection in case all my computer equipment was stolen.

Online Backup in 2008

I looked at the commercial online backup solutions available at the time, and none of them really “felt” like backup because I didn’t have any control over the backup data. I wanted to be able to verify the backup data were really there and safe. And if the backups are on someone else’s hardware, they should be encrypted with a key that is controlled by me, not the hardware owner. None of the available options had client-side encryption.

To me, backups need to feel “solid” and trustworthy. I couldn’t find a solution that felt solid and trustworthy enough for me.

So, I set about building Arq. 

Amazon S3

I chose Amazon S3 because it had a really nice API, was purported to be very stable and reliable, and was delivered by Amazon, a stable company that seemed to be in cloud computing for the long haul, so I could be reasonably sure my data would be around in the future. 

Back in 2009 when I had just started working on the first version of Arq, I would tell people about it at meetups around town, including the Amazon S3 costs. People usually reacted with something like, “So, that sounds like Mozy but much more expensive. Doesn’t sound like a great idea.” But it was something I really wanted, so I kept at it. (I had no idea it would become so popular. Apparently a lot of other people want a high-quality backup app with reliable storage options.)

Anyway, backup to Amazon S3 turned out to be a great solution, I think. Whenever you have an internet connection you get backed up automatically. You never have to worry about running out of backup space because S3 is like an infinitely large disk drive in the sky. And, unlike EC2, S3 has had almost zero downtime since I started using it in 2009 (the only downtime incident I could find reference to on the interwebs was 6 hours of downtime in 2008).

S3 can get expensive compared to the “unlimited” offerings like Carbonite, but what you get is a very simple, stable storage service with a simple API; Arq provides the backup function on top of it. It’s like a power company — they provide 120 volts of AC, all the time, and your appliances provide functionality on top of it. To me this model feels more solid, and backup needs to be solid.

Amazon Glacier

A year ago Amazon announced a new storage option called Glacier. It’s 1/10th the storage cost of S3, but it incurs fees if you retrieve your data, especially if you retrieve it rapidly. If you retrieve less than 5% of your stored data per month (pro-rated daily) there’s no fee; but if you retrieve lots of data all at once, the retrieval fees can add up. Arq supports backing up to Glacier, and when you restore using Arq it first asks at what rate you’d like to download and shows you the estimated fee for that rate. It’s especially suited to second-tier backup; if you’ve already got a local backup then the Glacier backup is just in case both your computer and your local backup fail. I back up my photos and music to Glacier (because it’s a lot of data) and everything else to S3 (because restore is faster/cheaper).

File Sync on S3

The other thing I’ve wanted for a while is a file sync solution that I control. Dropbox is an excellent solution, but I wanted control the same way I have with Arq. I wanted client-side encryption; I wanted the cloud data to “feel” solid and trustworthy; and I wanted total flexibility — no limits on file sizes, number of files, or total storage space. I basically wanted my own Dropbox system, running in my own AWS account. So I built that. It’s called Filosync, and I really love it. Give it a try if you’re looking for that sort of thing.

Arq 3.2 is out!

June 12th, 2013

Arq 3.2 is a free update for all Arq 3 users.

The big new feature in 3.2 is restoring onto an existing folder. While it looks like no big deal on the surface (it just asks whether you’d like to overwrite the existing folder or not), under the covers it was a big change in the restore process. Arq now asks you for permission to launch a helper program as “root”, the super user. The helper program needs to be “root” so that it can write files into folders that your regular user account may not have permission for. Arq compares the existing file contents to the backup record and only downloads files that are different or missing. Then it applies all the metadata correctly as before.
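The “only download what’s different or missing” idea can be illustrated with a short Python sketch. This is not Arq’s implementation: the backup record format (a dict of relative path to SHA-1 hex digest) and the function names are hypothetical, chosen only to show the comparison step.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1_of(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def files_to_download(backup_record, target_dir):
    """Return relative paths that are missing locally or differ from the backup."""
    target = Path(target_dir)
    needed = []
    for rel_path, expected_sha1 in sorted(backup_record.items()):
        local = target / rel_path
        if not local.is_file():
            needed.append(rel_path)          # missing: must download
        elif sha1_of(local.read_bytes()) != expected_sha1:
            needed.append(rel_path)          # contents differ: must download
    return needed


# Demo with a throwaway folder holding one unchanged and one edited file.
with tempfile.TemporaryDirectory() as folder:
    (Path(folder) / "same.txt").write_bytes(b"unchanged")
    (Path(folder) / "edited.txt").write_bytes(b"local edit")
    record = {
        "same.txt": sha1_of(b"unchanged"),
        "edited.txt": sha1_of(b"backed-up version"),
        "missing.txt": sha1_of(b"only in the backup"),
    }
    needed = files_to_download(record, folder)

print(needed)
```

Only the edited and missing files end up in the download list; the unchanged file is skipped.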

Full list of new features:

  • Restore into an existing folder of files, only downloading the files that are different or missing. Restore runs using administrator privileges to avoid permission issues.
  • Added a “pause on battery power” feature — set it in Arq’s preferences.
  • Added a “setthrottle” command-line option to change the transfer rate setting. This is useful for those who wish to change Arq’s throttle setting via a script.

Fixed Bugs:

  • Fixed an issue during very large Glacier restores where Arq would request too many items, which would then expire from AWS before Arq got a chance to download them.
  • Improved memory usage and performance during Glacier restore and S3 restore.
  • Fixed 10.6-related crashes.
  • Fixed broken throttling issue.
  • Removed AddressBook features (pre-populating name and email in crash report forms and in-app purchase) so that users aren’t asked to give Arq permission to access contacts.
  • Added explanation of command-line options to Help documents.

To get Arq 3.2, pick “Check for Updates” from Arq’s menu.

Arq plugin for Sidekick

April 28th, 2012

Arq Forum member jmah did some reverse-engineering of Arq and posted a message about a plugin he wrote for Sidekick which tells Arq to back up whenever he returns home.

The source code is on github.

Really clever! I love it.

Arq 2.6.9 is out

April 28th, 2012

Arq version 2.6.9 is now available!

This minor update fixes several minor issues, including the issue where some backup sets weren’t appearing under “Other Backup Sets”.

It’s a free update for all Arq users. Pick “Check for Updates” from the Arq menu to get the update.

As always, full release notes for all Arq versions are on the release notes page.

How to back up your Mac using Arq

July 21st, 2010

When I started developing Arq it was partly because I couldn’t find an existing online backup offering that gave me enough control. I wanted to control exactly which files would be backed up, and I didn’t want to be constrained by rules that many of the “unlimited backup” offerings had like excluding network drives, excluding applications, etc.

So Arq lets you back up anything you want. But then the question is, what should you back up? The following is my suggestion for a basic backup of your files on your Mac.

Basic Backup Using Arq

When you first install and launch Arq, it asks you for your Amazon S3 “keys” and a few other things. Then it asks if you’d like to choose your own files for backup, or back up your home folder minus a few unnecessary items:

Screen shot 2010-07-21 at 8.02.18 AM.png

If you picked “I’ll manually add folders to back up” and you’ve changed your mind, here’s how to set up Arq to back up your home folder minus the unnecessary items:

1. Add your home folder

Click the + button at the bottom left of the Arq main window.

Screen shot 2010-07-21 at 8.10.25 AM.png

Pick your home folder (/Users/<yourname>) and click OK.

Screen shot 2010-07-21 at 9.27.33 AM.png

2. Add some excludes

Click the “Edit Excludes…” button.

Screen shot 2010-07-21 at 8.08.05 AM.png

Add 3 excludes.

Screen shot 2010-07-21 at 8.15.33 AM.png

Make sure the first 2 are set to “relative path” instead of “name”.

Click OK.

Backing Up Applications Using Arq

If you want to back up your applications, add the Applications folder.

Screen shot 2010-07-21 at 8.28.12 AM.png

Many applications put some of their support files in /Library/Application Support, so add that too.

Screen shot 2010-07-21 at 8.29.02 AM.png

Advanced Backup Using Arq

If you prefer, you can pick and choose specific folders to back up instead of backing up your entire home directory.

WARNING: If you choose to do this and you later create a new folder in your home directory and start putting important files in there, you’ll have to remember to add this new folder to Arq or else it won’t be backed up!

I back up the following folders as separate items in Arq:

  • Application Support (/Library/Application Support)
  • Applications (/Applications)
  • Documents
  • Library, excluding files/folders named ‘Caches’ and ‘Logs’
  • Music
  • osaka iPhoto Library (my big iPhoto Library, named after my computer), excluding files/folders named ‘iPod Photo Cache’
  • src (my work files), excluding files/folders named ‘build’ and ‘bin’

Time Machine and Arq

Time Machine and Arq are complementary. Backing up using Time Machine to another disk is cheap and fast. If you’re backing up to a Time Capsule via Wifi it’s very convenient because it just happens; there’s nothing to plug in. If you’re backing up to a USB drive, you’ll have to remember to plug in the USB drive periodically. Restoring is fast because you’re reading from a USB disk physically connected to your Mac, or from a Time Capsule over Wifi.

But Time Machine doesn’t cover all cases. If someone breaks in and steals your computer, they may steal your Time Capsule or USB drive as well, and then your files are gone forever. If fire, flood, or lightning strikes, you may lose both your computer and your backups; files gone forever. And if you travel often, you’ll have to bring along your USB drive or Time Capsule, or backups won’t happen until you get home and stay home long enough for a backup to complete.

Arq covers those cases that Time Machine doesn’t. The backups are off site at Amazon’s servers, safe from your thief and your natural disasters. They’re even safe from disaster at an Amazon site because Amazon replicates your data at several sites. And Arq works whenever there’s an Internet connection, so backups still happen when you’re on the road.

Arq 1.5 is out!

July 16th, 2010

I’m really excited to ship Arq 1.5!

It includes scheduling options like once-per-day backups and manual-only (one of the most requested features) as well as Pause/Resume and Back Up Now functions. It also includes a whole bunch of refinements and bug fixes.

To get it, pick “Check for Updates” from the menu in Arq, or download it from the product page.

Here are the details:

Feature Additions

  • Configurable backup schedule: hourly, once/day at a certain time of day, or manually.
  • Back Up Now feature.
  • Pause backups for an amount of time you choose. Resume early if you wish. (‘Pause’ is better than ‘stop’ because you won’t have to remember to start it again).
  • Progress indicator next to the “Other Computers” heading in the source list (on the left side of the window) so you can tell when Arq is still scanning for other computers’ backups in the S3 data.
  • More informative status messages such as “Calculating upload size” and “Finishing backup” instead of just “Backing up …”
  • Better communication of error and warning conditions.
  • Estimated backup time is now calculated based on start of backup, not start of calculating upload size.
  • More accurate progress bar in 2 scenarios — when saving the “packs” of small files, and when re-doing an initial backup that was aborted.
  • More useful logging output when log level is set to Info.
  • Much faster loading when browsing backups.

Bug Fixes

  • More efficient caching of the set of objects in S3.
  • Fixed an issue where calculating the upload size for a backup was incomplete when a permission error was encountered.
  • Fixed issues with high memory usage in both Arq and Arq Agent.
  • Fixed an issue that was preventing the “Start at Login” preference from persisting.
  • Fixed 2 issues where packs weren’t being read correctly, leading to “object not found” errors.
  • Fixed regression bug in restoring file permissions correctly for root-owned files.
  • Fixed an issue where the folder’s progress bar was occasionally disappearing.
  • Fixed an issue with trying to read extended attributes on files that don’t support extended attributes.

Enjoy! If you have any feedback or questions I’d love to hear from you! Just email support@haystacksoftware.com. Thanks!

- Stefan

Arq 1.4.4 is out!

June 18th, 2010

This release fixes a bug that causes high CPU usage after Arq Agent has been running for many hours and there are backup errors (e.g. folder to be backed up isn’t available).

Pick “Check for Updates” from the Arq menu to automatically update to 1.4.4. Or download Arq here.

Arq 1.4.2 is out!

June 4th, 2010

This release fixes a few bugs:

Bug Fixes

  • Added a scroll view around the excludes list when it gets too long to fit on the screen.
  • Fixed ‘Signapore’ typo.
  • Fixed packaging to use symbolic links and remove header files (resulting in smaller app bundle size).

Pick “Check for Updates” from the Arq menu to automatically update to 1.4.2. Or download Arq here.

S3 storage now 33% cheaper!

May 20th, 2010

Great news!

Amazon just announced “Reduced Redundancy Storage” (RRS) for S3. Objects stored with the RRS “storage class” are “99.99% durable over a given year,” whereas “standard” storage is 99.999999999% durable (overkill for backup for most people). Objects stored with RRS are only $.10/GB per month ($.11 in the Northern California region — see pricing) — a 33% savings!
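A quick Python sketch of what those numbers imply. The Standard price is not quoted above, so it is derived here from the RRS price and the 33% savings figure; the expected-loss calculation is a simple average based on the stated durability, and the object count is made up for illustration.

```python
RRS_PRICE_PER_GB = 0.10   # from the post
SAVINGS = 1 / 3           # "33% savings" from the post


def implied_standard_price(rrs_price, savings):
    """Standard price implied by the RRS price and the stated savings."""
    return rrs_price / (1 - savings)


def expected_objects_lost_per_year(n_objects, durability):
    """Average number of objects lost per year at a given annual durability."""
    return n_objects * (1 - durability)


standard = implied_standard_price(RRS_PRICE_PER_GB, SAVINGS)
print(f"Implied Standard price: ${standard:.2f}/GB per month")
print(f"RRS: {expected_objects_lost_per_year(10_000, 0.9999):.2f} "
      "objects lost per year per 10,000 stored")
```

So with RRS you would expect, on average, to lose about one object per year out of every 10,000 stored.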

Arq with RRS Support

Arq 1.4 is out with support for RRS! Pick “Check for Updates” in the Arq menu to automatically update, or download it from the Arq product page.

For new users of Arq the default is to use RRS. For current users Arq continues to use Standard storage (I didn’t want to assume everyone would want to make the switch).

Migrating to RRS

If you’re a current user, just install the update as described above, then go to the Budget tab of Preferences and check “Use Reduced Redundancy storage class for new objects”:

Screen shot 2010-05-20 at 11.13.20 AM.png

If you want to convert your existing S3 objects to RRS, click the button “Update Storage Class of Existing Objects.” Arq will determine which storage class each object is in:

Picture 2.png

Click “Convert” to migrate:

Picture 3 copy.png

Enjoy the savings!

- Stefan