Nearly the same low price as Amazon's Glacier, but with instant retrievals and a cost structure that you don't have to be a superhuman AI to understand? Backblaze's B2 Cloud Storage has a very compelling offer.
B2 Cloud Storage
- GB / month: USD 0.005
- Transfer out, per GB: USD 0.01
- Multipart uploads: yes
- Multiple data centers: no
B2 provides versioned storage for files, large and small. Just like most cloud storage providers, it gives you a flat namespace that can be arbitrarily divided into folders, by default using the forward slash as the delimiter.
Unlike most other services, B2 doesn't provide any encryption of data at rest. Their explanation, that it's done in order to be able to satisfy use cases like "serving public files over the internet"[*][a], is quite simply nonsense. While they provide a working example of how to encrypt files[b] that does it properly, with a per-file key encrypted by a master key, it does leave you with the question of why they don't do it themselves if it's so simple. Given that encryption is notoriously difficult to get right, having this done in one place by people whose only job is to get it right would take a huge load off users of B2.
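To make "a per-file key encrypted by a master key" concrete, here is a minimal sketch of that scheme using only the JDK's javax.crypto classes. It is not Backblaze's example: the class EnvelopeCrypto, its methods, and the choice of AES-256-GCM with a 12-byte IV stored in front of each ciphertext are all my own assumptions for illustration, and the encryption caveat above applies to this code too.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of any Backblaze SDK.
final class EnvelopeCrypto {
    private static final int IV_BYTES = 12;   // assumed: the usual nonce size for GCM
    private static final int TAG_BITS = 128;  // assumed: full-length authentication tag
    private static final SecureRandom RANDOM = new SecureRandom();

    /** What you would store remotely: the wrapped file key plus the ciphertext. */
    static final class Sealed {
        final byte[] wrappedKey;
        final byte[] ciphertext;

        Sealed(byte[] wrappedKey, byte[] ciphertext) {
            this.wrappedKey = wrappedKey;
            this.ciphertext = ciphertext;
        }
    }

    static SecretKey newAesKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        return generator.generateKey();
    }

    // Encrypts with AES-GCM under a fresh random IV and returns IV || ciphertext.
    private static byte[] gcmEncrypt(SecretKey key, byte[] plain) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        RANDOM.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] body = cipher.doFinal(plain);
        byte[] out = Arrays.copyOf(iv, iv.length + body.length);
        System.arraycopy(body, 0, out, iv.length, body.length);
        return out;
    }

    // Reads the IV from the front of the data, then decrypts and verifies the rest.
    private static byte[] gcmDecrypt(SecretKey key, byte[] data) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, data, 0, IV_BYTES));
        return cipher.doFinal(data, IV_BYTES, data.length - IV_BYTES);
    }

    /** Encrypts the file under a fresh per-file key, then encrypts that key under the master key. */
    static Sealed seal(SecretKey masterKey, byte[] fileBytes) throws GeneralSecurityException {
        SecretKey fileKey = newAesKey();
        return new Sealed(gcmEncrypt(masterKey, fileKey.getEncoded()), gcmEncrypt(fileKey, fileBytes));
    }

    /** Recovers the per-file key with the master key, then decrypts the file. */
    static byte[] open(SecretKey masterKey, Sealed sealed) throws GeneralSecurityException {
        SecretKey fileKey = new SecretKeySpec(gcmDecrypt(masterKey, sealed.wrappedKey), "AES");
        return gcmDecrypt(fileKey, sealed.ciphertext);
    }
}
```

The point of the two layers is that only the small wrapped key depends on the master key, so a master key could in principle be rotated by re-wrapping the per-file keys instead of re-encrypting every file. The sketch leaves out where the master key lives, which is the part that tends to go wrong.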
Furthermore, B2 only has a single data center, and although they are "committed to adding more data centers and regions over time"[*][c], that commitment has not led to any data centers actually opening since it was put in writing in early 2016, three years ago. While the durability is a nice 99.999999%, they themselves admit that "at these probability levels, it's far more likely that (...) an armed conflict takes out data center(s) [or] earthquakes / floods / pests / or other events known as "Acts of God" destroy multiple data centers"[*][d]. Now, with only one data center, the plural forms in that quote can be ignored.
It is therefore best to think of B2 as only one place to store your data. A very reliable one - much more reliable than that portable HD of yours that is only one electrical fault in your computer away from being bricked - but still just one. Therefore, follow their 3-2-1 backup strategy[e].
The B2 Cloud Storage API is quite straightforward. You have the usual upload, download, delete and list functions. The two things that stand out are that you'd better make your application concurrent unless you use very large files, and that a B2 bucket is inherently versioned.
Let's take those one by one.
Unlike S3, B2 prefers that you work with large files. By default, uploads are therefore split into 100 MB chunks. B2 supports chunks as small as 5 MB, but it's clear that this is not how they intended the service to be used - if you use Java, you'll have to write a custom client to get anything except the default 100 MB chunks. Since you need to run at least four or five upload connections in parallel to get any kind of performance, either your files must be about half a gigabyte or larger, or you must upload more than one file at a time.
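The "half a gigabyte" figure is just the number of connections multiplied by the chunk size. A small sketch of that arithmetic, with a hypothetical UploadPlanner class and decimal megabytes assumed:

```java
// Hypothetical helper for illustration; not part of the B2 client.
final class UploadPlanner {
    static final long MB = 1_000_000L; // assumption: decimal megabytes

    /** Number of chunks a file is split into for a given chunk size. */
    static long partCount(long fileBytes, long partBytes) {
        return (fileBytes + partBytes - 1) / partBytes; // ceiling division
    }

    /** Smallest file that gives every parallel connection one full chunk. */
    static long minFileBytes(int connections, long partBytes) {
        return connections * partBytes;
    }

    public static void main(String[] args) {
        System.out.println(minFileBytes(5, 100 * MB) / MB + " MB");   // 500 MB
        System.out.println(partCount(250 * MB, 100 * MB) + " parts"); // 3 parts
        System.out.println(partCount(250 * MB, 5 * MB) + " parts");   // 50 parts
    }
}
```

With the default chunk size, a 250 MB file can keep at most three connections busy, which is why smaller files have to be uploaded several at a time.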
The second peculiarity is that many operations, retrieval and deletion in particular, operate only on individual versions of files. When you go to delete a file, you may think that calling b2client.getFileInfoByName and then using the returned B2FileVersion in the call to deleteFileVersion will delete the file. But no, it will only delete the latest version of it - if you have any older versions, the most recent of them will become visible as the "latest version" of that file. This can trip you up if you're not used to it.
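The behavior is easier to see in a toy model. The sketch below is an in-memory stand-in for a versioned bucket, not the B2 client: the class VersionedBucket and all of its methods are hypothetical, and it only mimics the "delete one version, the previous one reappears" semantics described above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy in-memory model of a versioned bucket; hypothetical, not the B2 SDK.
final class VersionedBucket {
    private final Map<String, Deque<String>> versions = new HashMap<>();
    private int nextId = 1;

    /** Uploading to an existing name stacks a new version on top of the old ones. */
    String upload(String name) {
        String id = "v" + nextId++;
        versions.computeIfAbsent(name, k -> new ArrayDeque<>()).push(id);
        return id;
    }

    /** The version a plain lookup by name would see, or null if the file is gone. */
    String latest(String name) {
        Deque<String> stack = versions.get(name);
        return stack == null ? null : stack.peek();
    }

    /** Deletes exactly one version; any older versions stay in the bucket. */
    void deleteVersion(String name, String id) {
        Deque<String> stack = versions.get(name);
        if (stack != null) {
            stack.remove(id);
            if (stack.isEmpty()) {
                versions.remove(name);
            }
        }
    }

    /** Really deletes the file by removing versions until none are left. */
    void deleteAllVersions(String name) {
        String id;
        while ((id = latest(name)) != null) {
            deleteVersion(name, id);
        }
    }
}
```

Upload the same name twice and delete the latest version, and latest still returns the first upload; only deleteAllVersions makes the name disappear. Against the real service the same idea applies: to get rid of a file, you have to delete every one of its versions, not just the one a lookup by name hands you.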
Transfer rates are about half of S3's. A single-connection download reaches 4 MB/s, and a single upload connection maxes out at 1.2 MB/s. Backblaze recommends using multipart uploads, but even with 16 parallel uploads the upload rate reaches only a so-so 6.5 MB/s. Compare this to the 10 MB/s I get to Amazon S3 on my 100 Mbit connection, where it's my connection that's the bottleneck - not Amazon.
B2 is storage for data that is at rest most of the time, and even at 6.5 MB/s you could upload half a terabyte a day. But the greatest impact is of course in the initial data load, and you may find it a not-completely-stellar welcome to the new service when you realize it's going to take weeks to just get going - even if you know that you'll be fine once the initial load is completed.
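For the curious, the arithmetic behind "half a terabyte a day" and "weeks" can be sketched as follows. The TransferEstimate class is hypothetical, the 10 TB data set is an example size I picked, and decimal units (1 TB = 1,000,000 MB) are assumed.

```java
import java.util.Locale;

// Hypothetical helper for illustration only.
final class TransferEstimate {
    static final double SECONDS_PER_DAY = 86_400;

    /** Terabytes moved per day at a sustained rate in MB/s (decimal units). */
    static double terabytesPerDay(double megabytesPerSecond) {
        return megabytesPerSecond * SECONDS_PER_DAY / 1_000_000;
    }

    /** Days needed to upload a data set of the given size in TB. */
    static double daysToUpload(double terabytes, double megabytesPerSecond) {
        return terabytes / terabytesPerDay(megabytesPerSecond);
    }

    public static void main(String[] args) {
        // 16 parallel uploads at 6.5 MB/s
        System.out.println(String.format(Locale.ROOT, "%.2f TB/day", terabytesPerDay(6.5)));   // 0.56 TB/day
        // assumed example: a 10 TB initial load
        System.out.println(String.format(Locale.ROOT, "%.1f days", daysToUpload(10.0, 6.5))); // 17.8 days
    }
}
```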
If S3 beats B2 on performance, B2 annihilates S3 on pricing. USD 0.005, half a cent, per gigabyte-month is just one tenth of a cent more than Amazon's Glacier costs, and with B2 you get instant retrieval for 1 cent per GB, whereas Glacier charges 3 cents per GB for retrieval with a 1-5 minute delay or 1 cent per GB for retrieval after 3-5 hours.
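Here is what those numbers look like for a concrete month. This is a sketch using only the prices quoted above; the StorageCost class is hypothetical, the 1,000 GB stored and 100 GB retrieved are example figures I picked, and any charges not mentioned in this article are ignored.

```java
// Hypothetical helper; prices are the per-GB figures quoted in the text.
final class StorageCost {
    static final double B2_STORAGE = 0.005;      // USD per GB-month
    static final double B2_DOWNLOAD = 0.01;      // USD per GB, instant
    static final double GLACIER_STORAGE = 0.004; // one tenth of a cent less than B2
    static final double GLACIER_FAST = 0.03;     // USD per GB, 1-5 minute delay
    static final double GLACIER_SLOW = 0.01;     // USD per GB, 3-5 hour delay

    /** Monthly bill in USD for a given amount stored and retrieved. */
    static double monthly(double storedGb, double retrievedGb,
                          double storagePrice, double retrievalPrice) {
        return storedGb * storagePrice + retrievedGb * retrievalPrice;
    }

    public static void main(String[] args) {
        double stored = 1_000;  // assumed example: GB stored
        double retrieved = 100; // assumed example: GB retrieved this month
        System.out.println(monthly(stored, retrieved, B2_STORAGE, B2_DOWNLOAD));       // about 6 USD
        System.out.println(monthly(stored, retrieved, GLACIER_STORAGE, GLACIER_FAST)); // about 7 USD
        System.out.println(monthly(stored, retrieved, GLACIER_STORAGE, GLACIER_SLOW)); // about 5 USD
    }
}
```

On these figures Glacier only comes out cheaper if you can wait hours for your data; as soon as you need it within minutes, B2 is both cheaper and faster.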
Competitors to B2 have at times offered cheaper storage costs, but all have increased their prices.
If all you want to do is store data and only occasionally look at it, Backblaze's B2 is peerless. Cheap but a bit slow: if S3 beats B2 on performance, B2 annihilates S3 on pricing.
More likely is that their storage hardware has so much data per processor that on-the-fly decryption would be infeasible without changing the whole cost equation.