This post is part of a series called “Windows Azure for the ASP.NET Developer” written by Rachel Appel, Adam Hoffman (that's me), and Peter Laudati. You can see the complete list of posts in the series at the US DPE Azure Connection site.
As an ASP.NET developer, you’ve undoubtedly built applications that have requirements for persistent data storage. Anything but the most trivial application likely has this need. Depending on the scale of your needs, you might have turned to:
- storing your data on the local file system,
- storing your data on a shared directory,
- storing your data in a relational database like SQL Server (this could be either the full SQL Server editions, or the more modest SQL Server Compact edition)
Different Names for the Same Thing
In the cloud, you have comparable options, but they work a little differently. Let’s go through them.
Local File System
Local disk is available to each Windows Azure role, but this is not persistent. If your role process goes down it may be restarted on another node, so the local disk is not for persistent data. If you want to translate your use of the local file system to Azure, you can use Windows Azure Drives (also known as XDrives), which is really the use of a “blob” in the cloud, mounted as a virtual hard drive. See http://aka.ms/XDrives for more details. Note that this solution is simple, but doesn’t scale to multiple writer instances, as Windows Azure Drives are currently limited to allow only one instance at a time to mount a Windows Azure Drive for writing (although multiple instances can read simultaneously).
If you’re currently storing your data in a shared directory, the Windows Azure Drives might provide a quick alternative as well, provided your needs are modest and you can live with a single instance (or at least a single writer instance). Many applications can make minor modifications and take quick advantage of Windows Azure Drives. For a thorough review of creating, mounting and otherwise working with XDrives, see the Working with Drives lab in the Windows Azure Training Kit at http://aka.ms/AzureDrives. Also, it turns out that there is a workaround if you need to allow multiple writers to a single drive - see http://aka.ms/XDriveMultiWrite for a creative solution to the problem. It takes a little glue up, but might well work for your situation.
If you have been using SQL Server for your data persistence needs, there’s a straightforward transition to the cloud using SQL Azure, or SQL Azure Federations if you have lots and lots of data. For the most part, this is really just SQL Server in the cloud. For complete coverage of using SQL Azure, see Rachel Appel’s post in the series. Additionally, if you’re looking for tools to help you migrate your SQL Server data to SQL Azure, you should look at George Huey’s SQL Azure Migration Wizard, or SQL Azure Federation Migration Wizard. Since the SQL Azure topic is so well covered in Rachel’s post, we’ll not discuss it further here.
Now, you might think that we’ve pretty much covered all the methods that you need to know about for persistent data storage in the cloud, but it turns out that Azure has a couple of other tricks up its sleeve. Let’s take a look now at the other Windows Azure Storage components – Tables, Blobs and Queues.
(Not such) Tiny Vessels
The Windows Azure Storage services are broken into four pieces, each one specialized for a different purpose. Each of them can store vast amounts of data, though the specifics vary by service. They are:
- Tables,
- Blobs,
- Queues, and,
- Drives
We’ve already covered Drives above, so now let’s take a look at our other choices.
Windows Azure Storage Tables are extremely scalable, and can hold billions of rows (or objects) and terabytes of data. Azure will take care of keeping these tuned, and can potentially spread them over thousands of machines in the data center. Additionally, your data is replicated over two data centers, which provides for disaster recovery in the very unlikely event that an entire data center is disabled. The details of this are at http://aka.ms/StorageGeoReplication.
Tables are both similar to and different from SQL Server tables, and can take some getting used to, but for non-relational data needs, they’re hard to beat. In order to understand Tables, you’ll need to understand partition keys (which support scalability of tables) and row keys (which are the unique key per partition, very similar to primary keys in SQL Server). For an excellent overview of Tables, see Julie Lerman’s MSDN article at http://aka.ms/AzureTables. To really understand the nitty gritty around choosing partition and row keys, take a look at Jai Haridas’ session from PDC09 at http://www.microsoftpdc.com/2009/svc09 .
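To make the two-key idea a bit more concrete, here is a minimal in-memory sketch in Python. It is not the Windows Azure storage client API; the `ToyTable` class, its method names, and the sample keys are all hypothetical and exist only to illustrate how a partition key groups rows while a row key identifies one row within that group.

```python
class ToyTable:
    """In-memory stand-in for a table addressed by (partition key, row key).

    Hypothetical illustration only; this is not an Azure API.
    """

    def __init__(self):
        # partition key -> {row key -> properties}
        self._partitions = {}

    def insert(self, partition_key, row_key, **properties):
        rows = self._partitions.setdefault(partition_key, {})
        # The row key only has to be unique inside its own partition.
        if row_key in rows:
            raise KeyError(f"({partition_key!r}, {row_key!r}) already exists")
        rows[row_key] = properties

    def get(self, partition_key, row_key):
        # Point lookup: with both keys known, only one partition is touched.
        return self._partitions[partition_key][row_key]

    def query_partition(self, partition_key):
        # Return every row in one partition, ordered by row key.
        rows = self._partitions.get(partition_key, {})
        return [(row_key, rows[row_key]) for row_key in sorted(rows)]


orders = ToyTable()
orders.insert("customer-42", "order-0002", total=5.00)
orders.insert("customer-42", "order-0001", total=19.99)
# Same row key as above, but a different partition, so there is no conflict.
orders.insert("customer-7", "order-0001", total=12.50)
```

In this toy, all of one customer's orders live together, so fetching them touches a single partition. Which property makes a good partition key in a real application depends on your query patterns, which is exactly what the session linked above digs into.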
As an ASP.NET developer, it might take some time to get used to not relying (only) on SQL Server (or SQL Azure) for all of your tabular persistent data needs, but in cases where relational consistency isn’t the high order bit, and scalability and performance are, take a serious look at moving parts (or all) of your data to Azure Table storage.
Windows Azure Storage Blobs provide the ability to store and serve large amounts of unstructured data anywhere in the world. They support security, so you can expose data publicly or with fine grained control, over HTTP or HTTPS. Additionally, they can handle huge amounts of data. A single blob can be up to a terabyte in size, and a single storage account in Azure can handle up to 100 terabytes of data! They are handy for all sorts of uses, including the serving of relatively static content from a website (images, stylesheets, etc.), media objects, and much more.
As an ASP.NET developer, you’ve likely grown used to hosting your website assets on your web servers, or maybe on a file server shared amongst your web servers. This is a pretty typical pattern, but Blobs provide you with the opportunity to move these resources. Pick up your images, stylesheets and script, and move them to Blob storage. Why should you move them? There are at least three good reasons.
- Moving them to Blob storage instead of your web roles reduces load on your valuable web servers, freeing them up to serve more traffic more quickly.
- Because you can’t easily update folders on your Web Roles without a redeployment, moving them to Blob storage allows you to make changes to your assets without redeploying.
- Moving them to Blob storage allows you to easily enable the Azure Content Delivery Network (CDN) if you want to really push the performance needle. Learn more about the CDN at http://aka.ms/CDN.
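Once assets live in Blob storage, your pages need to point at them there. As a rough sketch, the helper below builds a public blob address of the form `https://<account>.blob.core.windows.net/<container>/<blob>`. The function name, the account name `mystorageaccount` and the container name `assets` are made up for the example, and it assumes the container allows public read access.

```python
from urllib.parse import quote


def blob_url(account, container, blob_name, scheme="https"):
    """Build the address of a blob in a publicly readable container.

    Hypothetical helper: the account and container used below are placeholders.
    """
    # quote() leaves "/" alone, so virtual folders like "css/site.css" survive,
    # while characters such as spaces are percent-encoded.
    return f"{scheme}://{account}.blob.core.windows.net/{container}/{quote(blob_name)}"


stylesheet = blob_url("mystorageaccount", "assets", "css/site.css")
banner = blob_url("mystorageaccount", "assets", "images/hero banner.png")
```

Keeping the address in one helper like this means a later move to a CDN endpoint only needs a change in a single place.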
For a thorough walkthrough of all aspects of working with blobs, see the guide at http://aka.ms/Blobs. For a thorough overview of Blobs, Brad Calder’s PDC talk is hard to beat. See it at http://www.microsoftpdc.com/2009/SVC14.
Windows Azure Storage Queues are a straightforward, reliable queue message storage system. They provide an infrastructure to store up work for an army of worker roles that you can create. Think of these as the orders coming into a very busy kitchen, staffed by as many cooks as you want to spin up worker roles. Additionally, the Queue has the smarts to be sure that the cook who takes one of the orders actually does his job and delivers back the order – if he doesn’t (in a time that you specify), then the order will go back into the queue so that one of your more reliable employees can get it done.
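The kitchen analogy can be sketched in a few lines of Python. This is a toy in-memory model, not the storage client library: the `ToyQueue` class, its method names and the injected clock are all assumptions made for illustration. It shows the one behavior described above: a message that has been taken is hidden for a timeout you choose, and it reappears if nobody deletes it in time.

```python
import itertools
import time


class ToyQueue:
    """Toy model of a queue with a visibility timeout (not an Azure API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock              # injectable so the demo can fake time
        self._ids = itertools.count(1)
        self._messages = {}              # message id -> [body, visible_at]

    def put(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = [body, self._clock()]
        return msg_id

    def get(self, visibility_timeout):
        """Hand out the oldest visible message and hide it for a while."""
        now = self._clock()
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + visibility_timeout
                return msg_id, entry[0]
        return None                      # nothing visible right now

    def delete(self, msg_id):
        """Called by a worker once the job is really finished."""
        self._messages.pop(msg_id, None)


now = [0.0]                              # fake clock, in seconds
kitchen = ToyQueue(clock=lambda: now[0])
kitchen.put("burger")

first = kitchen.get(visibility_timeout=30)    # a cook takes the order...
hidden = kitchen.get(visibility_timeout=30)   # ...so nobody else sees it
now[0] = 31.0                                 # the cook never finished
retried = kitchen.get(visibility_timeout=30)  # the order is visible again
kitchen.delete(retried[0])                    # a reliable cook completes it
```

The important design point is that taking a message and deleting it are separate steps. A worker that crashes between the two simply lets the timeout expire, and the work is picked up again.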
For a great overview of some of the new features of queues (including larger message sizes and the ability to schedule messages into the future) see http://aka.ms/NewQueueFeatures.
There’s additional great information about Windows Azure Storage Queues contained in Jai Haridas’ PDC09 session as well at http://www.microsoftpdc.com/2009/svc09. Finally, for a quick lab using Queues and Blob storage, you can see my post at http://www.stratospher.es/blog/post/getting-started-with-loosely-coupled-applications-using-azure-storage-and-queues.
For an ASP.NET developer, Queues offer a chance to handle complex information processing needs in a very elegant way. Perhaps you’ve found yourself needing to do some sort of long running process from your web application, like uploading and processing images, or the like. In the past, you would either handle the processing synchronously in the UI thread on the server, keeping the user waiting for completion, or roll your own storage, queuing and processing. With Azure Queues, there are much more elegant, reliable and performant methods available to you. For an example of taking your processing off the UI thread, and improving user satisfaction, see my previous post at http://www.stratospher.es/blog/post/getting-started-with-loosely-coupled-applications-using-azure-storage-and-queues.
To understand why Windows Azure offers these new methods of storing persistent data, it’s important to remember that Azure is a Platform as a Service, as opposed to Infrastructure as a Service. As a result of this, it can offer additional platform services that are useful for application developers in common application scenarios. The Windows Azure team has codified several of these services in Windows Azure Storage services, which consist of Tables, Blobs and Queues.
For What Reason
So you might be thinking, “hey, it’s great that these choices exist, but why would I choose them?” After all, you already have SQL Azure available, which is very straightforward to work with – why take the time to write different code and use Tables, Blobs and Queues instead?
One possible reason is cost. Despite the fact that SQL Azure is extremely reasonably priced, if you have lots and lots of data, Table storage is available for a fraction of the cost. In fact, at the time of this writing, a gigabyte of data in a SQL Azure database costs (a very reasonable) $9.99 per month. A gigabyte of data in table storage, however, would only cost $0.125 (yes, that’s 12.5 cents) per month. Prices vary as you go up a bit (SQL Azure is discounted for each gigabyte over the first – a 10 GB database is only $45.96 per month), but in general, Table storage is a tiny fraction of the cost of SQL Azure storage.
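To put numbers on "a tiny fraction", here is the arithmetic for the two price points quoted above, as a small Python sketch. The prices are the ones from this post at the time of writing. The assumption that Table storage is billed linearly per gigabyte is mine, made to keep the example simple.

```python
from decimal import Decimal

# Monthly prices quoted in this post at the time of writing.
TABLE_PRICE_PER_GB = Decimal("0.125")
SQL_AZURE_PRICE = {1: Decimal("9.99"), 10: Decimal("45.96")}  # GB -> price


def table_cost(gigabytes):
    # Assumes simple linear per-gigabyte pricing for Table storage.
    return TABLE_PRICE_PER_GB * gigabytes


def sql_to_table_ratio(gigabytes):
    # How many times more the SQL Azure database costs at this size.
    return SQL_AZURE_PRICE[gigabytes] / table_cost(gigabytes)


for size in (1, 10):
    print(size, "GB:", table_cost(size), "vs", SQL_AZURE_PRICE[size],
          "ratio", sql_to_table_ratio(size))
```

At 1 GB the SQL Azure price is about 80 times the Table storage price, and at 10 GB it is still about 37 times, so the volume discount narrows the gap without closing it.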
OK, so it’s cheaper, but how do you decide between the options from a capabilities standpoint? Table storage is cheaper, but if it doesn’t have the capabilities we need, we’ll still need SQL Azure, right?
Generally speaking, the choice comes down to the nature of the data. If you have relational data, and need relational data access, you’ll enjoy the full relational capabilities of SQL Azure. Windows Azure Table Storage, on the other hand, doesn’t easily lend itself to relational querying, and is less useful in this situation. However, if the nature of your data is more object based (or maybe file based), then Windows Azure Storage (either Tables or Blobs) will suit your needs, and save you money at the same time. Many applications will have a blend of these needs, and it’s not at all unreasonable to think that you’ll end up using a combination of these technologies to best cover your requirements. One of the great advantages of the rich suite of Azure technologies is that you have the luxury of choosing from multiple solutions to best suit your needs.
For a rich comparison of SQL Azure and Windows Azure Table storage, see Joseph Fultz’s MSDN Magazine article at http://aka.ms/SQLAzureVsAzureTables.
Does this all sound a little familiar? Maybe you’ve been reading up on so-called NoSQL technologies like CouchDB and MongoDB and are wondering, is this the same thing? In many ways, the answer is yes. In fact Windows Azure Tables are very much a type of NoSQL data store. For a very rich discussion of NoSQL, and Azure Table’s place within it, see the whitepaper at http://aka.ms/AzureNoSQL.
And We’d Brave Those Mountain Passes
OK, we’ve talked about a bunch of useful technologies, and you’re (probably) intrigued right about now. Maybe you’re planning on using Tables or Blobs, but have existing data that you’ll need to move in order to do so. Are there any tools that can help with our move to Windows Azure Storage?
As it turns out, there are. A company called Cerebrata makes a tool called Cloud Storage Studio which, in addition to allowing you to interrogate and work with your Tables, Blobs and Queues, has a feature for uploading data from SQL Server databases. You can read about that feature at http://aka.ms/CerebrataDataMigrate or check out the whole product at http://aka.ms/CerebrataCloudStorageStudio . Other great tools for working with your storage exist as well. You could try:
- Cloudberry Lab’s Explorer for Azure Blob Storage (http://aka.ms/CloudberryFree for Freeware, http://aka.ms/CloudberryPro for Pro)
- Neudesic’s community donated Explorer (http://aka.ms/StorageExplorer for the community version, which allows importing data from CSV files)
If you really want to get the deepest details on the mechanics of Windows Azure Storage, there’s a remarkable paper (and video) at http://aka.ms/StorageMadScience. Mad science, it is, and all so that you can rely on this mechanism for your storage needs.
A Movie Script Ending
So, some of you might have noticed something while you were reading this article. Did you catch it? Feel free to tweet me your guess at @stratospher_es . The first one to figure it out gets a (completely nominal) prize from me, and a hearty “congratulations”.
Ready to start developing for Azure? Use the pretty colored boxes at the top of this post to either activate your free Azure benefits if you're an MSDN subscriber, or sign up for a 90-day trial. Once you've done that, get the tools and get going!