
As the official release of Azure looms and the initial pricing model becomes understood, a lot of technical people are crunching numbers to see how much it will cost to host a solution on Azure.  It seems that most of the people doing the comparisons are doing them for smaller solutions, hosted not in some corporate on-premise data centre but on any one of the hundreds of public .NET hosting providers out there.

This is not surprising, since the type of person looking at the pre-release version of Azure is also the kind of person that has hundreds of ideas for the next killer website, if only they could find the time and a good designer to help them (disclaimer: I am probably one of those people).  So they look at the pricing model from the perspective of someone who has virtually no experience of running a business, and who is technically capable enough to have misconceptions about how a small business would actually operate and maintain a website.

Unsurprisingly they find that Azure works out more expensive than the cost of (perceived) equivalent traditional hosting. So you get statements like this:

“If you add all these up, that’s a Total of $98.04! And that looks like the very minimum cost of hosting an average "small" app/website on Azure. That surely doesn’t make me want to switch my DiscountASP.NET and GoDaddy.com hosting accounts over to Windows Azure.” Chris Pietschmann

Everyone seems shocked and surprised.

Windows Azure is different from traditional hosting, which means that Microsoft’s own financial models and those of their prospective customers are different.  You don’t have to think for very long to come up with some reasons why Microsoft does not price Azure to compete with traditional hosting…

  • Microsoft is a trusted brand.  Regardless of vulnerabilities that are well publicised in the technical community and a growing open source movement, in the mind of business Microsoft is considered low risk, feature rich and affordable.
  • Microsoft has invested in new datacentres and the divisions that own them need to have a financial model that demonstrates a worthwhile investment.  I doubt that in the current economic climate Wall Street is ready for another Xbox-like loss leader. (This is also probably the reason why Microsoft is reluctant to package an on-premise Azure.)
  • Azure is a premium product that offers parts of the overall solution that are lacking in your average cut-rate hosting environment.

Back to the alpha geeks that are making observations about the pricing of Azure.  Most of them have made the time to look at the technology outside their day job.  They either have ambitions to do something ‘on their own’, are doing it on the side in a large enterprise or, in a few cases, are dedicated to assessing it as an offering for their ISV.

They are not the target market.  Yet.

Azure seems to be marketed at the small to medium businesses that do not have, want or need much in the way of internal, or even contracted, IT services and skills.  Maybe they'll have an underpaid desktop support type of person who can run around the office getting the owner/manager's email working – but that is about it. (Another market is the rogue enterprise departments that, for tactical reasons, specifically want to bypass enterprise IT – but they behave similarly to smaller businesses.)

Enterprise cloud vendors, commentators and analysts endlessly debate the potential cost savings of the cloud versus established on-premise data centres.  Meanwhile, smaller businesses, whose data centre consists of little more than a broadband wireless router and a cupboard, don’t care much about enterprise cloud discussions.  In addressing the needs of the smaller business, Windows Azure comes with some crucial components that are generally lacking in traditional hosting offerings:

  • As a Platform as a Service (PaaS), Azure exposes no low-level technical operations to you – which also means that they are taken care of for you.  There is no need to download, test and install patches.  No network configuration and firewall administration.  No need to perform maintenance tasks like clearing up temporary files, logs and general clutter.  In a single-tenant co-location hosting scenario this costs extra money, as it is not automated and requires a skilled person to perform the tasks.
  • The architecture of Azure, where data is copied across multiple nodes, provides a form of automated backup.  Whether or not this is sufficient (we would still like a .bak file of our database on a local disk), the idea and message that it is 'always backed up' is reassuring to the small business.
  • The cost/benefit model of Azure's high availability (HA) offering is compelling.  I challenge anybody to build a 99.95% available web and database server for a couple of hundred dollars a month at a traditional hosting facility or even in a corporate datacentre (the figure is from the Azure web SLA and works out to roughly 21 minutes of downtime a month – see the quick calculation after this list).  The degree of availability of a solution needs to be backed by a business case and often, once the costs are tabled, business will put up with a day or two of downtime in order to save money.  Azure promises significant availability in the box and, at that price, it could easily be justified against the loss of a handful of orders or even a single customer.
  • Much is made of the scalability of Azure and it is a good feature to have in hand for any ambitious small business and financially meaningful for a business that has expected peaks in load.  Related to the scalability is the speed at which you can provision a solution on Azure (scaling from 0 to 1 instances).  Being able to do this within a few minutes, together with all the other features, such as availability, is a big deal because the small business can delay the commitment of budget to the platform until the last responsible moment.
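To put that SLA figure in context, here is a quick back-of-the-envelope calculation – a minimal Python sketch where the 30-day month, the helper name and the comparison percentages are just for illustration:

    # Rough downtime allowed by an availability SLA, assuming a 30-day month.
    def allowed_downtime_minutes(sla_percent, days=30):
        minutes_in_period = days * 24 * 60          # 43,200 minutes in a 30-day month
        return minutes_in_period * (1 - sla_percent / 100.0)

    print(allowed_downtime_minutes(99.95))  # ~21.6 minutes a month (the Azure web SLA)
    print(allowed_downtime_minutes(99.9))   # ~43.2 minutes a month
    print(allowed_downtime_minutes(99.0))   # ~432 minutes - just over 7 hours a month

Each additional 'nine' of availability cuts the allowable downtime by a factor of ten, which is why squeezing it out of a traditional hosting setup gets expensive so quickly.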

So there are a whole lot of features that need to be communicated to the market – almost like ‘you qualify for free shipping’ when buying a book online, where the consumer is directed to the added value that they understand.

The catch is that the target market does not understand high availability the same way that everyone understands free shipping.  The target market for Azure doesn’t even know that Azure exists, or care – they have a business to run and a website to launch.  Those technical details need to be sorted out by technical people who need to produce the convincing proposal.

The obvious strength that Microsoft has over other cloud vendors is their channel.  Amazon and Google barely have a channel for sales, training and development of cloud solutions – besides, that is not even their core business.  Microsoft has thousands of partners, ISVs, trainers and a huge loyal following of developers.

In targeting the small to medium business, Microsoft is pitching Azure at the ISVs.  The smaller business without internal development capabilities will turn to external expertise, often in the shape of a reputable organization (as opposed to contractors), for solutions – and the ISVs fulfil that role.  So to get significant traction on Azure, Microsoft needs to convince the ISVs of the benefits of Azure and, as this post tries to illustrate, of the financial considerations of the small business and their related technology choices.

Microsoft needs to convince the geeks out there that Azure comes with a whole lot more that is very important to smaller businesses and not available from traditional hosting. So Microsoft needs to help us understand the costs, and not just the technology, in order for us to convince our customers that although Azure is not cheap, it makes good financial sense.

Simon Munro

@simonmunro

I like to ride motorbikes.  Currently I ride a BMW K1200S – a sports tourer that is both fast and comfortable on the road.  Before that I had a five year affair with a BMW R1150GS which took me to all sorts of off-the-beaten-track destinations before we abruptly parted company with me flying through the air in one direction as my bike was smashed in the other direction by criminals in a getaway car.

Most motorbike enthusiasts have, like me, owned a few in their lifetimes and in most cases they are of differing types.  A road bike, no matter how much you are prepared to spend, can barely travel faster than walking pace on a good quality dirt road because, apart from the obvious things like tyres and suspension, the geometry is all wrong.  The same is true in reverse – a good dirt bike is frustrating, dull and downright dangerous to ride on a road.

Bikers understand the issues around suitability for purpose and compromise more than most (such as car drivers).  Our lottery-winning fantasies have a motorbike garage filled, not simply with classics or expense, but with a bike suitable for every purpose and occasion – track, off-road, touring, commuting, cafe racing and every other obvious niche.  Some may even want a Harley Davidson for the odd occasion one wants to ride a machine that leaks more oil than the fuel it uses, travelling in a perfectly straight line for 200 yards before it overheats and the rider suffers renal damage.

But I digress.  Harley Davidson hogs, fanbois (or whatever the collective noun is for Harley Davidson fans) can move on.  This post has nothing to do with you.

There is nothing in the motorbike world that is analogous to the broad suitability of the SQL RDBMS.  SQL spans the most simple and lightweight up to the complex, powerful and expensive – with virtually every variation in between covered.  It is not just motorbikes; a lot of products out there would want such broad suitability – cars, aeroplanes and buildings.  SQL is in a very exclusive club of products that solve such a broad range of the same problem, and in the case of SQL, that problem is data storage and retrieval.  SQL also seems to solve this problem in a way where the relationships between load, volume, cost, power and expense are fairly linear.

SQL's greatest remaining strength, and the reason for its almost industry-wide ubiquity, is that it is the default choice for storing and retrieving data.  If you want to store a handful of records, you might as well use a SQL database, not text files.  And if you want to store and process huge amounts of transactional data, in virtually all cases a SQL database is the best choice.  So over time, as the demands and complexity of our requirements have grown, SQL has filled the gaps like sand on a windswept beach, exclusively filling every nook and cranny.

We use SQL for mobile devices, we use SQL for maintaining state on the web, we use SQL for storing rich media, and use it to replicate data around the world.  SQL has, as it has been forced to satisfy all manner of requirements, been used, abused, twisted and turned and generally made to work in all scenarios.  SQL solutions have denormalization, overly complex and inefficient data models with thousands of entities, and tens of thousands of lines of unmaintainable database code. But still, surprisingly, it keeps on giving as hardware capabilities improve, vendors keep adding features and people keep learning new tricks.

But we are beginning to doubt the knee-jerk implementation of SQL for every data storage problem and, at least at the fringes of its capabilities, SQL is being challenged.  Whether it be developers moving away from the over-use of database programming languages, cloud architects realising that SQL doesn't scale out very well, or simply CIOs getting fed up with buying expensive hardware and more expensive licences, the tide is turning against SQL's dominance.

But this post is not an epitaph for SQL, or another some-or-other-technology is dead post.  It is rather an acknowledgement of the role that SQL plays – a deliberate metronomic applause and standing ovation for a technology that is, finally, showing that it is not suitable for every conceivable data storage problem.  It is commendable that SQL has taken us this far, but the rate at which we are creating information is exceeding the rate at which we can cheaply add power (processing, memory and I/O performance) to the single database instance.

SQL’s Achilles heel lies in its greatest strength – SQL is big on locking, serial updates and other techniques that allow it to be a bastion for consistent, reliable and accurate data.  But that conservative order and robustness comes at a cost and that cost is the need for SQL to run on a single machine.  Spread across multiple machines, the locking, checking, index updating and other behind the scenes steps suffer from latency issues and the end result is poor performance.  Of course, we can build even better servers with lots of processors and memory or run some sort of grid computer, but then things start getting expensive – ridiculously expensive, as heavy metal vendors build boutique, custom machines that only solve today’s problem.

The scale-out issues with SQL have been known for a while by the small group of people who build really big systems.  But recently the problems have moved into more general consciousness, thanks to Twitter's fail whale (which is largely due to data problems) and the increased interest in the cloud by developers and architects of smaller systems.

The cloud, by design, tries to make use of smaller commodity (virtualized) machines and therefore does not readily support SQL's need for fairly heavyweight servers.  So people looking at the cloud, despite the promises that their applications will port easily, are obviously asking how they bring their database into the cloud and finding a distinct lack of answers.  The major database players seem to quietly ignore the cloud and don't have cloud solutions – you don't see DB2, Oracle or MySQL for the cloud, and the only vendor giving it a go, to their credit (and possibly winning market share), is Microsoft with SQL Server.  Even then, SQL Azure (the version of SQL Server that runs on Azure) has limitations, including size limitations that are indirectly related to the size of the virtual machine on which it runs.

Much is being made of approaches to get around the scale-out problems of SQL and, with SQL Azure in particular, of discussions around a sharding approach for data.  Some of my colleagues were actively discussing this and it led me to weigh in and make the following observation:

There are only two ways to solve the scale out problems of SQL Databases

1. To provide a model that adds another level of abstraction for data usage (EF, Astoria)

2. To provide a model that adds another level of abstraction for more complicated physical data storage (Madison)

In both cases you lose the “SQLness” of SQL.

It is the "SQLness" that is important here and is the most difficult thing to find the right compromise for.  "SQLness" to an application developer may be easy-to-use database drivers and SQL syntax; to a database developer it may be the database programming language and environment; to a data modeller it may be foreign keys; to a DBA it may be the reliability and recoverability offered by transaction logs.  None of the models that have been presented satisfy the perspectives of all stakeholders, so it is essentially impossible to scale out SQL by the definition of what everybody thinks a SQL database is.
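To make the second model a little more concrete, here is a minimal sketch of the kind of sharding layer people are talking about.  The connection setup, the 'orders' table and the key names are invented for illustration, SQLite stands in for any SQL database, and the routing is deliberately naive:

    # A naive sharding layer: route each customer to one of N SQL databases by
    # hashing the shard key. SQLite stands in for any SQL database here.
    import hashlib
    import sqlite3

    SHARDS = [sqlite3.connect("shard_%d.db" % i) for i in range(4)]
    for shard in SHARDS:
        shard.execute("CREATE TABLE IF NOT EXISTS orders (customer_id TEXT, total REAL)")

    def shard_for(customer_id):
        digest = hashlib.md5(customer_id.encode()).hexdigest()
        return SHARDS[int(digest, 16) % len(SHARDS)]

    def save_order(customer_id, total):
        db = shard_for(customer_id)
        db.execute("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                   (customer_id, total))
        db.commit()

    # The 'SQLness' that gets lost: a query like
    #   SELECT customer_id, SUM(total) FROM orders GROUP BY customer_id
    # now has to be run against every shard and merged in application code,
    # and cross-shard joins, foreign keys and transactions go with it.

Whether the split is done by hand like this or hidden behind a cleverer layer, the guarantees that made SQL the comfortable default stop coming for free.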

So the pursuit of the holy grail of a scaled out SQL database is impossible.  Even if some really smart engineers and mathematicians are able to crack the problem (by their technically and academically correct definition of what a SQL database is), some DBA or developer in some IT shop somewhere is going to be pulling their hair out thinking that this new SQL doesn’t work the way it is supposed to.

What is needed is a gradual introduction of the alternatives and the education of architects as to what to use SQL for and what not to – within the same solution.  Just like you don’t need to store all of your video clips in database blob fields, there are other scenarios where SQL is not the only option.  Thinking about how to architect systems that run on smaller hardware, without the safety net of huge database servers, is quite challenging and is an area that we need to continuously discuss, debate and look at in more detail.
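As a small, hypothetical illustration of that mixed approach – the folder, table and helper names are invented, and a local directory stands in for a blob store – keep the clip's queryable metadata in SQL and put the heavy binary content elsewhere, with only a pointer to it in the database:

    # Hypothetical split of responsibilities: relational metadata in SQL,
    # large binary content outside it. A local folder stands in for a blob store.
    import sqlite3
    from pathlib import Path
    from uuid import uuid4

    MEDIA_DIR = Path("media_store")
    MEDIA_DIR.mkdir(exist_ok=True)

    db = sqlite3.connect("catalogue.db")
    db.execute("CREATE TABLE IF NOT EXISTS clips "
               "(id INTEGER PRIMARY KEY, title TEXT, blob_uri TEXT)")

    def save_clip(title, video_bytes):
        # The large payload goes to cheap, scalable storage...
        blob_path = MEDIA_DIR / ("%s.mp4" % uuid4().hex)
        blob_path.write_bytes(video_bytes)
        # ...and SQL keeps only the facts worth querying, plus a pointer.
        cursor = db.execute("INSERT INTO clips (title, blob_uri) VALUES (?, ?)",
                            (title, str(blob_path)))
        db.commit()
        return cursor.lastrowid

The database stays small and queryable, the bulky content sits where storage is cheap, and neither is forced to pretend to be the other.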

The days when we could assume that SQL will do everything for us are over and, like motorcyclists, we need to choose the right technology or else we will fall off.

Simon Munro

@simonmunro
