
An investigation triggered by the lack of support of spatial data in SQL Azure has left me with the (unconfirmed) opinion that although requested by customers, the support of spatial data in SQL Azure may not be good enough to handle the requirements of a scalable solution that has mapping functionality as a primary feature.

Update: SQL Azure now has spatial support. The arguments made in this post are still valid and use of spatial features in SQL Azure should be carefully considered.

I have been asked to investigate the viability of developing a greenfields application in Azure as an alternative to the currently proposed traditional hosting architecture.  The application is a high load, public facing, map enabled application and the ability to do spatial queries is near the top of the list of absolute requirements.  Mapping features from the traditionally hosted architecture to Azure is fine until reaching SQL 2008’s spatial types and features, which are unsupported under SQL Azure – triggering further investigation.

It would seem that the main reason why spatial features are not supported in SQL Azure is that those features make use of functions which run within SQLCLR, which is also unsupported in SQL Azure.  The lack of support for SQLCLR is understandable to a degree due to how SQL Azure is set up – messing around with SQLCLR on multitenant databases could be a little tricky.

The one piece of good news is that some of the assemblies used by the spatial features in SQLCLR are available for .NET developers to use and are installed into the GAC on some distributions (R2 amongst them) and people have been able to successfully make use of spatial types using SQL originated/shared managed code libraries.  Johannes Kebeck, the Bing maps guru from MS in the UK, has blogged on making use of these assemblies and doing spatial oriented work in Azure.

So far, it seems like there may be a solution or workaround to the lack of spatial support in SQL Azure as some of the code can be written in C#.  However, further investigation reveals that those assemblies contain only the types and some mathematics surrounding them.  The key part of the whole process, a spatial index, remains firmly locked away in SQL Server, and the inability to query spatial data takes a lot of the goodness out of the solution.

No worries, one would think – all that you need to do is get some view into the roadmap of SQL Azure support for SQL 2008 functionality and you can plan or figure it out accordingly.  After all, on the Microsoft initiated, supported and sanctioned SQL Azure User Voice website mygreatsqlazureidea.com, the feature ‘Support Spatial Data Types and SQLCLR’ comes out at a fairly high position five on the list with the insightful comment ‘Spatial in the cloud is the killer app for SQL Azure. Especially with the proliferation of personal GPS systems.’ The SQL Azure team could hardly ignore that observation and support – putting it somewhere up there on their product backlog.

When native support for spatial data in SQL Azure is planned is another matter entirely and those of us on the outside can only speculate.  You could ask Microsoft directly or indirectly, or even try to get your nearest MVP really drunk so that, when offered the choice between breaking their NDA and having compromising pictures put up on Facebook, they choose the former.

Update: You can use your drunk MVP to try and glean other information, as it was announced that SQL Azure will support spatial data in June 2010: http://blogs.msdn.com/sqlazure/archive/2010/03/19/9981936.aspx and http://blogs.msdn.com/edkatibah/archive/2010/03/21/spatial-data-support-coming-to-sql-azure.aspx (see comments below).  This is not a solution to all geo-aware cloud applications, so I encourage you to read on.

I have n-th hand unsubstantiated news that the drastic improvements for spatial features in SQL 2008 R2 were made by taking some of the functionality out of SQLCLR functions and putting them directly into the SQL runtime which means that even a slightly deprecated version of SQL Azure based on R2, which I think is inevitable, would likely have better support for spatial data.

Update:  In the comments below, Ed Katibah from Microsoft, confirms that the spatial data support is provided by SQL CLR functionality and not part of the R2 runtime.

In assessing this project’s viability as an Azure solution, I needed to understand a little bit more about what was being sacrificed by not having SQL spatial support and am of the opinion that it is possibly a benefit.

Stepping back a bit, perhaps it is worthwhile trying to understand why SQL has support for spatial data in the first place.  After all, it only came in SQL 2008, mapping and other spatial applications have been around longer than that and, to be honest, I haven’t come across many solutions that use the functionality.  To me, SQL support of spatial data is BI Bling – you can, relatively cheaply (by throwing a table of co-ordinates against postal codes and mapping your organization’s regions) have instant, cool looking, pivot tables, graphs, charts and other things that are useful in business.  In other words, the addition of spatial support adds a lot of value to existing data whose transactions do not really have a spatial angle.  The spatial result is a side effect of (say) the postal code, which is captured for delivery reasons rather than explicit BI benefits.

The ability to pimp up your sales reports with maps, while a great feature that will sell a lot of licences, probably belongs as a feature of SQL Server (rather than the reporting tool).  However, I question the value of using SQL as the spatial engine for an application that has spatial functionality as a primary feature.  You only have to think about Google maps, streetview and directions with the sheer scale of the solution and the millions of lives it affects and ask yourself whether or not behind all the magic there is some great big SQL database serving up the data.  Without knowing or Googling the answer, I would suggest with 100% confidence that the answer is clearly ‘No’.

So getting back to my Azure viability assessment, I found myself asking the question.

If SQL Azure had spatial support, would I use it in an application where the primary UI and feature set is map and spatially oriented?

But before answering that I asked,

Would I propose an architecture that used SQL spatial features as the primary spatial data capability for a traditionally hosted application where the primary UI and feature set is map and spatially oriented?

The short answer to both questions is a tentative no.  Allow me to provide the longer answer.

The first thing to notice about spatial data is that the things whose location you are interested in don’t really move around much.  The directions from Nelson’s Column to Westminster Abbey are not going to change much and neither are the points of interest along the way.  In business you have similar behaviour – customers’ delivery addresses don’t move around much and neither do your offices, staff and reporting regions.  The second thing about spatial data is the need to have indexes so that queries, such as the closest restaurants to a particular point, can be done against the data.  Spatial indexes solve this problem by providing tree-like indexing in order to group together co-located points.  These indexes are multidimensional in nature and a bit more complex than the flatter indexes that we are used to with tabular data.

Because of the slow pace at which coastlines, rivers, mountains and large buildings move around, the need to have dynamically updated spatial data, and hence their indexes, is quite low.  So while algorithms exist to add data to spatial indexes, performing inserts is quite expensive, so in many cases indexes can be rebuilt from scratch whenever there is a bulk modification or insert of the underlying data.

So while SQL Server 2008 manages spatial indexes as with any other index, namely by updating the index when underlying data changes, I call into question the need for having such functionality for data that is going to seldom change.

If data has a low rate of change, spatial or not, it becomes a candidate for caching, and highly scalable websites have caching at the core of their solutions (or problems, depending on how much they have).  So if I were to scale out my solution, is it possible to cache the relatively static data and the spatial indexes into some other data store that is potentially distributed across many nodes of my network?  Unfortunately, unlike a simple structure like a table, the data within a spatial index (we are talking about the index here and not the underlying data) is wrapped up closely to the process or library that created it.  So, in the case of SQL Server, the spatial index is simply not accessible from anywhere other than SQL Server itself.  This means that I am unable to cache or distribute the spatial indexes unless I replicate the data to another SQL instance and rebuild the index on that instance.
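To make the contrast concrete, here is a minimal sketch of a spatial index that is not tied to the process that built it.  It is a uniform grid (deliberately simpler than an R-Tree or SQL Server's own scheme), rebuilt from scratch in one pass and held as a plain dictionary, so it can be serialised and shipped to other nodes.  The function names, the cell size and the sample co-ordinates are all assumptions made up for illustration, and `nearby` only returns candidates from neighbouring cells rather than a true nearest-neighbour result.

```python
import pickle
from collections import defaultdict

CELL_SIZE = 0.01  # degrees; hypothetical cell size chosen for illustration


def cell_of(lat, lon, cell_size=CELL_SIZE):
    """Map a coordinate to the integer grid cell that contains it."""
    return (int(lat // cell_size), int(lon // cell_size))


def build_grid_index(points, cell_size=CELL_SIZE):
    """Build the whole index from scratch: cell -> list of (name, lat, lon)."""
    index = defaultdict(list)
    for name, lat, lon in points:
        index[cell_of(lat, lon, cell_size)].append((name, lat, lon))
    return dict(index)  # plain dict, so it serialises without surprises


def nearby(index, lat, lon, cell_size=CELL_SIZE):
    """Return candidate points from the cell containing (lat, lon) and its 8 neighbours."""
    row, col = cell_of(lat, lon, cell_size)
    found = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            found.extend(index.get((row + d_row, col + d_col), []))
    return found


# Made-up sample data, roughly central London.
points = [
    ("Nelson's Column", 51.5077, -0.1279),
    ("Westminster Abbey", 51.4993, -0.1273),
    ("Tower Bridge", 51.5055, -0.0754),
]
index = build_grid_index(points)

# The index is ordinary data: serialise it, ship it to another node, load it.
restored = pickle.loads(pickle.dumps(index))
names = sorted(name for name, _, _ in nearby(restored, 51.5070, -0.1280))
```

Because the index is just data, a bulk import can be followed by a single call to `build_grid_index`, and the result can be cached wherever it is needed.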

So while I respect the functionality that SQL Server offers with spatial indexing, I question the value of having to access indexed data in SQL Server just because it seems to be the most convenient place to access the required functionality (at least for a Microsoft biased developer).  If my application is map oriented (as opposed to BI bling), how can I be sure that I won’t run into a brick wall with SQL Server, and with spatial indexes in particular?  SQL Server is traditionally known as a bottleneck in any solution, and putting my core functionality into that bottleneck before I have even started, and without much room to manoeuvre, is a bit concerning.

I should be able to spin up spatial indexes wherever I want to and in a way that is optimal for a solution.  Perhaps I can have indexes that focus on the entire area at a high level and can generate lower level ones as required.  Maybe I can pre-populate some indexes for popular areas or if an event is going to take place in a certain area.  Maybe I am importing data points all of the time and don’t want SQL spending time churning indexes as data, which I am not interested in yet, is being imported.  Maybe I want to put indexes on my rich client so that the user has a lightning fast experience as they scratch around in a tiny little part of the world that interests them.

In short, maybe I want a degree of architectural and development control over my spatial data that is not provided by SQL’s monolithic approach to data.

This led me to investigating other ways of dealing with spatial data (generally), but more specifically spatial indexes.  Unsurprisingly there are a lot of algorithms and libraries out there that seem to have their roots in a C and Unix world.  The area of spatial indexing is not new and a number of algorithms have emerged as popular mechanisms to build spatial indexes.  The two most popular are R-Tree (think B-Tree for spatial data) and Quadtree (where a tree is built up by dividing areas into quadrants).
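To give a feel for the Quadtree idea, here is a minimal sketch of a point quadtree: each node covers a square, holds a few points, and splits into four quadrants when it fills up.  The class, its parameters (`capacity`, `max_depth`) and the sample points are my own assumptions for illustration and are not taken from any particular library; a production index would need more care around floating point boundaries and bulk loading.

```python
class Quadtree:
    """Point quadtree over a square region; splits into 4 quadrants when full."""

    def __init__(self, x, y, size, capacity=4, depth=0, max_depth=16):
        self.x, self.y, self.size = x, y, size  # lower-left corner and side length
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth  # stops endless splitting on duplicate points
        self.points = []
        self.children = None  # the four sub-quadrants, once split

    def contains(self, px, py):
        return (self.x <= px < self.x + self.size and
                self.y <= py < self.y + self.size)

    def insert(self, px, py, label):
        if not self.contains(px, py):
            return False
        if self.children is None:
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append((px, py, label))
                return True
            self._split()
        # Exactly one child contains the point; any() stops at the first success.
        return any(child.insert(px, py, label) for child in self.children)

    def _split(self):
        half = self.size / 2
        args = (half, self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            Quadtree(self.x, self.y, *args),
            Quadtree(self.x + half, self.y, *args),
            Quadtree(self.x, self.y + half, *args),
            Quadtree(self.x + half, self.y + half, *args),
        ]
        # Push the points held here down into the new quadrants.
        for px, py, label in self.points:
            any(child.insert(px, py, label) for child in self.children)
        self.points = []

    def query(self, qx, qy, qsize):
        """Return labels of points inside the square window [qx, qx+qsize) x [qy, qy+qsize)."""
        # Skip this node entirely if the window does not overlap it.
        if (qx >= self.x + self.size or qx + qsize <= self.x or
                qy >= self.y + self.size or qy + qsize <= self.y):
            return []
        found = [label for px, py, label in self.points
                 if qx <= px < qx + qsize and qy <= py < qy + qsize]
        if self.children is not None:
            for child in self.children:
                found.extend(child.query(qx, qy, qsize))
        return found


tree = Quadtree(0, 0, 100, capacity=2)
for px, py, label in [(10, 10, "a"), (12, 14, "b"), (80, 80, "c"),
                      (15, 11, "d"), (60, 20, "e")]:
    tree.insert(px, py, label)

hits = sorted(tree.query(5, 5, 15))  # window covers x and y in [5, 20)
```

The point of the structure is in `query`: whole quadrants that do not overlap the search window are discarded without looking at their points.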

There is a wealth of information on these fairly well understood algorithms and even Microsoft’s own implementations do not fall far from these algorithms.  Bing maps uses ‘QuadKeys’ to index tiles, seemingly referring to the underlying Quadtree index.  (SQL Server is a bit different though: it uses a four level grid indexing mechanism that is non recursive and uses tessellation to set the granularity of the grid.)
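As an illustration of the QuadKey idea, the sketch below follows the scheme as I understand it from Microsoft's published description of the Bing maps tile system: the bits of a tile's X and Y co-ordinates are interleaved into a base-4 string with one digit per zoom level.  The function names are my own, and this is a sketch rather than Microsoft's code.

```python
def tile_to_quadkey(tile_x, tile_y, level):
    """Interleave the bits of tile X/Y into a base-4 string, one digit per zoom level."""
    digits = []
    for i in range(level, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if tile_x & mask:
            digit += 1  # X bit selects the east half of the quadrant
        if tile_y & mask:
            digit += 2  # Y bit selects the south half of the quadrant
        digits.append(str(digit))
    return "".join(digits)


def parent_quadkey(quadkey):
    """The parent tile's key is the child's key without its last digit."""
    return quadkey[:-1]


key = tile_to_quadkey(3, 5, 3)
```

The useful property is that a tile's key starts with the key of its parent, so a plain string prefix match finds everything inside a larger tile, which is what makes the keys usable in ordinary one-dimensional indexes.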

So if all of this spatial data stuff is old hat, surely there are some libraries available for implementing your own spatial indexes in managed code?  It seems that there are some well used open source libraries and tools available.  Many commercial products and Sharpmap, an OSS GIS library, make use of NetTopologySuite, a direct port of the Java based JTS.  These libraries have a lot of spatial oriented functions, most of which only make vague sense to me, including a read only R-Tree implementation.

Also, while scratching around, I got the sense that Python has emerged as the spatial/GIS language of choice (it makes sense considering all those C academics started using Python).  It seems that there are a lot of Python libraries out there that are potentially useful within a .NET world using IronPython.

It is still early in my investigation, but I can’t shake the feeling that making use of SQL 2008 for spatial indexing because that is the only hammer that Microsoft provides is not necessarily the best solution.  This is based on the following observations:

  • Handling of spatial data is not new – it is actually a mature part of computer science.  In fact SQL server was pretty slow to implement spatial support.
  • An RDBMS like SQL or Oracle may be a good place to store data, but not necessarily the best place to have your indexes.  The SQL bias towards data consistency and availability is counter to the demands of spatial data and their indexes.
  • In order to develop a map oriented solution, a fine degree of control over spatial data may be required to deliver the required functionality at scale.

While I am not against OSS, evaluating libraries can be risky and difficult and I am stunned at the lack of support for spatial data in managed code coming out of Microsoft.  Microsoft needs to pay attention to the demand for support of spatial data for developers (not just database report writers).  The advent of always connected geo-aware mobile devices, and their users’ familiarity with maps and satnav, will push the demand for applications that are supportive of geographic data.  It is not hard to picture teenagers demanding a map on their mobile devices that shows the real time location of their social network.

To support this impending demand, Microsoft needs to make spatial data a first class citizen of the .NET framework (system.spatial).  It wouldn’t take much, just get some engineers from SQL and Bing maps to talk to each other for a few weeks.  Microsoft, if you need some help with that, let me know.

In the meantime I will walk down the road of open source spatial libraries and let you know where that road leads.

Simon

@simonmunro

On 1 February 2010, when Microsoft Azure officially goes into production, the CTP version will come to an end.  In an instant, thousands of Azure apps in some of the remotest corners of the Internet, built with individual enthusiasm and energy, will wink out of existence – like the dying stars of a discarded alternative universe.

Sadly, the only people that will notice are the individual developers who took to Azure, figured out the samples and put something, anything, out there on The Cloud and beamed like proud fathers, remembering their first Hello World console app.  For the first time we were able to point to a badly designed web page that was, both technically and philosophically, In The Cloud.  Even though the people that we showed barely gave it a second look (it is, after all, unremarkable on the surface) we left it up and running for all the world to see.

Now, Microsoft, returning to its core principles of being aggressively commercial, is taking away the Azure privilege and leaving the once enthusiastic developers feeling like petulant children the week after Easter – where the relaxing of the chocolate rations has come to an end.  Now, developers are being asked to put in their credit cards to make use of Azure – even the free one.  Now I don’t know about anyone else’s experiences, but in mine ‘free’ followed by ‘credit card details please’ smells like a honey trap.

So it’s not enough that we have to scramble up the learning curve of Azure, install the tools and figure things out all on our own time, we now also have to hand over our credit card details to a large multinational that has a business model that keeps consumers at arm’s length, is intent on making money, and may give you a bill for an indeterminable amount of computing resources consumed – all for which you are personally liable.

Gulp! No thanks, I’ll keep my credit card to myself if you don’t mind.

The nature of Azure development up until now and until adoption becomes mainstream is that most Azure development has no commercial benefit for the developers.  While some companies are working on Azure ‘stuff’, there is very little in the way of Azure apps out there in the wild and even fewer customers who are prepared to pay for Azure development… yet.  A lot of the Azure ‘development’ that I am aware of has been done by individuals, in their own time, on side projects as they play with Azure to get on the cloud wave, enhance their understanding or simply try something different.

While I understand Microsoft’s commercial aspirations, the financial commitments expected from Azure ‘hobbyists’ run the risk of choking the biggest source of interest, enthusiasm and publicity – the after hours developer.  Perhaps the people in the Azure silo who are commenting ‘Good riddance to the CTP developers, they were using up all of these VMs and getting no traffic’ have not seen the Steve Ballmer ‘Developers! Developers! Developers!’ monkey dance that (embarrassingly) acknowledges the value of the influence of developers who are committed to a single platform (Windows).

It comes as no surprise that the number one feature voted for in the Microsoft initiated ‘Windows Azure Feature Voting Forum’ is ‘Make it less expensive to run my very small service on Windows Azure’ followed by ‘Continue Azure offering free for Developers’ – the third spot has less than a quarter as many votes.  But it seems that nobody is listening – instead they are rubbing their hands in glee, waiting for the launch and expecting the CTP goodwill to turn into credit card details.

Of course there is a limp-dicked ‘free’ account that will suggestively start rubbing up against your already captured credit card details after 25 hours of use (maybe).  There is also some half-cocked free-ish version for MSDN subscribers – for those that are fortunate enough to get their employers to hand over the keys (maybe).  So there are roundabout ways that a developer can find a way of getting themselves up and running on the Azure platform but it may just be too much hassle and risk to bother.

Personally, I didn’t expect it to happen this way, secretly hoping that @smarx or someone on our side would storm the corporate fortress and save us from their short sightedness and greed.  But alas, the regime persists – material has been produced, sales people are trained and the Microsoft Azure army is in motion.  There won’t even be a big battle.  Our insignificant little apps will simply walk up, disarmed, to their masters with their heads hung in shame and as punishment for not being the next killer app, they will be terminated – without so much as a display of severed heads in the town square.

Farewell Tweetpoll, RESTful Northwind, Catfax and others.

We weren’t given a chance to know you.  You are unworthy.

Simon

@simonmunro

As the official release of Azure looms, and the initial pricing model is understood, a lot of technical people are crunching numbers to see how much it will cost to host a solution on Azure.  It seems that most of the people doing the comparisons are doing them against smaller solutions to be hosted, not in some corporate on-premise data centre, but on any one of hundreds of public .NET hosting providers out there.

This is not surprising since the type of person that is looking at the pre-release version of Azure is also the kind of person that has hundreds of ideas for the next killer website, if only they could find the time and find someone who is a good designer to help them (disclaimer:  I am probably one of those people).  So they look at the pricing model from the perspective of someone who has virtually no experience in running a business and is so technically capable that they have misconceptions about how a small business would operate and maintain a website.

Unsurprisingly they find that Azure works out more expensive than the cost of (perceived) equivalent traditional hosting. So you get statements like this:

“If you add all these up, that’s a Total of $98.04! And that looks like the very minimum cost of hosting an average "small" app/website on Azure. That surely doesn’t make me want to switch my DiscountASP.NET and GoDaddy.com hosting accounts over to Windows Azure.” Chris Pietschmann

Everyone seems shocked and surprised.

Windows Azure is different from traditional hosting, which means that Microsoft’s own financial models and those of their prospective customers are different.  You don’t have to think for very long to come up with some reasons why Microsoft does not price Azure to compete with traditional hosting…

  • Microsoft is a trusted brand.  Regardless of well publicised vulnerabilities (in the technical community) and a growing open source movement, in the mind of business Microsoft is considered low risk, feature rich and affordable.
  • Microsoft has invested in new datacentres and the divisions that own them need to have a financial model that demonstrates a worthwhile investment.  I doubt that in the current economic climate Wall Street is ready for another XBox-like loss leader. (This is also probably the reason why Microsoft is reluctant to package an on-premise Azure)
  • Azure is a premium product that offers parts of the overall solution that are lacking in your average cut-rate hosting environment.

Back to the alpha geeks that are making observations about the pricing of Azure.  Most of them have made the time to look at the technology outside their day job.  They either have ambitions to do something ‘on their own’, are doing it on the side in a large enterprise or, in a few cases, are dedicated to assessing it as an offering for their ISV.

They are not the target market.  Yet.

Azure seems to be marketed at the small to medium businesses that do not have, want or need much in the way of internal, or even contracted, IT services and skills.  Maybe they’ll have an underpaid desktop support type of person who can run around the office getting the owner/manager’s email working – but that is about it. (Another market is the rogue enterprise departments that, for tactical reasons, specifically want to bypass enterprise IT – but they behave similarly to smaller businesses.)

Enterprise cloud vendors, commentators and analysts endlessly debate the potential cost savings of the cloud versus established on-premise data centres.  Meanwhile, smaller businesses, whose data centre consists of little more than a broadband wireless router and a cupboard, don’t care much about enterprise cloud discussions.  In addressing the needs of the smaller business, Windows Azure comes with some crucial components that are generally lacking in traditional hosting offerings:

  • As a Platform as a Service (PaaS), there are no low level technical operations that you can do on Azure – which also means that they are taken care of for you.  There is no need to download, test and install patches.  No network configuration and firewall administration.  No need to perform maintenance tasks like clearing up temporary files, logs and general clutter.  In a single tenant co-location hosting scenario this costs extra money as it is not automated and requires a skilled person to perform the tasks.
  • The architecture of Azure, where data is copied across multiple nodes, provides a form of automated backup.  Whether or not this is sufficient (we would like a .bak file of our database on a local disk), the idea and message that it is ‘always backed up’ is reassuring to the small business.
  • The cost/benefit model of Azure’s high availability (HA) offering is compelling.  I challenge anybody to build a 99.95% available web and database server for a couple of hundred dollars a month at a traditional hosting facility or even in a corporate datacentre (this is from the Azure web SLA and works out to roughly 21 minutes of downtime in a 30-day month).  The degree of availability of a solution needs to be backed up by a business case and often, once the costs are tabled, business will put up with a day or two of downtime in order to save money.  Azure promises significant availability in the box and at the price could be easily justified against the loss of a handful of orders or even a single customer.
  • Much is made of the scalability of Azure and it is a good feature to have in hand for any ambitious small business and financially meaningful for a business that has expected peaks in load.  Related to the scalability is the speed at which you can provision a solution on Azure (scaling from 0 to 1 instances).  Being able to do this within a few minutes, together with all the other features, such as availability, is a big deal because the small business can delay the commitment of budget to the platform until the last responsible moment.
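For anyone who wants to check the downtime figure quoted for the 99.95% SLA, the arithmetic is simple enough to sketch.  The 30-day month and the function name are my own assumptions; the SLA itself may define its measurement period differently.

```python
def allowed_downtime_minutes(availability_percent, days=30):
    """Minutes of downtime permitted over a period at the given availability."""
    total_minutes = days * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (1 - availability_percent / 100)


downtime = allowed_downtime_minutes(99.95)  # about 21.6 minutes
three_nines = allowed_downtime_minutes(99.9)  # about 43.2 minutes
```

Seen this way, each extra fraction of a percent of availability halves or better the minutes you are allowed to be down, which is where the cost of building it yourself comes from.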

So there are a whole lot of features that need to be communicated to the market – almost like ‘you qualify for free shipping’ when buying a book online, where the consumer is directed to the added value that they understand.

The catch is that the target market does not understand high availability the same way that everyone understands free shipping.  The target market for Azure doesn’t even know that Azure exists, or care – they have a business to run and a website to launch.  Those technical details need to be sorted out by technical people who need to produce the convincing proposal.

The obvious strength that Microsoft has over other cloud vendors is their channel.  Amazon and Google barely have a channel for sales, training and development of cloud solutions – besides, that is not even their core business.  Microsoft has thousands of partners, ISV’s, trainers and a huge loyal following of developers. 

In targeting the small to medium business, Microsoft is pitching Azure at the ISV’s.  The smaller business without internal development capabilities will turn to external expertise, often in the shape of a reputable organization (as opposed to contractors), for solutions – and the ISV’s fulfil that role.  So to get significant traction on Azure, Microsoft needs to convince the ISV’s of the benefits of Azure and, as this post tries to illustrate, some of the details of the financial considerations of the small business and their related technology choices.

Microsoft needs to convince the geeks out there that there is a whole lot more that comes with Azure, that it is very important to smaller businesses, and that it is not available from traditional hosting. So Microsoft needs to help us understand the costs, and not just the technology, in order for us to convince our customers that although Azure is not cheap, it makes good financial sense.

Simon Munro

@simonmunro
