Friday, January 29, 2010

Non-Relational Storage/Databases and Consistency

With the explosion of cloud computing, non-relational storage systems and databases (e.g. Google's BigTable) have gained attention as a way to scale systems that serve thousands of clients/customers (I've found this web page that collects projects related to non-relational databases). However, most of the frameworks/PaaS for developing applications on top of this new kind of storage force the developer to use a shared-nothing approach (e.g. Google App Engine and Microsoft's Azure). This means that no state can be stored in the application between two invocations from the same client.
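To make the shared-nothing constraint concrete, here is a minimal sketch in Python. It is not the API of App Engine, Azure or BigTable: `ExternalStore` and `handle_request` are hypothetical names, and the in-memory dictionary only stands in for a real non-relational store. The point is that the handler keeps nothing between invocations, so every piece of per-client state has to go through the store.

```python
class ExternalStore:
    """Hypothetical stand-in for a non-relational key-value store."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


def handle_request(store, client_id):
    """Stateless handler: per-client state is read from and written back to the store.

    Nothing is kept in the application itself, so any server instance
    could serve the next invocation from the same client.
    """
    visits = store.get(client_id, 0) + 1
    store.put(client_id, visits)
    return visits
```

Calling `handle_request(store, "alice")` twice against the same store returns 1 and then 2, even though the function itself remembers nothing.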

Part of the work done for my PhD thesis was about consistently scaling stateful applications running in clusters of servers based on multi-tier architectures (in a LAN). We use transparent replication, in order to hide the complexity of the replicated architecture from the clients, and guarantee snapshot isolation for the data being accessed (stored in relational DBs). In his blog, Werner Vogels (Amazon's CTO) goes one step further and discusses consistency in the context of stateful applications and the requirements of cloud computing. Vogels' view on this topic has been applied in projects at Amazon such as Dynamo. Whilst our approach ensures strong consistency, Amazon's approach is eventually consistent.
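The difference between the two consistency models can be illustrated with a toy sketch in Python. This is neither the replication protocol from the thesis nor Dynamo's design; the class names and the explicit `sync()` step are assumptions made purely for illustration. In the eventually consistent store a write reaches one replica first and the others later, so a read from another replica may be stale for a while; in the strongly consistent store a write is applied to every replica before it returns.

```python
class Replica:
    """One copy of the data."""

    def __init__(self):
        self.data = {}


class EventuallyConsistentStore:
    """Writes go to replica 0 and are propagated later by sync()."""

    def __init__(self, n=2):
        self.replicas = [Replica() for _ in range(n)]
        self.pending = []  # updates not yet propagated to the other replicas

    def write(self, key, value):
        self.replicas[0].data[key] = value
        self.pending.append((key, value))

    def read(self, key, replica=0):
        return self.replicas[replica].data.get(key)

    def sync(self):
        # Propagate pending updates, in order, to all other replicas.
        for key, value in self.pending:
            for r in self.replicas[1:]:
                r.data[key] = value
        self.pending.clear()


class StronglyConsistentStore(EventuallyConsistentStore):
    """Writes are applied to every replica before returning."""

    def write(self, key, value):
        for r in self.replicas:
            r.data[key] = value
```

With the eventually consistent store, a read from replica 1 right after a write returns `None` until `sync()` has run; with the strongly consistent store, the same read sees the new value immediately.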
