Saturday, October 2, 2010

Google Cloud Vs Amazon Cloud - An architectural perspective - Part 1

In this series of blog entries, I'll explore the two main cloud platforms available to Java developers today: Google's Cloud offering (called the Google App Engine) and Amazon's Cloud offering (called Amazon Web Services).  I'll make the comparison from a technical perspective.  If you are a developer/architect evaluating these options, you might find this useful.

Google App Engine (GAE)

Google App Engine (GAE) is Google's solution to your cloud computing requirements. GAE follows a container-based architecture. If you are a Java developer, you are probably familiar with the concept of a Servlet container or a Java EE container. In a container-based architecture, you write a component based on certain interfaces and drop (deploy) it into the container. The container then manages the component and its life cycle for you. For example, you write a Servlet component and drop it into your servlet container. On a web request, the servlet container initializes your servlet (if it has not already done so) and calls the doGet method for you.
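To make the container idea concrete, here is a minimal sketch in plain Java. It is not the Servlet API or anything GAE ships: the names Component, ToyContainer and HelloComponent are made up for illustration. The only point is that the container, not your code, decides when a component is initialized and invoked.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical contract that a component must implement to live in the container.
interface Component {
    void init();                    // called once by the container, before the first request
    String handle(String request);  // called by the container for every request
}

// A toy container: it owns the life cycle of every deployed component.
class ToyContainer {
    private final Map<String, Component> deployed = new HashMap<>();
    private final Set<String> initialized = new HashSet<>();

    // "Dropping" a component into the container.
    void deploy(String path, Component component) {
        deployed.put(path, component);
    }

    // Route a request to the component at the given path, initializing it on first use.
    String dispatch(String path, String request) {
        Component component = deployed.get(path);
        if (component == null) {
            return "404 Not Found";
        }
        if (initialized.add(path)) {   // add() returns true only the first time
            component.init();
        }
        return component.handle(request);
    }
}

// The part you would write: no main loop, no socket handling, just the contract.
class HelloComponent implements Component {
    final List<String> events = new ArrayList<>();  // records what the container did to us

    public void init() {
        events.add("init");
    }

    public String handle(String request) {
        events.add("handle");
        return "Hello, " + request;
    }
}

class ContainerDemo {
    public static void main(String[] args) {
        ToyContainer container = new ToyContainer();
        container.deploy("/hello", new HelloComponent());
        System.out.println(container.dispatch("/hello", "world"));
    }
}
```

A real servlet container does far more (threading, sessions, class loading), but the division of labour is the same: you implement the interface, the container calls it.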

The Google App Engine (GAE) is also a container. You write a Google App Engine App (GAE App), drop (deploy) it into the Google App Engine, and the App Engine manages the app for you. For example, you can build an online discussion forum as a GAE app. You deploy it to GAE and let GAE handle the provisioning for you. When users hit your application, GAE will automatically load your GAE app and serve the requests. If there are no users accessing your application, GAE puts your application to "sleep", saving you from using up valuable CPU cycles. On a new web request, GAE can "wake up" your application and service the request. If your application receives a large number of requests, GAE will replicate your application across multiple servers, or "scale it up", to meet the demand. Likewise, as your load eases, GAE can "scale it down". All this is done with no intervention from you. That's the promise of GAE.
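The sleep / wake up / scale up / scale down behaviour can be pictured with a toy model. This is only a sketch: I don't know how GAE's scheduler actually decides, and the ToyScaler class and the "100 requests per second per instance" figure are assumptions picked purely for illustration.

```java
// Toy model of load-based scaling. NOT how GAE's scheduler is implemented.
class ToyScaler {
    // Hypothetical: how many requests per second a single instance can serve.
    private final int capacityPerInstance;

    ToyScaler(int capacityPerInstance) {
        if (capacityPerInstance <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacityPerInstance = capacityPerInstance;
    }

    // Number of instances needed to serve the given load.
    int instancesFor(int requestsPerSecond) {
        if (requestsPerSecond <= 0) {
            return 0;  // no traffic: the app "sleeps" and uses no instances
        }
        // Ceiling division: a partially loaded instance still counts as one.
        return (requestsPerSecond + capacityPerInstance - 1) / capacityPerInstance;
    }
}

class ScalerDemo {
    public static void main(String[] args) {
        ToyScaler scaler = new ToyScaler(100);
        int[] load = {0, 1, 250, 1000, 40, 0};  // made-up traffic over time
        for (int rps : load) {
            System.out.println(rps + " req/s -> " + scaler.instancesFor(rps) + " instance(s)");
        }
    }
}
```

The interesting part is not the arithmetic but who runs it: on GAE the platform makes this decision for you on every change in load.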

GAE also offers a variety of services that are available to your GAE app. Need a database? Instead of setting up your own databases and then replicating/clustering them, use the Datastore API and Google will handle those tasks for you. Need to store large media files? Use the Blobstore service. Need a super fast, temporary data store? Use Memcache.
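One common way to combine a durable store with a fast, temporary one is the cache-aside pattern. The sketch below uses in-memory stand-ins (FakeDatastore, FakeCache, ForumPosts are hypothetical names, not the GAE Datastore or Memcache APIs) so that it runs anywhere; with the real services the shape of the load method would be similar, but the calls would differ.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for a durable store such as the Datastore (hypothetical).
class FakeDatastore {
    private final Map<String, String> rows = new HashMap<>();
    int reads = 0;  // counts how often we had to hit the "slow" store

    void put(String key, String value) { rows.put(key, value); }

    String get(String key) {
        reads++;
        return rows.get(key);
    }
}

// In-memory stand-in for a fast, temporary cache such as Memcache (hypothetical).
class FakeCache {
    private final Map<String, String> entries = new HashMap<>();

    void put(String key, String value) { entries.put(key, value); }
    String get(String key) { return entries.get(key); }
    void clear() { entries.clear(); }  // cached data may disappear at any time
}

// Cache-aside: try the cache first, fall back to the store, then repopulate the cache.
class ForumPosts {
    private final FakeDatastore store;
    private final FakeCache cache;

    ForumPosts(FakeDatastore store, FakeCache cache) {
        this.store = store;
        this.cache = cache;
    }

    void save(String id, String body) {
        store.put(id, body);  // the store is the source of truth
        cache.put(id, body);  // keep the cache warm for the next reader
    }

    String load(String id) {
        String cached = cache.get(id);
        if (cached != null) {
            return cached;            // fast path: no store read
        }
        String body = store.get(id);  // slow path
        if (body != null) {
            cache.put(id, body);
        }
        return body;
    }
}

class CacheAsideDemo {
    public static void main(String[] args) {
        FakeDatastore store = new FakeDatastore();
        FakeCache cache = new FakeCache();
        ForumPosts posts = new ForumPosts(store, cache);
        posts.save("post-1", "Hello forum");
        System.out.println(posts.load("post-1") + " (store reads: " + store.reads + ")");
    }
}
```

Because the cache is temporary, the code never treats a cache miss as "not found"; it always falls back to the durable store.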

The GAE App contract

GAE can make your app work only if your app adheres to a strict contract with the container. The contract consists of:

1. Programmatic interfaces
2. Programmatic restrictions



Programmatic Interfaces

GAE provides APIs to the services that it offers. You may use only the APIs/services provided by GAE in your GAE app. For example, for persistent data storage, GAE provides a Datastore. You must use the GAE-provided Datastore to persist your data. You cannot configure your own MySQL instance and use it on GAE.

Programmatic Restrictions

GAE places a lot of restrictions on what you can and cannot do in a GAE App. For example, you cannot write to the local filesystem (you can only read files that were deployed with your app). You are also not allowed to spawn your own threads in a GAE App. Some of these restrictions make sense. If you were to do file I/O in your GAE App (file I/O being local to the machine), how would you expect GAE to automatically provision servers for you during high load?
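The point about machine-local file I/O is easy to see in a small simulation. In this sketch (all names are hypothetical, and maps stand in for both a machine's own disk and a platform-provided shared service), two instances of the same app serve requests; data saved "locally" by one is invisible to the other, while data saved to the shared service is not.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a platform service that every instance can reach (hypothetical).
class SharedStore {
    private final Map<String, String> data = new HashMap<>();

    void put(String key, String value) { data.put(key, value); }
    String get(String key) { return data.get(key); }
}

// One running copy of the app on one machine.
class AppInstance {
    // Stand-in for this machine's own filesystem: other instances cannot see it.
    private final Map<String, String> localDisk = new HashMap<>();
    private final SharedStore shared;

    AppInstance(SharedStore shared) {
        this.shared = shared;
    }

    void saveLocally(String key, String value) { localDisk.put(key, value); }
    String readLocally(String key) { return localDisk.get(key); }

    void saveShared(String key, String value) { shared.put(key, value); }
    String readShared(String key) { return shared.get(key); }
}

class LocalStateDemo {
    public static void main(String[] args) {
        SharedStore shared = new SharedStore();
        AppInstance a = new AppInstance(shared);  // original instance
        AppInstance b = new AppInstance(shared);  // replica added under load

        a.saveLocally("avatar", "cat.png");
        a.saveShared("avatar", "cat.png");

        // The next request may land on instance b.
        System.out.println("local read on b:  " + b.readLocally("avatar"));
        System.out.println("shared read on b: " + b.readShared("avatar"));
    }
}
```

Once the platform is free to route any request to any replica, state kept on one machine's disk is effectively lost to the others, which is why the restriction goes hand in hand with automatic scaling.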

One of the unfortunate consequences of these restrictions is that not all libraries that you use in your regular Java SE or Java EE projects will work. You cannot use a library that does restricted file I/O anywhere (even if the library is not primarily meant for file I/O); the library will fail to run on GAE. The GAE community maintains a large list of libraries that are compatible/incompatible with GAE: Will it Play in App Engine

Where App Engine Shines

If your application can live with the programming interfaces and restrictions of GAE, then GAE would make an excellent choice as your cloud platform. It can relieve you of a lot of the pain related to server management, provisioning, sizing and administration. GAE is also a pay-per-use platform. This means you do not incur any upfront fees to build on GAE. GAE offers a significant free quota that can help you get started at no cost and pay as you scale.

6 comments:

  1. If you want to scale up your webapp, why do you want to save something to the local storage? For caching use memcache, for saving use the distributed storage.
  2. Good introduction. Can't wait for part 2.
  3. Great article... Eagerly waiting for next part.
  4. Great introduction - looking forward to Part 2.
  5. You most certainly can read from the filesystem.