Update 10/24/2008: It appears that the mcycle consumption of datastore operations are not actually counted against the soft cap and that GAE’s admin console is incorrectly including the consumption of datastore operations when issuing warnings about excessive mcycle consumption.
Update 12/16/2008: It starting to look like the mcycle consumption values I observed in GAE’s logs were accurate. The GAE team just released a new quota details page that shows both general CPU and datastore CPU and I’m seeing both get consumed pretty heavily by datastore puts. GAE also has not confirmed that issue 814, which tracks the apparent inaccuracy of the logs, is a real issue, but they did resolve issue 786 as a duplicate of 814 right quick.
Update 2/23/2009: Google removed the soft cap (“High CPU Requests” quota) on 2/13.
Well after getting my then current progress on Project Fangorn deployed to Google App Engine and spending some time playing it and fixing a few bugs, I started looking into the performance metrics in the logs. One in particular was literally raising red flags and that was number of megacycles (mcycles) used per request. Getting the overview page was taking around 9000 mcycles and issuing any command was taking anywhere from 12000 to 21000 depending on the command.
App Engine has a soft cap of 1000 mcycles. Its a soft cap as the first request that goes over isn’t kill by the infrastructure. Google hasn’t been specific about exactly when it kicks in other then that requests consuming more then 1000 mcycles can do it.
I started investigating the serving of the overview page first as that is the most frequently requested page. After some code restructuring and making use of Google’s memcache API I was able to get down from 9000 mcycles to an average of 368 mcycles. Good deal.
Next I started looking into where all of the mcycle consumption was on processing game commands. Turns out nearly all of it is GAE datastore put (i.e. save) calls. Putting the absolute simplest entity takes 715 mcycles, putting an entity with just 4 properties is taking 864 mcycles, and putting an entity with a list of 40 strings each of 14 characters in length takes 2140 mcycles. That’s only the time to put the entity. If you are doing an arithmetic update to an existing entity you need to get it first before putting it to ensure you are doing the math on the current state of the data. The net results if you have an entity with more then handful or properties to update you’ll consistently be over the 1000 mcycle cap when processing user input and you can forget about updating two entities during on request.
I’ve raised the issue on the GAE discussion group here and here and I’ve also created issue 786 in their issue database to track this issue, but so far no response from any GAE folks. For the time being I’m halting development on Project Fangorn, as I need to be able update multiple entities when processing a user’s command without GAE temporarily ceasing to serve my application.