
AppEngine Cold Startups - Rendezvous with Google App Engine Team

I was on the Google App Engine IRC chat today. Given the many concerns raised by others, and my own, I wanted to put my ideas about cold startups to the team. I tried to pitch a different approach to classloading, OSGi, profiling the runtime itself, and so on, but there was not enough time. I will probably bring it up again next time. For the most part, it looks like Google is almost ready to come out with a *reserved instances* approach. Here is what they had to say.

(9:59:52 PM) sar4j: This is about cold startups for Java. I run my website sarathonline.com on appengine py. But I am a Java developer, so I have to get a reasonable understanding of how gae/j works. To simulate the same performance, I am adding a test.js script to the existing website, so every time a visit comes to gae/py a hit happens to gae/j. I still see that gae/j is very aggressively spun down (given the number of requests is the same). Is there any algorithm for how and when spinning down happens?
.....
(10:04:17 PM) apijason_google: sar4j: We're definitely aware of the issues that Java developers are facing with cold startup times, especially since many Java frameworks exaggerate the loading time. We have reserved instances on the roadmap, which should allow you to pay to keep a minimum number of app instances warm, and we're working on other general improvements to increase performance. Right now, the only general way to ensure that your application is not
(10:04:18 PM) apijason_google: cycled out is to have steady traffic. However, this should be genuine traffic -- our systems know when you're just using a cron job to keep your app alive.
(10:05:14 PM) cgrinds: apijason_google, knows and doesn't keep it hot?
(10:05:56 PM) ikai_google: cgrinds: Not necessarily, but we discourage using cron jobs
(10:06:36 PM) sar4j: apijason_google: So you advise me to try a full-blown move to gae/java for a few days and see? Will your algorithm then spin down the app at longer intervals?
(10:06:37 PM) ikai_google: cgrinds: *using cron jobs purely as a mechanism to keep applications loaded
(10:06:51 PM) cgrinds: ikai_google, yep understood
(10:08:25 PM) prencher: apijason_google: consider this to be my lobbying request for you to lobby that it should be hot spares, not minimum instances (minimum instances just reduces the problem - hot spares eliminate it, mostly)
(10:08:38 PM) apijason_google: sar4j: If you have steady traffic, then your application should stay loaded. Are you using the Python runtime already?
(10:09:11 PM) sar4j: yes. it is much more realtime - than my study on gae/j
(10:09:38 PM) sar4j: my apps : sar (py) and sarjava
(10:10:12 PM) apijason_google: prencher: Understood. I think the work for this is largely underway already, but I'll pass on your request.
(10:10:18 PM) sar4j: so now if I want to move to java, I am a little hesitant only because of cold startups.
(10:11:37 PM) apijason_google: sar4j: With steady, real-time traffic, your Java app should perform just fine and shouldn't be cycled out. I'm interested in hearing your impressions if you decide to deploy a Java app vs. your already deployed Python app.
(10:11:39 PM) ikai_google: sar4j: You should base your solution on which language or tool fits you best. For many developers, this is Python. It sounds like you've got code written already
......
(10:13:56 PM) Wesley_google: sar4j: we're curious... what are the diffs in startup times b/w your Py vs. Java apps?
...
(10:19:11 PM) sar4j: Well, "steady, real-time traffic" is a little abstract. I get real traffic (regular search-generated traffic, etc., about 1 hit every 10 min or less) for my blog and site (they run resources and dynamic content off of sar.appspot, the Python version, right now). The Python runtime gives sub-2-sec responses consistently, and most times sub-second, while different versions of the Java apps (with Spring, without Spring, slim based, and just servlet based) all spin down aggressively and give 3-12 secs of response time EVERY time. They sometimes even time out. I understand what I am currently doing is as good as running a cron job, but I want to benchmark before I move to Java. So my question is: if this traffic can be considered steady, real-time traffic, and my app will spin down less (not live forever, but at least not go down for almost every request), then I am willing to go for it.
(10:22:15 PM) sar4j: Also, I noticed that the time taken after the webapp gets the hit (the real time of response) is very little (in msecs for a bare-minimum app and about 1-2 secs for a Spring app). It's the runtime bootup that takes most of the time.
..
(10:22:39 PM) apijason_google: sar4j: In that case, I would probably stick with your Python version, at least until reserved instances are available. At 1 hit every 10 minutes, it's likely that your Python app is getting spun down too. But the extra complexity and weight of the JVM means that Python instances spin up faster than Java instances, even discounting any frameworks you might be using.
(10:23:29 PM) sar4j: apijason_google:thanks. I guess I will do the same.
(10:25:15 PM) apijason_google: Five more minutes left -- get your questions in! :)
(10:25:26 PM) sar4j: However, if I may ask, I would like to know how the Jetty runtime works. Does it create a separate Java process every time a spin down/up happens, or is just the webapp turned off and on?
(10:26:24 PM) sar4j: What takes the most time for the runtime? The classloading, or the file IO for reading the jars, etc.? Do you have any such profiling info?
(10:27:03 PM) sar4j: the appstats gives profiling AFTER the runtime is up,
(10:28:20 PM) frew_google: sar4j: The same process is reused sometimes (you can see this experimentally by checking whether or not your global variables are still around from last time).
(10:28:55 PM) yodler12: Just wanted to comment that I am very excited for whatever upcoming new features you'll be announcing at IO. Having you google nerds working on that stuff reminds me why I chose App Engine - it's almost like having your own Googlers working for you!
(10:29:22 PM) sar4j: I ask because, if it's the classloading or the library overhead that is slowing down the runtime's bootup, maybe you should think in the direction of OSGi. Most libraries are the same. If an OSGi model is adopted, the runtime bootup takes less time and less CPU
(10:29:38 PM) apijason_google: sar4j: We don't have any public profiling info. currently. Maybe we can put this together at some point.
(10:29:41 PM) sar4j: saves green and saves the hassle of reserved instances.
....
(10:32:45 PM) frew_google: sar4j: You might want to look at http://www.answercow.com/2010/03/google-app-engine-cold-start-guide-for.html (not written by a Googler, and I haven't had a chance to verify the numbers, but the methodology of adding/removing libraries and checking cold start times is pretty sound if you're worried about class loading times).
....
(10:35:23 PM) apijason_google: OK Everyone, thanks for a great session. It's past 10:00, so today's office hour is officially complete, though a few of us may stick around for a few more minutes. The next chat session is in two weeks, Wednesday, May 5th, from 7:00-8:00 p.m. PST. Cheers!
....
(10:36:36 PM) sar4j: thanks frew_google, I went even further. I unjarred the libraries and put them directly under classes to check, but I still got a range of response times (2-8 secs). If you can access my logs, you can look at the sarjava logs for the different versions. The CPU time hangs around 1000 msec, but the total response time varies drastically.
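
frew_google's suggestion in the chat (checking whether your global variables are still around from last time) can be tried with a few lines of code. The following is a minimal sketch in Python, not App Engine's API: `handle_request`, `_requests_served` and `_PROCESS_STARTED` are hypothetical names, and the function stands in for whatever request handler your app really has. Module-level state is created once per process, so a counter that reads 1 means the request landed on a freshly started process. In Java, a static field would play the same role.

```python
import time

# Module-level state is initialised once, when the process loads this module.
# It survives for as long as the same process keeps serving requests.
# (Both names are hypothetical, chosen for this illustration.)
_PROCESS_STARTED = time.monotonic()
_requests_served = 0


def handle_request():
    """Stand-in for a real request handler (hypothetical).

    Returns a small report telling whether this request was the first one
    served by the current process, i.e. whether it hit a cold start.
    """
    global _requests_served
    _requests_served += 1
    return {
        # True only for the first request after the process came up.
        "cold_start": _requests_served == 1,
        "requests_served": _requests_served,
        # Seconds elapsed since this process loaded the module.
        "process_age_seconds": time.monotonic() - _PROCESS_STARTED,
    }
```

Logging this report from a real handler, next to the response time, would show which slow requests coincide with a fresh process and which were served by a reused one.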
