In high school, I had a great programmable calculator. I’d program it to solve complicated math and science problems “automatically” for me. Most of my teachers got upset if they found out, but I’ll always remember one especially enlightened teacher who didn’t. He said something to the effect of “Hey, if you managed to write software to solve the equation, you must thoroughly understand the problem. Way to go!”
George Reese wrote up a blog post over at O’Reilly the other day called On Why I Don’t Like Auto-Scaling in the Cloud. His main argument seems to be that auto-scaling is bad and reflects poor capacity planning. In the comments, he specifically calls SmugMug out, saying we’re “using auto-scaling as a crutch for poor or non-existent capacity planning”.
George is like one of those math teachers who doesn’t “get it”. I was tempted not to write this post because he gets it so wrong, I’d hate to spread that meme. SkyNet auto-scales well. No humans at SmugMug are monitoring it and it just hums along, doing its job. Why is it so efficient? Because I understand the equation. I know what metrics drive our capacity planning and I programmed SkyNet to take these into account. It checks an awful lot of data points every minute or so – this isn’t simply “oh, we have idle CPU, let’s kill some instances.” (That said, I would argue that, depending on the application, simple auto-scaling based on CPU usage or a similar data point can be very effective, too.)
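To make the idea of scaling on more than idle CPU concrete, here is a minimal sketch of a decision function that combines a few signals. It is not SkyNet's actual logic, which isn't shown in this post: the metric names, the thresholds, and the `desired_instances` function are all assumptions made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Metrics:
    """One sample of the (hypothetical) signals checked each cycle."""
    queue_depth: int          # jobs waiting to be processed
    jobs_per_instance: float  # jobs one instance finishes per minute
    cpu_utilization: float    # fleet-wide average, 0.0 to 1.0
    running_instances: int    # instances currently up


def desired_instances(m: Metrics,
                      target_drain_minutes: float = 5.0,
                      cpu_high: float = 0.80,
                      min_instances: int = 1,
                      max_instances: int = 100) -> int:
    """Return how many instances should be running for this sample.

    All thresholds are illustrative defaults, not values from the post.
    """
    if m.jobs_per_instance <= 0:
        raise ValueError("jobs_per_instance must be positive")

    # Instances needed to drain the current queue within the target window.
    needed_for_queue = math.ceil(
        m.queue_depth / (m.jobs_per_instance * target_drain_minutes)
    )

    # If the fleet is running hot, ask for one more than we have now,
    # even when the queue alone would not justify it.
    needed_for_cpu = 0
    if m.cpu_utilization > cpu_high:
        needed_for_cpu = m.running_instances + 1

    wanted = max(needed_for_queue, needed_for_cpu)

    # Clamp to the configured floor and ceiling.
    return max(min_instances, min(max_instances, wanted))
```

Called once a minute with fresh numbers, a function like this returns a target fleet size; the caller would then launch or terminate instances to match it. The point of the sketch is only that the queue backlog, per-instance throughput, and CPU load each get a vote, rather than CPU alone.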
SkyNet has been in production for over a year with only two incidents of note and SmugMug has more than doubled in size and capacity during that time without adding any new operations people. How on earth is this a bad thing?
June 5th, 2008 near Maryville, Missouri by Shane Kirk
In case you didn’t see it, Amazon had a huge EC2 announcement the other day that included:
- EC2 is now out of beta.
- EC2 has an SLA!
- Windows is now available on EC2
- SQL Server is now available on EC2
But the really cool bits, if you ask me, are the announcements about the next wave of related services:
- Load Balancing
- A web-based management console
As frequent readers of my blog and attendees of my conference talks will know, this means one of the last important building blocks to creating fully cloud-hosted applications *at scale* is nearly ready for primetime.
For those keeping score at home, my personal checklist shows that the only thing now missing is a truly scalable, truly bottomless database-like data store. Neither Elastic Block Store (EBS) nor SimpleDB really solves the entire scope of the problem, though they’re great building blocks that do solve big pieces (or everything, at smaller scale). I’m positive that someone (Amazon or other) will solve this problem and I can start moving more stuff “to the Cloud”.
I can’t wait.