Archive

Archive for October, 2008

Now is the time to build

October 30, 2008 12 comments
Big Cats by micalngelo

“Every startup CEO is at least thinking about the need to cut back right now” – Michael Arrington

“We simply attempt to be fearful when others are greedy and to be greedy only when others are fearful.” – Warren Buffett

I’ll give you one guess as to which man I’m listening to. So no, not every startup CEO is cutting back. Apple spent their time innovating during the last downturn and look where it got them. I’m thrilled to have just passed out big, healthy profit-sharing bonuses to all of our employees this week for the 5th consecutive year. We think and hope they’ll be even bigger next year.

SmugMug was founded in the middle of the last “nuclear winter” in Silicon Valley. Everyone told us we were crazy, and we knew there was no chance at raising venture capital at a decent valuation, even with our impressive backgrounds. So we did what any good entrepreneur would do: We did it anyway, with both eyes firmly on our business model.

So if you’re running a startup, or thinking of creating one, take heart – downturns are a fabulous time to build and grow businesses. Focus on your revenues and your margins, not your growth rate or # of unique visitors. Find some stable income streams and a customer need. Listen to your customers and give them what they want – and what they’re willing to pay for. And take care of your employees – they’re your most valuable asset.

SmugMug is still hiring Sorcerers, Heroes, and all manner of other mythical beings capable of impossible feats. We filled our last position (quickly, I might add) with a *great* hire (and I’m still sorting through the avalanche of resumes we got to see if we can add a few more), but the job door is never closed at SmugMug for true superstars. Our philosophy is to not let anyone amazing get away, even if we don’t technically have an open position for you.

So if you can make magic and want to work for a company that takes crazy-good care of its employees, let us know.

Huge EC2 release: Load Balancing & Auto-Scaling!

October 27, 2008 7 comments
June 5th, 2008 near Maryville, Missouri by Shane Kirk

In case you didn’t see it, Amazon had a huge EC2 announcement the other day that included:

  • EC2 is now out of beta.
  • EC2 has an SLA!
  • Windows is now available on EC2
  • SQL Server is now available on EC2

But the really cool bits, if you ask me, are the announcements about the next wave of related services:

  • Monitoring
  • Load Balancing
  • Auto-Scaling
  • A web-based management console

As frequent readers of my blog (and attendees of my conference talks) will know, this means one of the last important building blocks for creating fully cloud-hosted applications *at scale* is nearly ready for primetime.

For those keeping score at home, my personal checklist shows that the only thing now missing is a truly scalable, truly bottomless database-like data store. Neither Elastic Block Storage (EBS) nor SimpleDB really solves the entire scope of the problem, though they’re great building blocks that do solve big pieces (or everything, at smaller scale). I’m positive that someone (Amazon or otherwise) will solve this problem, and then I can start moving more stuff “to the Cloud”.

I can’t wait.

Live-tweeting Cloud keynote at PDC 2008

October 27, 2008 1 comment
UFO OR CLOUD? by Shane Kirk

Microsoft is announcing some exciting Cloud Computing stuff today at their Professional Developers Conference (PDC). Assuming it’s the same material (and more?) I’ve been briefed on over the last year, it should be pretty exciting.

I’ll be live-tweeting the best bits over on my Twitter account. If this stuff is interesting to you, come check it out.

Feedburner hiccup, sorry about that.

October 13, 2008 2 comments

For some reason, Feedburner’s feed of my blog broke over the weekend. Not sure why, but I think I fixed it. Apologies to everyone who’s a few days behind on my latest posts in their favorite blog reader – it was me, not you.

Categories: personal

Amazon S3: Price reduction

October 13, 2008 6 comments

I know a lot of you get your Amazon Web Services news from me, so I thought I’d better mention this one. It’s huge!! 🙂

Amazon announced S3 price reductions as you scale. For us, since we’re way beyond 500TB, this is huge. And for any of you who are still in the first tier, it’s something to look forward to. 🙂

DevPay also recently got a significant pricing update, so if you’re interested in that, you’d better check it out.

Thanks Amazon!

Categories: amazon

Canon 5D MkII footage is back up!

October 13, 2008 6 comments

Pulitzer Prize-winning photographer Vincent Laforet’s awesome Canon 5D MkII film, Reverie, is once again hosted at SmugMug in all its HD glory. I believe it’s only up for this week or something and then we have to take it down again, so you’d better go watch it while you have the chance. 🙂

See it auto-sized for your screen & browser or view it in Hi-Def. Your choice.

Don’t forget to check out the behind the scenes footage, too, also auto-sized for you or in full Hi-Def.

Enjoy!

ZFS & MySQL/InnoDB Compression Update

October 13, 2008 26 comments
Network.com setup in Vegas, Thumper disk bay, green by Shawn Ferry

As I expected it would, the fact that I used ZFS compression on our MySQL volume in my little OpenSolaris experiment struck a chord in the comments. I chose gzip-9 for our first pass for a few reasons:

  1. I wanted to see what the “best case” compression ratio was for our dataset (InnoDB tables)
  2. I wanted to see what the “worst case” CPU usage was for our workload
  3. I didn’t have a lot of time, so I needed something quick & dirty.

I got both those data points with enough granularity to be useful: a 2.12X compression ratio over a large & varied dataset, and compression fast enough to not really be noticeable to my end users. The next step, obviously, is to find the best tradeoff between compression ratio and CPU cost for our data. So I spent the morning testing exactly that. Here are the details:

  • Created 11 new ZFS volumes (compression = none, lzjb, and gzip-1 through gzip-9)
  • Grabbed 4 InnoDB tables of varying sizes and compression ratios and loaded them into the disk cache
  • Measured the time (using ‘ptime’) it took to read each file from cache and write it to disk (using ‘cp’), watching CPU utilization (using ‘top’, ‘prstat’, and ‘mpstat’) – see the sketch after this list
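
To make that concrete, here’s a rough sketch of what the setup looked like, with illustrative names throughout – assume a pool called ‘data’ mounted at /data, and table files named table1.ibd through table4.ibd:

    # One ZFS filesystem per compression setting (pool name 'data' is illustrative)
    zfs create data/compression
    zfs create -o compression=off data/compression/none
    zfs create -o compression=lzjb data/compression/lzjb
    for level in 1 2 3 4 5 6 7 8 9; do
        zfs create -o compression=gzip-$level data/compression/gzip-$level
    done

    # Warm a table file into the page cache so the copies don't read from disk
    cat table1.ibd > /dev/null

    # After a copy, the achieved compression ratio is visible per filesystem
    zfs get compressratio data/compression/gzip-1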

It quickly became obvious that there’s relatively little difference in compression between gzip-1 and gzip-9 (and, contrary to what people were saying in the comments, relatively little difference in CPU usage, either, in 3 of the 4 cases. The other case, though… yikes!). So I quickly narrowed the tests down to ‘none’, ‘lzjb’, ‘gzip-1’, and ‘gzip-9’. (LZJB is the default compression for ZFS – gzip-N was added later as an option.)

Note that all the files were pre-cached in RAM before doing any of the tests, and ‘iostat’ verified we were doing zero reads. Also note that this is writing to two DAS enclosures with 15 x 15K SCSI disks apiece (28 spindles in a striped+mirrored configuration) with 512MB of write cache apiece. So these tests complete very quickly from an I/O perspective because we’re either writing to cache (for the smaller files) or writing to tons of fast spindles at once (the bigger files). In theory, this should mean we’re testing CPU more than we’re testing our IO – which is the whole point.

I ran each ‘cp’ at least 10 times, letting the write cache subside between runs, and selected the fastest run as the result shown (be sure to read the CPU utilization note after the tables). Roughly, each timing run looked like the sketch below.
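
A single run, using the same illustrative names as above – the 30-second pause is just a guess at how long it takes the write cache to drain, not something I measured:

    # One timing run (repeated ~10 times per compression setting; fastest kept)
    rm -f /data/compression/gzip-1/table2.ibd        # clear out the previous copy
    sync; sleep 30                                   # rough pause to let the write cache subside
    ptime cp table2.ibd /data/compression/gzip-1/    # wall-clock and CPU time for the copy

The fastest run for each setting is what’s reported in the tables that follow.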

TABLE1

compression  | size | size ratio | time
uncompressed | 172M | 1.00X      | 0.207s
lzjb         |  79M | 2.18X      | 0.234s
gzip-1       |  50M | 3.44X      | 0.240s
gzip-9       |  46M | 3.73X      | 0.217s

Notes on TABLE1:

  • This dataset seems to be small enough that much of the time is probably spent in system internals, rather than actually reading, compressing, and writing data, so I view this as only an interesting size datapoint, rather than size and time. Feel free to correct me, though. 🙂

TABLE2

compression  | size | size ratio | time   | time ratio
uncompressed | 631M | 1.00X      | 1.064s | 1.00X
lzjb         | 358M | 1.76X      | 0.668s | 1.59X
gzip-1       | 253M | 2.49X      | 1.302s | 0.82X
gzip-9       | 236M | 2.67X      | 11.1s  | 0.10X

Notes on TABLE2:

  • gzip-9 is massively slower on this particular hunk of data. I’m no expert on gzip, so I have no idea why this would be, but you can see the tradeoff is probably rarely worth it, even if you were using precious storage commodities (say, flash or RAM rather than hard disks). I ran this one a few extra times just to make sure. Seems valid (or a bug).

TABLE3

compression  | size  | size ratio | time    | time ratio
uncompressed | 2675M | 1.00X      | 15.041s | 1.00X
lzjb         |  830M | 3.22X      |  5.274s | 2.85X
gzip-1       |  246M | 10.87X     | 44.287s | 0.34X
gzip-9       |  220M | 12.16X     | 52.475s | 0.29X

Notes on TABLE3:

  • LZJB really shines here, performance-wise. It delivers roughly 3X faster performance while also using roughly 3X fewer bytes. Awesome.
  • gzip’s compression ratios are crazy great on this hunk of data, but the performance is pretty awful. Definitely CPU-bound, not IO-bound.

TABLE4

compression  | size  | size ratio | time    | time ratio
uncompressed | 2828M | 1.00X      | 17.09s  | 1.00X
lzjb         | 1814M | 1.56X      | 14.495s | 1.18X
gzip-1       | 1384M | 2.04X      | 48.895s | 0.35X
gzip-9       | 1355M | 2.09X      | 54.672s | 0.31X

Notes on TABLE4:

  • Again, LZJB performs quite well – roughly 1.5X fewer bytes while remaining faster. Nice!
  • gzip is again very obviously CPU bound, rather than IO-bound. Dang.

There’s one other very important datapoint here that ‘ptime’ itself didn’t show – CPU utilization. On every run with LZJB, both ‘top’ and ‘mpstat’ showed idle CPU. The most I saw it consume was 70% of the aggregate of all 4 CPUs, but the average was typically 30-40%. gzip, on the other hand, pegged all 4 CPUs on each run. Both ‘top’ and ‘mpstat’ verified that 0% CPU was idle, and interactivity on the bash prompt was terrible on gzip runs.

Some other crazy observations that I can’t explain (yet?):

  • After a copy (even to an uncompressed volume), ‘du’ wouldn’t always show the right bytes. It took time (many seconds) before showing the right # of bytes, even after doing things like ‘md5sum’. I have no idea why this might be.
  • gzip-9 made a smaller file (1355M vs 1380M) on this new volume as opposed to my big production volume (which is gzip-9 also). I assume this must be due to a different compression dictionary or something, but it was interesting.
  • Sometimes I’d get strange error messages trying to copy a file over an existing one (removing the existing one and trying again always worked):

    bash-3.2# ptime cp table4.ibd /data/compression/gzip-1
    cp: cannot create /data/compression/gzip-1/table4.ibd: Arg list too long
  • After running lots of these tests, I wasn’t able to start MySQL anymore. It crashed on startup, unable to allocate enough RAM for InnoDB’s buffer pool. (You may recall from my last post that MySQL seems to be more RAM-limited under OpenSolaris than Linux.) I suspect that ZFS’s ARC might have sucked up all the RAM and was unwilling to relinquish it, but I wasn’t sure – one possible mitigation is sketched below. So I rebooted and everything was fine. 😦
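
If the ARC really was the culprit, the usual knob on OpenSolaris is to cap its size in /etc/system – a sketch of what I’d try, with a purely illustrative 4GB cap (whether this would have prevented the crash is speculation on my part):

    # /etc/system – limit the ZFS ARC so it leaves room for InnoDB's buffer pool
    # 0x100000000 bytes = 4GB (illustrative value); requires a reboot to take effect
    set zfs:zfs_arc_max = 0x100000000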

Conclusion? Unless you care a great deal about eking out every last byte (using a RAM disk, for example), LZJB seems like a much saner compression choice. Performance seems to improve, rather than degrade, and it doesn’t hog your CPU. I’m switching my ZFS volume to LZJB right now (on-the-fly changes – woo!) and will copy all my data so it gets the new compression settings. I’ll sacrifice some bytes, but that’s ok – performance is king. 🙂
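
The switch itself is a one-liner; a minimal sketch, assuming the MySQL volume is named data/mysql (the name is illustrative). The new setting only applies to blocks written after the change, which is why the data has to be copied or rewritten to actually pick up LZJB:

    # Change compression on the fly; existing gzip-9 blocks stay as-is until rewritten
    zfs set compression=lzjb data/mysql
    zfs get compression,compressratio data/mysql    # confirm the setting and watch the ratio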

Also, my theory that I’d always have idle CPU with modern multi-core chips so compression wouldn’t be a big deal seems to be false. Clearly, with gzip, it’s possible to hog your entire CPU if you’re doing big long writes. We don’t tend to do high-MB/s reads or writes, but it’s clearly something to think about. LZJB seems to be the right balance.

So, what should I test next? I wouldn’t mind testing compression latencies on very small reads/writes more along the lines of what our DB actually does, but I don’t know how to do that in a quick & dirty way like I was able to here.

Also, I have to admit, I’m curious about the different checksum options. Has anyone played with anything other than the default?
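
For anyone who wants to poke at it, checksums are set per-dataset the same way compression is – a sketch with the same illustrative volume name (I haven’t measured the CPU cost of the stronger algorithms, so treat this as a starting point, not a recommendation):

    zfs get checksum data/mysql            # see which algorithm is currently in use
    zfs set checksum=sha256 data/mysql     # stronger than the default, at more CPU per block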
