Been asked a few times in the last few days about where my slides are from my MySQL keynote from *last* year.
Um, yeah. Sorry about that. Here’s a link to ‘The SmugMug Tale’ slides, and you can watch the video below:
Sorry for the extreme lag. I suck.
The important highlights go something like this:
- Use transactional replication. Without it, you’re dead in the water. You have no idea where in the replication stream a crashed slave left off.
- Use a filesystem that lets you do snapshots. Easily the best way to do backups, spin up new slaves, etc. I love ZFS. You’ll need transactional replication to really make this painless.
- Use SSDs if you can. We can’t afford to be fully deployed on SSDs (terabytes are expensive), but putting them in the write path to lower latency is awesome. The read path might help, too, depending on how much caching you’re already doing. Love hybrid storage pools.
- Use Fishworks (aka Open Storage) if you can. The analytics are unbeatable, plus you get SSDs, snapshots, ZFS, and tons of other goodies.
- Use transactional replication. This is so important I’m repeating it. If you use replication, patch it into MySQL (Google, Facebook, and Percona all have patches) or run XtraDB. We use the Percona patch.
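The snapshot workflow from the bullets above can be sketched roughly like this. To be clear, this is just a hedged illustration, not our actual tooling: the dataset and clone names (`tank/mysql`, `tank/mysql-slave1`) and the helper functions are made-up placeholders for whatever your pool layout looks like, and it shells out to the real `zfs snapshot` / `zfs clone` commands.

```python
import subprocess
from datetime import datetime, timezone

def snapshot_commands(dataset="tank/mysql", clone="tank/mysql-slave1", stamp=None):
    """Build the ZFS commands to snapshot a MySQL datadir and clone it.

    Names are hypothetical; substitute your own pool layout. In practice
    you'd quiesce MySQL around the snapshot (or lean on InnoDB crash
    recovery plus transactional replication) so the copy is consistent.
    """
    stamp = stamp or datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    snap = f"{dataset}@{stamp}"
    return [
        ["zfs", "snapshot", snap],      # instant copy-on-write snapshot
        ["zfs", "clone", snap, clone],  # writable clone to seed a new slave
    ]

def run(cmds, dry_run=True):
    """Print the commands (dry run) or actually execute them."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Because ZFS snapshots are copy-on-write, the snapshot itself is near-instant regardless of dataset size, which is what makes this practical as both a backup strategy and a slave-provisioning strategy.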
Holler in the comments if something in the presentation isn’t clear, and I’ll answer. Apologies again.
Shameless plug – we’re hiring. And it’s a blast.
I know I’m a little late to the discussion, but Brian Aker posted a thought-provoking piece on the imminent death of MySQL replication to scale reads. His premise is that memcached is so cool and scales so much better that scaling reads via replication is going to become a thing of the past. Other MySQL community people, like Arjen and Farhan, chimed in too.
Now, I love memcached. We use it as a vital layer in our datacenters, and we couldn’t live without it. But it’s not a total solution to all reads, so at least for our use case, it’s not going to kill the read slaves we use to scale reads.
Why? Because we still need to do index lookups to get the keys for the row data we then fetch from memcached. And we have to do lots of those indexed queries. Most of the row data lives inside memcached, so this turns out to be a great solution, but we still need read slaves to provide the lists of keys. Bottom line: we still use read replication heavily, but for different things than we did in years past.
And then, of course, there’s the issue of memcached failure. For us, it’s very rare, and thanks to the way memcached works, it rarely hampers system performance. But when a node fails and needs to be re-filled, we have to go back to disk to repopulate it. And doing that efficiently means read slaves again.
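The read path I just described, with slaves serving the key lists, memcached serving the bulk of the row data, and slaves absorbing the re-fill after a miss, looks something like this. Everything here is hypothetical scaffolding: the `ids_for` and `load_rows` helpers and the dict-like cache stand in for a real slave query and a real memcached client.

```python
def fetch_rows(ids_for, load_rows, cache, query):
    """Cache-aside reads: slaves serve key lists, memcached serves rows.

    ids_for(query)  -- indexed lookup on a read slave; returns row ids only
    load_rows(ids)  -- full rows from a read slave (the cache-miss path)
    cache           -- dict-like memcached stand-in (hypothetical interface)
    """
    ids = ids_for(query)          # this query always hits a read slave
    rows, misses = {}, []
    for i in ids:
        hit = cache.get(f"row:{i}")
        if hit is None:
            misses.append(i)
        else:
            rows[i] = hit
    if misses:
        # A failed memcached node shows up here as a burst of misses;
        # the read slaves absorb the re-fill load.
        for i, row in load_rows(misses).items():
            cache[f"row:{i}"] = row   # re-warm the cache
            rows[i] = row
    return [rows[i] for i in ids]
```

The point of the sketch is that step one never goes away: even with a 100% warm cache, the indexed query that produces the list of ids still runs on MySQL, which is why the read slaves survive.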
For us, memcached plus MySQL replication is true magic. Brian’s a very smart guy, and I realize he wrote the post to get people thinking and talking about the issue, but at least for us, read slaves are here to stay. 🙂