Image Processing
- 2004, Flickr was using ImageMagick for image processing (version 6.1.9)
- Changed to GraphicsMagick, about 15% faster at the time (version 1.1.5)
- Only need a subset of ImageMagick features anyway for our purposes

Image Processing
- OpenMP support (http://en.wikipedia.org/wiki/Openmp)
- Allows parallelization of processing jobs, using multiple cores working on the same image
- Some algorithms parallelize better than others
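
As a rough illustration of the multi-core point above, the sketch below builds a GraphicsMagick resize command and caps the number of OpenMP worker threads through the standard `OMP_NUM_THREADS` environment variable. It assumes a `gm` binary compiled with OpenMP support is on the PATH. The function names, file names, target size, and thread count are hypothetical and not taken from the talk.

```python
import os
import subprocess


def build_resize_job(src, dst, size="500x500", threads=4):
    """Return (argv, env) for a GraphicsMagick resize using `threads` OpenMP threads.

    The defaults for size and threads are illustrative assumptions.
    """
    argv = ["gm", "convert", src, "-resize", size, dst]
    env = dict(os.environ)
    # OMP_NUM_THREADS is the standard OpenMP variable that caps worker threads.
    env["OMP_NUM_THREADS"] = str(threads)
    return argv, env


def run_resize_job(src, dst, size="500x500", threads=4):
    """Run the resize; raises CalledProcessError if gm exits non-zero."""
    argv, env = build_resize_job(src, dst, size, threads)
    subprocess.run(argv, env=env, check=True)
```

Timing the same job with different `threads` values is one way to see how well a given operation parallelizes on a particular machine.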

Why?
As infrastructure grows, try to keep the Humans:Machines ratio from getting out of hand.
Some of the How:
- teach machines to build themselves
- teach machines to watch themselves
- teach machines to fix themselves
- reduce MTTR by streamlining

Automated Infrastructure
- If there is only one thing you do, automatic configuration and deployment management should be it.
- See:
  - Opscode/Chef (http://opscode.com/)
  - Puppet (http://reductivelabs.com/products/puppet/)
  - System Imager/Configurator (http://wiki.systemimager.org)

Self-Healing
Make service monitoring fix common failure scenarios, and notify us about it later.
Daemons/processes run on the machines, take corrective action under certain conditions, and report back with what they did.
This can greatly reduce your mean time to recovery (MTTR).
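
A minimal sketch of the detect, fix, report loop described above follows. It is not the tooling from the talk: the `Remedy` type, the `heal_once` function, and the reporting callback are hypothetical names chosen for illustration, and a real daemon would plug in actual health checks and restart commands.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Remedy:
    """One known failure scenario and its corrective action (hypothetical type)."""
    name: str
    is_broken: Callable[[], bool]  # returns True when the failure condition holds
    fix: Callable[[], None]        # corrective action to attempt


def heal_once(remedies: List[Remedy], report: Callable[[str], None]) -> List[str]:
    """Run one pass: fix whatever is broken, then report what was done."""
    acted = []
    for remedy in remedies:
        if not remedy.is_broken():
            continue
        remedy.fix()
        # Re-check so the report says whether the fix actually worked.
        outcome = "still failing" if remedy.is_broken() else "recovered"
        report(f"{remedy.name}: applied fix, {outcome}")
        acted.append(remedy.name)
    return acted
```

Calling `heal_once` on a timer, with `report` wired to email or chat, gives the "fix first, notify later" behavior; anything reported as still failing is a candidate for paging a human.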

MySQL Self-Healing
Some MySQL issues “fixed” by the machines:
- Kill long-running SELECT queries (marked safe to kill)
- Queries not safe to kill are marked by the application as “NO KILL” in comments
- Run EXPLAIN on killed queries, and report the results
- Keep track of the query types and databases that need the most killing, produce a “DBs that Suck” report
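
The selection logic in the list above can be sketched as a pure function over rows shaped like the output of `SHOW FULL PROCESSLIST` (columns `Id`, `db`, `Command`, `Time`, `Info`). This is an assumption-laden illustration, not the actual tool: the 30-second threshold, the function names, and the choice of `KILL QUERY` over `KILL` are all hypothetical. Executing the generated statements against a server is left out.

```python
from collections import Counter


def pick_queries_to_kill(processlist, max_seconds=30):
    """Return rows that are long-running SELECTs without a NO KILL marker."""
    victims = []
    for row in processlist:
        sql = (row.get("Info") or "").strip()
        if row.get("Command") != "Query":
            continue  # sleeping or administrative connections
        if not sql.upper().startswith("SELECT"):
            continue  # never touch writes
        if "NO KILL" in sql:
            continue  # the application marked this query as unsafe to kill
        if row.get("Time", 0) <= max_seconds:
            continue  # not long-running yet
        victims.append(row)
    return victims


def plan_actions(victims):
    """For each victim, build the EXPLAIN to report on and the KILL to issue."""
    return [
        {
            "id": row["Id"],
            "db": row["db"],
            "explain_sql": "EXPLAIN " + row["Info"].strip(),
            "kill_sql": f"KILL QUERY {row['Id']}",
        }
        for row in victims
    ]


def dbs_that_suck(victims):
    """Count kills per database, worst offenders first."""
    return Counter(row["db"] for row in victims).most_common()
```

The filter is deliberately conservative: a SELECT that begins with any leading comment is skipped, so only plainly recognizable reads are ever candidates.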

Communications
• Internal IRC
  - For ongoing discussions
  - Logged, so “infinite” scrollback
• IM Bot (built on libyahoo2.sf.net)
  - For production changes
  - Broadcasts all to all contacts
  - Logged, and injected into IRC
  - IM Status = who is in primary/secondary on-call
• All of IRC and IM Bot slurped into a search index

Morals of Our Stories
- Optimizations can be a Very Good Thing™
- Weigh time spent optimizing against expected gains
- Lean on others for how much “expected gains” mean for different scenarios
- Plain old-fashioned intuition