5 Tips to Avoid Configuration Error Proliferation in the Cloud
The cloud is an impartial accelerator. It delivers your business outcomes faster, but as enterprise IT is discovering, the cloud also has the potential to magnify operational mistakes and process problems, said Shawn Edmondson, vice president of product strategy at rPath, an application engine provider for the private cloud.
"Configuration management is a perfect case in point," Edmondson said. "Problems that are manageable on physical servers and even in virtualized clusters become intractable at cloud scale."
Edmondson offers five tips on how to avoid configuration error proliferation in the cloud.
Automation discipline - Basic automation is an easy sell, but many practical implementations, even first-class, massive ones like Amazon EC2, leave a few aspects of configuration unautomated. And as the recent EC2 outage demonstrates, one fat-fingered manual change is all it takes to inflict major damage on your business. Ninety-five percent automation may have been good enough for virtualized data centers, but the cloud calls for 100 percent automation.
Version control - Recently, Google operations demonstrated casual excellence in their response to an outage. Twenty-three minutes after a failure, things were back to normal, because Google delivers every configuration change under version control. Just as software development has learned (the hard way) that version control is critical, cloud operations is learning that it's not enough to manage current state alone. Modern automation is fully version-controlled and supports rollback for all changes.
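To make the rollback idea concrete, here is a minimal sketch in Python of a configuration store that keeps every version. It is an illustration only, not a description of how Google or any particular tool works; the class name ConfigHistory and its methods are hypothetical.

```python
class ConfigHistory:
    """Hypothetical in-memory store that keeps every configuration version."""

    def __init__(self, initial):
        # Version 0 is the starting configuration.
        self._versions = [dict(initial)]

    @property
    def current(self):
        # Return a copy so callers cannot edit history in place.
        return dict(self._versions[-1])

    def commit(self, changes):
        """Record a change as a new version and return its version number."""
        self._versions.append({**self._versions[-1], **changes})
        return len(self._versions) - 1

    def rollback(self, version):
        """Restore an earlier version by recording it as the newest one."""
        if not 0 <= version < len(self._versions):
            raise IndexError(f"no such version: {version}")
        # The rollback is itself a recorded change, so nothing is lost.
        self._versions.append(dict(self._versions[version]))
        return self.current


history = ConfigHistory({"timeout_seconds": 30})
bad_version = history.commit({"timeout_seconds": 0})  # a fat-fingered change
restored = history.rollback(bad_version - 1)          # back to the last good state
```

Because the rollback is stored as just another version, the record of what went wrong survives the fix.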
Hybrid approach - Configuration is a hot topic in management software right now, with open-source and open-core solutions such as Puppet and Chef competing to deliver all-purpose configuration management. But just as there is no one-size-fits-all language or architecture for application software, there is no one-size-fits-all approach to configuration. Automation experts employ a broad toolkit of low-level configuration techniques for different problems, from Puppet manifests for Linux NIC configuration to PowerShell scripts for SQL Server configuration.
Model-driven configuration - "Model-driven" is code for "know your destination." Here's a simple analogy. Written driving directions are useless if you wander off the prescribed route, but a street address (plus a GPS) guides you to your destination from anywhere on the continent. Non-model-driven configuration is a long sequence of changes that make implicit, critical assumptions about the target environment, and cloud scale multiplies the ways those assumptions can fail. Directions that work fine for one system can drive a slightly different system into the lake. Model-driven config starts with a clean, human-readable description of desired state and automates the steps to get there, and that works at any scale. (Provided, of course, you have a GPS, that is, a model-driven configuration solution.)
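The difference between directions and a destination can be sketched in a few lines of Python. This is a simplified illustration under the assumption that a system's configuration fits in a flat dictionary; the function names plan_changes and apply_changes are hypothetical and do not come from any specific product.

```python
def plan_changes(current, desired):
    """Compute the steps that move `current` to `desired`, whatever `current` is."""
    steps = []
    for key in sorted(desired):
        if key not in current:
            steps.append(("add", key, desired[key]))
        elif current[key] != desired[key]:
            steps.append(("update", key, desired[key]))
    for key in sorted(current):
        if key not in desired:
            steps.append(("remove", key, None))
    return steps


def apply_changes(current, steps):
    """Apply planned steps to a copy of `current` and return the result."""
    result = dict(current)
    for action, key, value in steps:
        if action == "remove":
            result.pop(key, None)
        else:
            result[key] = value
    return result


# The "street address": one description of where every system should end up.
desired = {"ntp_server": "time.example.com", "ssh_root_login": "no"}

# Two systems that start in different places still arrive at the same state.
fresh_system = {}
drifted_system = {"ntp_server": "old.example.com", "telnet": "enabled"}
for system in (fresh_system, drifted_system):
    assert apply_changes(system, plan_changes(system, desired)) == desired
```

A fixed script of changes would only be correct for the starting point it was written against; here the steps are derived from each system's actual state.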
Tightly constrained parameterization - Even a simple software system, such as a plain-vanilla LAMP stack, has thousands of tweakable configuration settings. So some configuration and CMDB tools, in a quest for completeness, attempt to discover, record and manage all of those settings. That's hard to scale in traditional data centers and simply out of the question for cloud. Instead, effective automation makes a clear distinction between static and dynamic config. Static config settings are identical across many systems, so efficient automation systems don't try to manage those settings individually; they make compliance straightforward and scalable by blasting in version-controlled config files. But dynamic config, those few settings that really must vary from system to system, should be parameterized, individually designated, carefully version-controlled and kept to a bare minimum.
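One way to picture the static/dynamic split is a shared template with a short, explicit list of parameters. The Python sketch below is illustrative only: the setting names, the template contents and the render_config function are all hypothetical, and a real system would keep the template file under version control rather than inline.

```python
from string import Template

# Static config: identical on every system, shipped as one version-controlled file.
STATIC_TEMPLATE = Template(
    "max_connections = 200\n"
    "log_level = warn\n"
    "listen_address = $listen_address\n"
    "server_id = $server_id\n"
)

# Dynamic config: the only settings allowed to vary from system to system.
ALLOWED_PARAMS = frozenset({"listen_address", "server_id"})


def render_config(params):
    """Render one system's config, rejecting any parameter not explicitly allowed."""
    unknown = set(params) - ALLOWED_PARAMS
    missing = ALLOWED_PARAMS - set(params)
    if unknown:
        raise ValueError(f"parameters not allowed: {sorted(unknown)}")
    if missing:
        raise ValueError(f"parameters missing: {sorted(missing)}")
    return STATIC_TEMPLATE.substitute(params)


web01 = render_config({"listen_address": "10.0.0.11", "server_id": "1"})
```

Rejecting unknown parameters is what keeps the dynamic set small: adding a new per-system setting requires a deliberate change to the allowed list, not just an extra key in someone's input.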
In traditional data centers, these techniques are effective, sensible and often overkill given the long list of competing IT priorities. But in cloud data centers, these techniques are the new best practice.