@DevOpsSummit Authors: Yeshim Deniz, Zakia Bouachraoui, Pat Romanski, Liz McMillan, Elizabeth White

@DevOpsSummit: Blog Post

Configuration Drift: The Cost of Complexity

Imagine this — you're rolling out a new version of your web app. Works great in the dev environment, and it's been signed off on in staging, so it gets rolled out to production. Things seem fine, so you call it a night.

Then the support requests begin flooding in. Something's broken somewhere, and it's not immediately obvious how. The performance monitor shows the machines are running well, so it can't be that. Ah well, better crack one of those neon-colored energy drinks. It's time to roll back and log into these machines to look through logs and config files for a potential cause. "How could this be happening?" you ask. "I mean... these machines are all configured the same, right?"

Often, that's wrong.

Configuration drift is a very real and increasingly common problem, especially in growing environments. In a way, you can call it the "hidden cost of complexity," and there are a number of causes behind it.

  • Well-meaning team members could've updated something to a new version, installed a conflicting package or service, or applied a fix thought to be minor.
  • Software or OS updates applied here but not there could've thrown everything out of whack.
  • A tiny change in a far-flung config file could be the metaphorical butterfly that flapped its wings.
  • Changing settings or firmware on a network device may affect some or all clients connected through it.
  • A machine could've been compromised in a way that isn't obvious.
  • Space aliens.

And as wildly varied as the causes can be, the potential effects are even worse. We're talking downtime, failed infrastructure, loss of data, loss of business, and even loss of customer trust.
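To make the first two causes in the list concrete, here is a minimal sketch of spotting "updated here but not there" drift by comparing package inventories across machines. It illustrates the general idea only and is not how any particular product works; the host names, package versions, and the `find_version_drift` helper are all hypothetical.

```python
from collections import defaultdict


def find_version_drift(inventories):
    """Return {package: {version: [hosts]}} for packages that differ across hosts.

    `inventories` maps a host name to a {package: version} dict.
    A package that is missing from a host is grouped under the version None.
    """
    all_packages = set()
    for packages in inventories.values():
        all_packages.update(packages)

    drift = {}
    for package in sorted(all_packages):
        by_version = defaultdict(list)
        for host in sorted(inventories):
            by_version[inventories[host].get(package)].append(host)
        # More than one distinct version (or presence/absence) means drift.
        if len(by_version) > 1:
            drift[package] = dict(by_version)
    return drift


if __name__ == "__main__":
    # Hypothetical inventories; in practice these would come from each machine.
    inventories = {
        "web-01": {"openssl": "1.0.1k", "nginx": "1.6.2"},
        "web-02": {"openssl": "1.0.1k", "nginx": "1.6.2"},
        "web-03": {"openssl": "1.0.1e", "nginx": "1.6.2", "apache2": "2.4.10"},
    }
    for package, versions in find_version_drift(inventories).items():
        print(package, versions)
```

In this made-up example, "web-03" stands out twice: it carries an older openssl and an extra apache2 package that the other two machines lack, while nginx matches everywhere and is not reported.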

One reason the lurking configuration drift problem isn't more widely discussed in IT is probably the wide variation in its causes and effects: something with a thousand possible causes and a thousand possible effects is difficult to pin down as one phenomenon. It's not as easy to define and fight as, say, viruses or hardware failure. Viruses are things we can point to and say, "These are bad, here's how they proliferate, and here's how you protect yourself," and as for hardware failure, we all know what that looks like and how to mitigate it when it happens.

Another reason for not discussing config drift is probably that, until recently, there hasn't been a single solution for preventing or dealing with it.

GuardRail directly combats configuration drift by continually scanning and monitoring your configs across practically every platform and device. It's a robust, collaborative platform with tools to graphically identify differences and potential hazards, and alert you when something goes awry. Reports can be exported to PDF for auditing or compliance purposes, and configs you verify as good can be exported to Chef, Docker, Ansible, and Puppet for automation.
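For readers who want a feel for what "scanning configs and identifying differences" means in practice, here is a minimal sketch of the general technique: fingerprint each config file, compare it against a known-good baseline, and produce a diff for anything that changed. This is an assumption-laden illustration, not GuardRail's implementation or API; the file paths, contents, and the `fingerprint` and `detect_drift` helpers are hypothetical.

```python
import difflib
import hashlib


def fingerprint(text):
    """Return a SHA-256 hex digest of a config file's contents."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_drift(baseline, current):
    """Compare two {path: content} snapshots and return {path: diff_lines}.

    Files whose fingerprints match are skipped. Files that exist in only
    one snapshot are always reported, with the missing side treated as empty.
    """
    report = {}
    for path in sorted(set(baseline) | set(current)):
        old, new = baseline.get(path), current.get(path)
        if old is not None and new is not None and fingerprint(old) == fingerprint(new):
            continue
        report[path] = list(difflib.unified_diff(
            (old or "").splitlines(),
            (new or "").splitlines(),
            fromfile=f"baseline:{path}",
            tofile=f"current:{path}",
            lineterm="",
        ))
    return report


if __name__ == "__main__":
    # Hypothetical snapshots; a real scan would read these from the machine.
    baseline = {"/etc/app.conf": "workers=4\ntimeout=30\n"}
    current = {"/etc/app.conf": "workers=4\ntimeout=60\n"}
    for path, diff in detect_drift(baseline, current).items():
        print("\n".join(diff))
```

Storing only the fingerprints of a verified-good snapshot is enough to answer "did anything change?"; the full contents are needed only when you want to show what changed.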

And when we say "collaborative," we mean it. We designed GuardRail from the ground up to be simple enough to be a valuable tool for every stakeholder. Nodes and their differences are represented graphically, in an easy-to-navigate interface that's useful no matter your background.

Don't believe it's possible? We'd be happy to give you the grand tour and show you a live demo running on real devices. Or check out the product page and get started right away.

More Stories By ScriptRock Blog

ScriptRock makes GuardRail, a DevOps-ready platform for configuration monitoring.

Realizing we were spending way too much time digging up, cataloguing, and tracking machine configurations, we began writing our own scripts and tools to handle what is normally an enormous chore. Then we took the concept a step further, giving it a beautiful interface and making it simple enough for our bosses to understand. We named it GuardRail after its function — to allow businesses to move fast and stay safe.

GuardRail scans and tracks much more than just servers in a datacenter. It works with network hardware, Cloud service providers, CloudFlare, Android devices, infrastructure, and more.
