DevOps: Apollo Mission Control | @DevOpsSummit @ToddVernon #DevOps

The worlds of space mission operations and SaaS businesses are converging

DevOps and Apollo Mission Control
By Todd Vernon

Lately I have been reading the excellent book Digital Apollo. It explores the evolution of digital control systems and the man-machine interface that emerged during the development of space flight and, ultimately, the Apollo missions. It’s a fantastic book, more technical than most, but very approachable for those not familiar with flight control, embedded software, or the challenges of building such systems. As I read the book, I could not help but compare the way space missions were executed with the role of DevOps in modern SaaS businesses.

I started my career at NASA testing digital flight controls for an experimental aircraft, the X-29. The X-29 flight test program was just the latest in the series of one-off aircraft that started with the Bell X-1 and moved on to the X-15, which laid the groundwork for Apollo. As a result, flight test was executed in a very similar fashion in all these programs. Nearly every switch, surface, actuator, and probe was instrumented, and that data was downlinked in real time to a control room as the airplane or spacecraft flew.

As the vehicles became more fly-by-wire and had digital computers at their core, those computers also downlinked many of their internal state variables to the ground, where teams of engineers could keep track of every button push, flight mode, and acceleration in real time, helping the pilot watch for anything that could end the mission or end his life.

The worlds of space mission operations and SaaS businesses are converging. While generally no one dies if your SaaS service fails to operate, the implications of downtime become more real every year. If you operate a platform whose customers collectively pay millions of dollars a day for your product or service, that is serious business.

Just as engineers watched the state variables downlinked from Apollo, we now watch the equivalent with tools like New Relic as our systems support millions or billions of customer transactions through the services we have built. While Apollo’s AGC had to work for several hundred hours at a time, our SaaS services get turned on once when we launch our company and the mission goes on forever. As a result, we are replacing rooms of engineers who were there for days with systems that connect engineers to the technology all the time.

Modern monitoring tools are starting to approach the quality of observation we had back at NASA for immediacy of data, and they now far surpass those relatively crude tools for spontaneity of exploration and discovery. Today, I get an alert on my iPhone when some part of our system is acting inconsistently, and I can interact with our engineers in real time regardless of location.

Just as rooms of engineers supported an Apollo mission, today we are on the verge of supporting our complex systems with a virtual room of engineers using tools like VictorOps. As systems become more complex, it becomes more likely that a problem needs to be solved by the person who wrote the code in the first place. Very often only that person has (or ever had) knowledge of the system intimate enough to fix it or work around it and keep the mission (the business) functioning.

On Apollo 14, while the spacecraft was in lunar orbit, an engineer noticed that a software bit buried deep in the guidance and navigation computer inside the Lunar Module (LM) indicated that a descent abort had been initiated. The cause was a loose bead of solder that effectively kept “pushing” the abort button. It was not yet a problem, and the crew had not even noticed it, because the descent program that would land the astronauts on the moon was not running yet.

Had that program been initiated, as was scheduled only minutes later, the mission would have been aborted and quite likely the crew would have been lost. That specific engineer knew how the system would misbehave, and that knowledge could be put to use only because the engineer was connected to the software through advanced monitoring and could act on the data in real time. Remove any part of that equation, and Apollo 14 would have turned out very differently.

This is the basic DNA of how we look at our product at VictorOps. We connect engineers to the mission-critical machines that run your business. If you expect the unexpected and outfit your teams accordingly, you can be ready to respond to any problem faster and more accurately than your competition.

The post DevOps and Apollo Mission Control appeared first on VictorOps.

VictorOps is making on-call suck less with the only collaborative alert management platform on the market.

With easy on-call schedule management, a real-time incident timeline that gives you context around your alerts, and powerful reporting features that make post-mortems more effective, VictorOps helps your IT/DevOps team solve problems faster.
