Welcome!

@DevOpsSummit Authors: Liz McMillan, Elizabeth White, Pat Romanski, Jason Bloomberg, Yeshim Deniz

Related Topics: @DevOpsSummit, Linux Containers, Containers Expo Blog

@DevOpsSummit: Blog Feed Post

Metrics and KPIs for Test Environment Stability | @DevOpsSummit #DevOps #APM #Monitoring

How often is an environment unavailable due to factors within your project’s control?

How often is an environment unavailable due to factors within your project’s control? How often is an environment unavailable due to external factors? Are the software and hardware in an environment up to date with the target Production systems? How often do you have to resort to manual workarounds due to an environment?

Metric: Availability and Uptime Percentage
QA and Staging environments seldom require the same level of uptime as Production, but tell that to a team of developers working 24/7 on a project that has an aggressive deadline. As a Test Environment Manager, you know that when a QA system is unavailable, you will get immediate calls from developers and managers.

As a Test Environment Manager, you will also want to understand the root cause of every outage. If you follow a problem management process for Production outages, you should follow a similar process with test environment management. Understanding why an outage happened is critical for communicating with a development team. Very often a QA environment will become unavailable due to a factor far outside the control of a Test Environment Manager. If one team pushes bad code that interrupts the QA process for all teams you need to be able to identify this clearly.

How to Measure Availability and Uptime?
Keep track of system availability with a standard monitoring tool such as Zabbix or Nagios. If your systems are visible to the public internet, you can also use hosted platforms like Pingdom to measure system availability.

Example Metric: Goal for Availability
An uptime of 95% is usually sufficient for a QA or Staging environment.
If your development is limited to a few time zones, you can also further qualify this by only measuring availability during development hours. While your Production availability commitment is often higher that 99% or 99.5%, you don’t have to treat every QA outage as an emergency. But, your developers may have other opinions—95% uptime still allows for eight hours of downtime a week. You may want to aim higher.

How Does This Metric Motivate Concrete Action?
When you measure system availability and make these numbers public, you encourage Test Environment Managers to make a commitment to uptime. This results in fewer obstacles for QA and development, allowing them to deliver software faster. There’s nothing more debilitating to an organization than disruptions in QA and testing. Measuring this metric allows you to encourage movement toward always-available QA systems.

Metric: Mean Time Between Outages
If your system has a 95% availability, then almost seventy-five minutes of downtime is acceptable every day. If your system fails for ten minutes every hour during an eight-hour work day due to a build or deployment, you’ll be creating a QA or Staging environment that has a 5% chance of losing developer and QA confidence. To get an accurate picture of system availability you need to couple an availability percentage metric with your mean time between outages (MTBO).

How to Measure MTBO
If you follow a process that keeps track of outages and strives to understand the root causes of these outages, you’ll develop a database of issues that you can use to derive your MTBO. If you have a monitoring system configured to calculate availability percentages automatically, you can use this same system to record your MTBO.

Example Metric: Goal for MTBO
This depends on your availability goal. The lower your availability goal, the higher your MTBO should be. For example, if you have a 95% uptime commitment then your outages need to be spaced over a day or a week. You might have eight hours of downtime each weekend to perform system upgrades or a nightly build and deploy process that takes about an hour, but what you can’t have is an MTBO of 45–60 minutes. This will mean that QA and Staging systems will be unavailable for a few minutes every hour, which will result in dissatisfied customers.

How does this Metric Motivate Concrete Action?
If your MBTO is very short, this suggests that build and deploy activity from a continuous integration environment is frequently interrupting both Development and QA. If your MBTO is very high, but your availability is very low (95% or lower) this means that you are experiencing multi-hour downtime at least once a day. When you measure MBTO, you encourage your Release Engineers and Test Environment Managers to work together to create build and deployment scripts that don’t affect availability, and you encourage your staff to approach QA and Staging uptime with care. Without this metric, you run the risk of having teams grow complacent with frequent, low-level unavailability as long as they satisfy overall availability metrics.

Metric: Downtime Requirement for a Test Environment Build and Deploy
When software is deployed to any system, there is a natural tendency for disruption
. If new code is being deployed to an application server that server often requires a restart so that new code can be loaded. If a web server such as Apache or Nginx is being reconfigured this often requires a fast restart measured in seconds.

Some of these build and deploy related disruptions can be avoided through the use of load balancers and clusters of machines. On the largest projects, this is essential in both Production as well as Staging and QA systems. An example is a QA system for a large bank’s transaction processing system. There are so many teams that depend on this system to be up and running 24/7 that causing any disruption would run the risk of freezing the QA process across the entire company.

Other build and deploy downtimes are unavoidable. A frequent example is changes to a database schema. Certain changes to tables and indexes require systems to be stopped and rebooted to reach a state where database activity isn’t competing with DDL statements.

The downtime requirement for a given build and deploy to a test environment is a central measure that is directly related to the availability metrics mentioned before in this section.

How to Measure Build/Deploy Downtime
It’s simple: run a build and deployment and keep track of the downtime that falls into the timespan of each build and deploy function. If you have a continuous integration system such as Jenkins or Bamboo, grab the timestamps of the last few builds and look at your monitoring metrics on QA and Staging to see if there is a system impact.

Example Metric: Goal for Build/Deploy Downtime
Your goal for this metric depends on your level of availability
. If you are working on a shared service, your build and deploy downtime requirement should be as close to zero as possible. If you are working on a less critical application, then your build and deploy downtime should be measured in minutes or seconds.

How does this Metric Motivate Concrete Action?
This metric encourages your Release Engineers and Test Environment Managers to drive build and deploy downtime to zero. With the tools available to developers and DevOps professionals it is possible to achieve zero-downtime deployments to QA and Staging systems. Doing this will give your internal customers more confidence in the systems you are delivering.

The post Metrics and KPIs for Test Environment Stability appeared first on Plutora.

Read the original blog entry...

More Stories By Plutora Blog

Plutora provides Enterprise Release and Test Environment Management SaaS solutions aligning process, technology, and information to solve release orchestration challenges for the enterprise.

Plutora’s SaaS solution enables organizations to model release management and test environment management activities as a bridge between agile project teams and an enterprise’s ITSM initiatives. Using Plutora, you can orchestrate parallel releases from several independent DevOps groups all while giving your executives as well as change management specialists insight into overall risk.

Supporting the largest releases for the largest organizations throughout North America, EMEA, and Asia Pacific, Plutora provides proof that large companies can adopt DevOps while managing the risks that come with wider adoption of self-service and agile software development in the enterprise. Aligning process, technology, and information to solve increasingly complex release orchestration challenges, this Gartner “Cool Vendor in IT DevOps” upgrades the enterprise release management from spreadsheets, meetings, and email to an integrated dashboard giving release managers insight and control over large software releases.

@DevOpsSummit Stories
Many companies start their journey to the cloud in the DevOps environment, where software engineers want self-service access to the custom tools and frameworks they need. Machine learning technology can help IT departments keep up with these demands. In his session at 21st Cloud Expo, Ajay Gulati, Co-Founder, CTO and Board Member at ZeroStack, will discuss the use of machine learning for automating provisioning of DevOps resources, taking the burden off IT teams.
SYS-CON Events announced today that Cedexis will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Cedexis is the leader in data-driven enterprise global traffic management. Whether optimizing traffic through datacenters, clouds, CDNs, or any combination, Cedexis solutions drive quality and cost-effectiveness.
SYS-CON Events announced today that Enroute Lab will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Enroute Lab is an industrial design, research and development company of unmanned robotic vehicle system. For more information, please visit http://elab.co.jp/.
SYS-CON Events announced today that Mobile Create USA will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Mobile Create USA Inc. is an MVNO-based business model that uses portable communication devices and cellular-based infrastructure in the development, sales, operation and mobile communications systems incorporating GPS capability.
SYS-CON Events announced today that Suzuki Inc. will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Suzuki Inc. is a semiconductor-related business, including sales of consuming parts, parts repair, and maintenance for semiconductor manufacturing machines, etc. It is also a health care business providing experimental research for dementia, etc. For more information, visit http://www.e-suzuki.co.jp/en/.
With the rise of DevOps, containers are at the brink of becoming a pervasive technology in Enterprise IT to accelerate application delivery for the business. When it comes to adopting containers in the enterprise, security is the highest adoption barrier. Is your organization ready to address the security risks with containers for your DevOps environment? In his session at @DevOpsSummit at 21st Cloud Expo, Chris Van Tuin, Chief Technologist, NA West at Red Hat, will discuss: The top security risks with containers and how to manage these risks at scale including Images, Builds, Registry, Deployment, Hosts, Network, Storage, APIs, Monitoring/Logging, and Federation.
SYS-CON Events announced today that Massive Networks, that helps your business operate seamlessly with fast, reliable, and secure internet and network solutions, has been named "Exhibitor" of SYS-CON's 21st International Cloud Expo ®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. As a premier telecommunications provider, Massive Networks is headquartered out of Louisville, Colorado. With years of experience under their belt, their team of engineers can navigate the Carrier Ecosystem for your IT team acting as an extension of your business, producing a hassle-free experience.
SYS-CON Events announced today that Nihon Micron will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Nihon Micron Co., Ltd. strives for technological innovation to establish high-density, high-precision processing technology for providing printed circuit board and metal mount RFID tags used for communication devices. For more information, visit http://www.nihon-micron.co.jp/.
SYS-CON Events announced today that mruby Forum will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. mruby is the lightweight implementation of the Ruby language. We introduce mruby and the mruby IoT framework that enhances development productivity. For more information, visit http://forum.mruby.org/.
SYS-CON Events announced today that Ryobi Systems will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Ryobi Systems Co., Ltd., as an information service company, specialized in business support for local governments and medical industry. We are challenging to achive the precision farming with AI. For more information, visit http://www.ryobi-sol.co.jp/en/.
SYS-CON Events announced today that SIGMA Corporation will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. uLaser flow inspection device from the Japanese top share to Global Standard! Then, make the best use of data to flip to next page. For more information, visit http://www.sigma-k.co.jp/en/.
SYS-CON Events announced today that Daiya Industry will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Daiya Industry specializes in orthotic support systems and assistive devices with pneumatic artificial muscles in order to contribute to an extended healthy life expectancy. For more information, please visit https://www.daiyak.co.jp/en/.
Today traditional IT approaches leverage well-architected compute/networking domains to control what applications can access what data, and how. DevOps includes rapid application development/deployment leveraging concepts like containerization, third-party sourced applications and databases. Such applications need access to production data for its test and iteration cycles. Data Security? That sounds like a roadblock to DevOps vs. protecting the crown jewels to those in IT.
SYS-CON Events announced today that B2Cloud will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. B2Cloud specializes in IoT devices for preventive and predictive maintenance in any kind of equipment retrieving data like Energy consumption, working time, temperature, humidity, pressure, etc.
SYS-CON Events announced today that NetApp has been named “Bronze Sponsor” of SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. NetApp is the data authority for hybrid cloud. NetApp provides a full range of hybrid cloud data services that simplify management of applications and data across cloud and on-premises environments to accelerate digital transformation. Together with their partners, NetApp empowers global organizations to unleash the full potential of their data to expand customer touchpoints, foster greater innovation and optimize their operations.
Most of the time there is a lot of work involved to move to the cloud, and most of that isn't really related to AWS or Azure or Google Cloud. Before we talk about public cloud vendors and DevOps tools, there are usually several technical and non-technical challenges that are connected to it and that every company needs to solve to move to the cloud. In his session at 21st Cloud Expo, Stefano Bellasio, CEO and founder of Cloud Academy Inc., will discuss what the tools, disciplines, and cultural aspects are that enterprise companies are considering to get to the cloud and eventually transform the way they build software and services.
SYS-CON Events announced today that Interface Corporation will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Interface Corporation is a company developing, manufacturing and marketing high quality and wide variety of industrial computers and interface modules such as PCIs and PCI express. For more information, visit http://www.interface-amita.com/aboutus/interface_profile.asp.
SYS-CON Events announced today that Keisoku Research Consultant Co. will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Keisoku Research Consultant, Co. offers research and consulting in a wide range of civil engineering-related fields from information construction to preservation of cultural properties. For more information, visit http://www.krcnet.co.jp/eng_site/e_index.htm.
SYS-CON Events announced today that MIRAI Inc. will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. MIRAI Inc. are IT consultants from the public sector whose mission is to solve social issues by technology and innovation and to create a meaningful future for people.
SYS-CON Events announced today that Fusic will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Fusic Co. provides mocks as virtual IoT devices. You can customize mocks, and get any amount of data at any time in your test. For more information, visit https://fusic.co.jp/english/.
SYS-CON Events announced today that N3N will exhibit at SYS-CON's @ThingsExpo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. N3N’s solutions increase the effectiveness of operations and control centers, increase the value of IoT investments, and facilitate real-time operational decision making. N3N enables operations teams with a four dimensional digital “big board” that consolidates real-time live video feeds alongside IoT sensor data and analytics insights onto a single, holistic, display, focusing attention on what matters, when it matters.
Today most companies are adopting or evaluating container technology - Docker in particular - to speed up application deployment, drive down cost, ease management and make application delivery more flexible overall. As with most new architectures, this dream takes significant work to become a reality. Even when you do get your application componentized enough and packaged properly, there are still challenges for DevOps teams to making the shift to continuous delivery and achieving that reduction in cost and increase in speed. Sometimes in order to reduce complexity teams compromise features or change requirements
21st International Cloud Expo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy. Meanwhile, 94% of enterprises are using some form of XaaS – software, platform, and infrastructure as a service.
Agile has finally jumped the technology shark, expanding outside the software world. Enterprises are now increasingly adopting Agile practices across their organizations in order to successfully navigate the disruptive waters that threaten to drown them. In our quest for establishing change as a core competency in our organizations, this business-centric notion of Agile is an essential component of Agile Digital Transformation. In the years since the publication of the Agile Manifesto, the connection between building better software and business agility has been a tenuous one at best. But now that Agile is maturing and Digital Transformation is driving change across enterprises large and small, companies are realizing that their best bet for achieving business agility is to take the best of Agile and apply it across the entire organization.
While some developers care passionately about how data centers and clouds are architected, for most, it is only the end result that matters. To the majority of companies, technology exists to solve a business problem, and only delivers value when it is solving that problem. 2017 brings the mainstream adoption of containers for production workloads. In his session at 21st Cloud Expo, Ben McCormack, VP of Operations at Evernote, will discuss how data centers of the future will be managed, how the public cloud best suits your organization, and what the future holds for operations and infrastructure engineers in a post-container world. Is a serverless world inevitable?