
Production: Performance Where It Really Matters By @GrabnerAndi | @DevOpsSummit #APM #DevOps

DevPOps (like DevOps) features fast feedback loops at its core

"Production is where performance matters most, as it directly impacts our end users and ultimately decides whether our software will be successful or not. Efforts to create test conditions and environments exactly like Production will always fall short; nothing compares to production!"

These were the opening lines of my invitation encouraging performance practitioners to apply for the recent WOPR24 (Workshop on Performance and Reliability). Thirteen performance gurus answered the call and contributed to the event with their experience reports and participation in the workshops. Special thanks to organizers Eric Proegler and Mais Tawfik! My key takeaway from WOPR24 is that Performance Engineering as we know it is changing: it is turning away from traditional load testing and toward production and Continuous Integration, with performance engineering becoming the link between Dev and Ops in DevOps. Really, who would have thought?

STOP Being a Load Tester - Become a DevPOps
The most interesting observation for most of us attending the workshop was that the role and focus of a performance engineering team is shifting toward continuous but shorter performance tests in continuous integration, operations monitoring, performance engineering in production, and acting as the link that provides metrics-based feedback to the business and to engineering. On the last day of the event we coined the term "DevPOps," with the Performance team as the missing link that ensures the software performs in production as intended, based on all the work done in engineering and the performance testing prior to deployment. DevPOps (like DevOps) features fast feedback loops at its core. These feedback loops need to flow not only from production monitoring back to testing, but all the way back to engineering, to show how the software really behaves under actual load conditions. The role of the DevPOps team therefore includes a variety of new responsibilities in addition to traditional load testing:

  • Automated Continuous Performance Engineering in CI
      ◦ Shift-Left performance metrics into Jenkins, Bamboo and Co
      ◦ Find regressions based on architectural metrics and stop the build (see the sketch after this list)
  • Define Monitoring Metrics and Dashboards
      ◦ Find relevant metrics for CxO, Engineering, Biz and Ops
      ◦ Build monitoring infrastructure for both test and production
  • Load and Performance Tests to verify stability, scalability and monitoring
      ◦ Run them in production or production-like environments
      ◦ Verify monitoring metrics with stakeholders
  • Monitor Production, Compare with Test, Report to Stakeholders
      ◦ Identify regressions between deployments and the test environment
      ◦ Communicate and discuss metrics with CxO, Engineering, Biz and Ops
  • Continually optimize deployment configuration
      ◦ Handle peak loads with scaling infrastructure
      ◦ Identify and reduce bot traffic (which, on average, accounts for about 70% of web traffic)
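
As a concrete (if simplified) illustration of "stop the build": a small gate program that a Jenkins or Bamboo step can run after the tests. This is only a sketch; the metrics file name, the metric keys and the threshold values are my own assumptions, not the output format of any particular tool.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Properties;

// Hypothetical CI gate: reads the per-build metrics produced by the test run
// (file name and keys are illustrative) and fails the build step when an
// architectural metric exceeds its agreed threshold.
public class PerformanceGate {

    // Thresholds a team might agree on per feature or test (illustrative values).
    private static final Map<String, Long> THRESHOLDS = Map.of(
            "sqlStatementsPerRequest", 20L,
            "remoteCallsPerRequest", 5L,
            "bytesTransferredPerRequest", 500_000L);

    public static void main(String[] args) throws IOException {
        Properties metrics = new Properties();
        try (var in = Files.newInputStream(Path.of("build-metrics.properties"))) {
            metrics.load(in);
        }

        boolean violated = false;
        for (var entry : THRESHOLDS.entrySet()) {
            long actual = Long.parseLong(metrics.getProperty(entry.getKey(), "0"));
            if (actual > entry.getValue()) {
                System.err.printf("FAIL %s: %d exceeds threshold %d%n",
                        entry.getKey(), actual, entry.getValue());
                violated = true;
            }
        }

        // A non-zero exit code is what lets Jenkins/Bamboo mark the step as failed.
        System.exit(violated ? 1 : 0);
    }
}
```

Because the process exits non-zero on a violation, the CI server marks the step as failed, which is all "stopping the build" really requires.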

Automated Continuous Performance Engineering in CI
Based on my personal experience, you don't need to execute large-scale load tests to find most of the problems that will later result in poor performance or scalability issues. Why? Because they are typically architectural in nature and can be found by executing integration tests, API tests, or a very small-scale load test. The number one problem I find is inefficient access to the data store (Database, O/R Mapper, REST Service, Cache), caused by querying too much data, using too many round trips to obtain the data, or not using optimized queries. A data-driven problem like the one illustrated here, where incorrect usage of Hibernate caused a single feature to execute thousands of individual SQL statements, was identified simply by looking at the number of SQL statements executed by the integration test. If the same SQL statement is executed more than once, you likely have a potential scalability issue.

Hook up your integration or API tests with profiling or tracing tools to capture access to your data layer
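
Here is a minimal sketch of what such a hook-up can look like using plain Hibernate statistics inside a JUnit test. The SessionFactory wiring (TestPersistence) and the ProductSearchService feature under test are hypothetical placeholders for your own code; the Statistics calls themselves are standard Hibernate API.

```java
import org.hibernate.SessionFactory;
import org.hibernate.stat.Statistics;
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertTrue;

// Sketch of an integration test that asserts on data-access counts instead of
// response time. TestPersistence and ProductSearchService are placeholders.
class ProductSearchDataAccessTest {

    private final SessionFactory sessionFactory = TestPersistence.sessionFactory(); // hypothetical helper

    @Test
    void searchShouldNotTriggerAnNPlusOneQueryPattern() {
        Statistics stats = sessionFactory.getStatistics();
        stats.setStatisticsEnabled(true);
        stats.clear();

        new ProductSearchService(sessionFactory).searchProducts("laptop"); // feature under test (placeholder)

        long prepared = stats.getPrepareStatementCount();
        System.out.println("SQL statements prepared: " + prepared);

        // Architectural assertion: a single search should not need more than a
        // handful of statements. Thousands here usually means an N+1 problem.
        assertTrue(prepared <= 10, "Too many SQL statements: " + prepared);

        // A query executed many times within one test run is a second warning
        // sign, often pointing at the same N+1 access pattern.
        for (String query : stats.getQueries()) {
            long executions = stats.getQueryStatistics(query).getExecutionCount();
            assertTrue(executions <= 1, "Query executed " + executions + " times: " + query);
        }
    }
}
```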

Once you determine how to capture these details per test, you can use these measurement points to identify regressions across builds, as shown here:

Identify changes in code behavior to spot bad code changes as soon as possible
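
One way to turn those measurement points into an automated build-over-build check, assuming each CI run writes its per-test metrics to a simple properties file (the file names and the 20% tolerance below are my own assumptions, not a prescribed format):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Illustrative regression check across builds: the previous run's metrics act
// as the baseline for the current run; anything that got noticeably worse
// flags a regression and fails the step.
public class BuildOverBuildComparison {

    public static void main(String[] args) throws IOException {
        Map<String, Long> baseline = load("metrics-previous-build.properties");
        Map<String, Long> current  = load("metrics-current-build.properties");

        boolean regression = false;
        for (Map.Entry<String, Long> entry : current.entrySet()) {
            Long before = baseline.get(entry.getKey());
            if (before == null || before == 0) continue; // new measurement point, nothing to compare yet
            double change = (entry.getValue() - before) / (double) before;
            if (change > 0.20) { // more than 20% worse than the last build
                System.err.printf("Regression in %s: %d -> %d (+%.0f%%)%n",
                        entry.getKey(), before, entry.getValue(), change * 100);
                regression = true;
            }
        }
        System.exit(regression ? 1 : 0);
    }

    private static Map<String, Long> load(String file) throws IOException {
        Properties props = new Properties();
        try (var in = Files.newInputStream(Path.of(file))) {
            props.load(in);
        }
        Map<String, Long> metrics = new HashMap<>();
        props.forEach((k, v) -> metrics.put(k.toString(), Long.parseLong(v.toString())));
        return metrics;
    }
}
```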

Define Monitoring Metrics and Dashboards
We had one special exercise during WOPR where we divided participants into three teams to create "the perfect dashboard" showing important metrics for three fictitious businesses: eCommerce, SaaS and Enterprise Corporation (aka "Evil Corp" :-)). Interestingly, all three teams reached similar conclusions (a rough sketch in code follows the list):

  1. High-Level Business Metrics consumable for EVERYONE in the organization
  2. Aggregated Status per team or business unit
  3. More specific dashboards to review further
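
To make the three tiers a bit more tangible, here is a tiny data-model sketch of how they nest into each other; the audience and metric names are invented for illustration, and any real dashboarding tool would use its own configuration format.

```java
import java.util.List;

// Minimal sketch of the three dashboard tiers, modelled as plain data:
// company-wide business metrics on top, aggregated per-team status in the
// middle, and more specific drill-down dashboards underneath.
public class DashboardModel {

    record Metric(String name, String unit) {}
    record Dashboard(String audience, List<Metric> metrics, List<Dashboard> drillDowns) {}

    public static void main(String[] args) {
        Dashboard checkoutTeam = new Dashboard("Checkout engineering team",
                List.of(new Metric("checkout response time p95", "ms"),
                        new Metric("failed payments", "count/min")),
                List.of());

        Dashboard businessUnit = new Dashboard("eCommerce business unit",
                List.of(new Metric("conversion rate", "%"),
                        new Metric("revenue per hour", "USD")),
                List.of(checkoutTeam));

        Dashboard companyWide = new Dashboard("Everyone in the organization",
                List.of(new Metric("orders per minute", "count"),
                        new Metric("site availability", "%")),
                List.of(businessUnit));

        System.out.println(companyWide);
    }
}
```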

Click here for the full article.

More Stories By Andreas Grabner

Andreas Grabner has been helping companies improve their application performance for more than 15 years. He is a regular contributor to the Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi.
