Key Performance Metrics By @GrabnerAndi | @DevOpsSummit [#APM #DevOps]

Capture and analyze the metrics from the different application tiers and components in your application

Key Performance Metrics for Load Tests Beyond Response Time | Part I

Whether it is JMeter, SoapUI, Load Runner, SilkTest, Neotys or one of the cloud-based load testing solutions such as Keynote, Dynatrace (formerly Gomez) or others, breaking an application under heavy load is easy these days. Finding the problem based on automatically generated load testing reports is not. Can you tell me what is wrong based on the following reports?

Load testing reports alone show you that there is a problem - but not necessarily where you should look next

My Key Metrics from Web Server to Database
I've helped engineering organizations run or analyze load tests for the last 10 - 15 years. In this blog post I share the best practices and metrics I typically look at when analyzing load testing results. I do not rely on the out-of-the-box load testing reports alone; instead I extend them based on the capabilities of the tools, or bring in an APM tool such as Dynatrace to capture this type of data while the load testing tool drives the load.
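If your load testing tool allows it, tagging every generated request with a test-step name makes it much easier to slice that server-side data afterwards. The sketch below only illustrates that correlation idea in Python; in practice JMeter, LoadRunner and friends drive the load, and the header name X-Load-Test-Step, the URL and the user counts are placeholder assumptions, not part of any specific load testing or APM product.

```python
# Minimal sketch of a tagged load driver (illustration only - not a JMeter/LoadRunner
# replacement). Every request carries a test-step header so server-side logs and
# APM data can later be filtered per load-test transaction.
# Assumptions: the endpoint URL, header name and user counts are made up for the example.

import concurrent.futures
import urllib.request

BASE_URL = "http://localhost:8080/search"   # hypothetical endpoint under test
TEST_STEP = "Search-Product"                # name used to slice server-side data
VIRTUAL_USERS = 25
REQUESTS_PER_USER = 40

def virtual_user(user_id: int) -> int:
    ok = 0
    for _ in range(REQUESTS_PER_USER):
        req = urllib.request.Request(BASE_URL, headers={"X-Load-Test-Step": TEST_STEP})
        try:
            with urllib.request.urlopen(req, timeout=10) as resp:
                ok += resp.status == 200
        except OSError:
            pass  # failed requests will show up in the server-side metrics anyway
    return ok

with concurrent.futures.ThreadPoolExecutor(max_workers=VIRTUAL_USERS) as pool:
    successful = sum(pool.map(virtual_user, range(VIRTUAL_USERS)))
print(f"successful requests: {successful}")
```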

Some of the technical product screenshots in this blog are taken from users of our Dynatrace Free Trial who shared their data through my Share Your PurePath program. Thanks to all of them.

Now - if a load testing task is coming up for you, I hope you find most of the steps described here useful, as I believe they will make analyzing your results easier. Feel free to use Dynatrace (or any other APM tool, if you already have one) to capture and analyze the following metrics from the different application tiers and components in your application:

From web server to database there are key performance metrics to look at instead of spending too much time in the load testing report

Now - let me go into the details of these metrics, where to capture them from, and what they tell us. In this blog I focus on the Web Server, Application Server, Hosts and the Application Layers. The next blog will focus on the Database as well as Errors and Logging.

1. Top Web Server Metrics
On the Web Server (Apache, IIS, Nginx, ...) the following key metrics have proven extremely valuable for identifying problems in your deployment (a small capture sketch follows the list):

  • Busy and Idle Threads
    • Do you need more worker threads per web server?
    • Do you need more web servers?
    • Are threads busy for too long because of application performance hotspots?
  • Throughput
    • How many transactions / minute can we handle?
    • When do we need to scale out and add more web servers?
  • Bandwidth Requirements
    • Is the network the bottleneck?
    • Are our average pages too heavy?
    • Can we offload content to CDNs?
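To give an idea of how these numbers can be captured even without an APM module, here is a small sketch that polls Apache's mod_status endpoint (the machine-readable ?auto variant) and derives busy/idle workers, throughput and bandwidth between samples. It assumes mod_status is enabled and reachable at the URL below; Nginx's stub_status exposes similar counters in a slightly different format.

```python
# Sketch: poll Apache mod_status (?auto) during a load test and derive the three
# web-server metrics discussed above: busy/idle workers, throughput and bandwidth.
# Assumption: mod_status is enabled at STATUS_URL; the URL and interval are placeholders.

import time
import urllib.request

STATUS_URL = "http://localhost/server-status?auto"
INTERVAL_S = 60

def read_status() -> dict:
    with urllib.request.urlopen(STATUS_URL) as resp:
        text = resp.read().decode("utf-8")
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

prev = read_status()
while True:
    time.sleep(INTERVAL_S)
    cur = read_status()
    req_delta = int(cur["Total Accesses"]) - int(prev["Total Accesses"])
    kb_delta = int(cur["Total kBytes"]) - int(prev["Total kBytes"])
    print(f"busy={cur['BusyWorkers']} idle={cur['IdleWorkers']} "
          f"throughput={req_delta / (INTERVAL_S / 60):.0f} req/min "
          f"bandwidth={kb_delta / INTERVAL_S:.1f} KB/s")
    prev = cur
```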

For example, below we have a Web Server Process Health Dashboard, showing all of the metrics that are key for me. They get captured through a module placed in the Web Server:

Key metrics from your web server: worker threads, throughput and bandwidth

2. Top App Server Metrics
On the application server (Java, .NET, PHP) I focus on the following key metrics to identify deployment or configuration problems on your application servers (see the sampling sketch after this list):

  • Load Distribution
    • How many transactions are handled by each JVM/CLR/PHP engine?
    • Are they equally load balanced?
    • Do we need more application servers to handle the load?
  • CPU Hotspots
    • How much CPU is needed for this tested load?
    • Is high CPU caused by bad programming that can be fixed?
    • Or do we need more CPU power?
  • Worker Threads
    • Is the number of worker threads correctly configured?
    • Are worker threads busy because the application servers are not ready?
    • Are there any web server modules that block these threads?
  • Memory Issues
    • Do we see bad memory patterns? Do we have a memory leak?
    • What's the impact of garbage collection on CPU and transaction throughput?
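If no APM agent is in place, at least the process-level part of these metrics can be sampled with a small script. The sketch below uses the third-party psutil package to track CPU, resident memory and thread count of the application-server process during the test; it cannot see inside the JVM/CLR (no GC times or worker-pool state), so treat it as a lightweight stand-in, and the process name match is an assumption you will need to adjust.

```python
# Sketch: sample process health (CPU, memory, threads) of the app-server process
# while the load test runs. Requires the third-party psutil package.
# Assumption: the process is found by a simple name match ("java" here); adjust it
# for .NET, PHP-FPM or node.js. GC and worker-pool details still need an agent,
# GC logs or JMX.

import time
import psutil

PROCESS_NAME = "java"   # hypothetical name of the app-server process
INTERVAL_S = 10

def find_app_server() -> psutil.Process:
    for proc in psutil.process_iter(["name"]):
        if proc.info["name"] and PROCESS_NAME in proc.info["name"].lower():
            return proc
    raise RuntimeError(f"no process matching {PROCESS_NAME!r} found")

proc = find_app_server()
proc.cpu_percent(None)  # prime the CPU counter; the first reading is meaningless
while True:
    time.sleep(INTERVAL_S)
    cpu = proc.cpu_percent(None)               # percent of one core since last call
    rss_mb = proc.memory_info().rss / 1024 ** 2
    print(f"cpu={cpu:.1f}% rss={rss_mb:.0f}MB threads={proc.num_threads()}")
```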

The following screenshot shows my Process Health Dashboard. All data is automatically captured via an agent injected into your Java, .NET, PHP or node.js engine:

Key metrics from your app server: worker threads, CPU, memory and throughput

For insight into hosts and the application layers, click here for the full article.

About the Author

Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within the Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi.
