Key Performance Metrics: Part 2 By @GrabnerAndi | @DevOpsSummit [#DevOps]

A look at the set of metrics captured from within the application server as well as the interaction with the database

Key Performance Metrics for Load Tests Beyond Response Time | Part 2

In Part 1 of this blog series, I explained which metrics on the Web Server, App Server and Host allow me to figure out how healthy the system and application environment are: Busy vs. Idle Threads, Throughput, CPU, Memory, etc.

In Part 2, I focus on the set of metrics captured from within the application server (#Exceptions, Errors, etc.) as well as the interaction with the database (connection pools, roundtrips to the database, amount of data loaded, etc.). Most of the screenshots shown in this blog come from real performance data shared by our Dynatrace Free Trial users who took part in my Share Your PurePath program, where I helped them analyze the data they captured. I also hope you comment on this blog and share your metrics with the larger performance testing community.

1. Top Database Activity Metrics
The database is accessed by the application. Therefore, I capture most of my database metrics from the application itself by looking at the SQL statements it executes:

  • Average # SQLs per User Over Time
    • If the number of SQLs per average user goes up, we most likely have a data-driven problem: the more data in the database, the more SQLs we execute.
    • Do we cache data, e.g. search results? Then this number should not go up but rather go down, as data should come from the cache.
  • Total # SQL Statements
    • Should at most go up in proportion to the number of simulated users.
    • Otherwise it is a sign of bad caching or data-driven problems.
  • Slowest SQL Statements
    • Are there individual SQLs that can be optimized, either at the SQL level or in the database?
    • Do we need additional indices?
    • Can we cache the result data of some of these heavy statements?
  • SQLs called very frequently
    • Do we have an N+1 Query Problem?
    • Can we cache some of that data if it is requested over and over again?

The following screenshot shows a custom dashboard showing the number of database statements executed over time and on average per transaction/user:

Over time, the number of SQLs per end user should go down, as certain data should be cached. Otherwise we may have data-driven or caching problems.
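To make the "SQLs per user over time" check concrete, here is a minimal Python sketch. It is not how Dynatrace computes the metric; the `Sample` record, the function names and the 10% tolerance are all assumptions made for illustration. It divides the statements executed in each interval by the number of active users and flags a run where the second half of the test averages noticeably more SQLs per user than the first half.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One measurement interval of a load test (hypothetical record layout)."""
    minute: int        # minutes since test start
    sql_count: int     # SQL statements executed in this interval
    active_users: int  # simulated users active in this interval


def sqls_per_user(samples):
    """Return (minute, SQLs per user) for every interval with at least one user."""
    return [
        (s.minute, s.sql_count / s.active_users)
        for s in samples
        if s.active_users > 0
    ]


def is_rising(series, tolerance=0.10):
    """True if the second half of the series averages more than
    (1 + tolerance) times the first half. The tolerance is an assumed
    threshold, not a recommended value."""
    values = [value for _, value in series]
    if len(values) < 2:
        return False
    half = len(values) // 2
    first = sum(values[:half]) / half
    second = sum(values[half:]) / (len(values) - half)
    return second > first * (1 + tolerance)


samples = [
    Sample(minute=0, sql_count=1000, active_users=100),
    Sample(minute=1, sql_count=1500, active_users=100),
    Sample(minute=2, sql_count=2400, active_users=120),
    Sample(minute=3, sql_count=3000, active_users=120),
]
series = sqls_per_user(samples)
rising = is_rising(series)
```

A rising ratio under constant per-user behavior points to the data-driven or caching problems described above; a flat or falling ratio is what effective caching should produce.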

The following screenshot shows my Database Dashboard, which provides several diagnostic options for identifying problematic database access patterns and slow SQLs:

Optimize individual SQLs, but also reduce the number of SQL executions where results can be cached.
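The "slowest statements" and "called very frequently" checks can be approximated from any log of executed statements. The sketch below is one possible way to do it in Python, under these assumptions: each log entry is a hypothetical `(transaction_id, sql_text, duration_ms)` tuple, statement text is already parameterized (`?` placeholders) so identical queries compare equal, and the repeat threshold of 10 is arbitrary.

```python
from collections import Counter, defaultdict


def find_repeated_statements(executions, threshold=10):
    """Flag statements executed at least `threshold` times inside a single
    transaction, a typical symptom of an N+1 query pattern."""
    per_txn = defaultdict(Counter)
    for txn_id, sql, _duration_ms in executions:
        per_txn[txn_id][sql] += 1
    suspects = [
        (txn_id, sql, count)
        for txn_id, counts in per_txn.items()
        for sql, count in counts.items()
        if count >= threshold
    ]
    # Most frequently repeated statement first
    return sorted(suspects, key=lambda s: s[2], reverse=True)


def slowest_statements(executions, top=5):
    """Rank statements by average duration; returns (sql, avg_ms, count)."""
    totals = defaultdict(lambda: [0, 0.0])  # sql -> [count, total ms]
    for _txn_id, sql, duration_ms in executions:
        totals[sql][0] += 1
        totals[sql][1] += duration_ms
    ranked = [
        (sql, total / count, count)
        for sql, (count, total) in totals.items()
    ]
    return sorted(ranked, key=lambda r: r[1], reverse=True)[:top]


ORDERS = "SELECT * FROM orders WHERE user_id = ?"
ITEMS = "SELECT * FROM items WHERE order_id = ?"
executions = (
    [("t1", ORDERS, 12.0)]
    + [("t1", ITEMS, 2.0)] * 25   # one query per order row: N+1 pattern
    + [("t2", ORDERS, 8.0)]
)
suspects = find_repeated_statements(executions)
slowest = slowest_statements(executions, top=1)
```

Statements flagged by the first function are candidates for a join, a batch fetch or caching; statements at the top of the second list are candidates for SQL tuning or additional indices.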

2. Top Connection Pool Metrics
Every application uses Connection Pools to access the database. Connection leaks, holding on to connections for too long, or improperly sized pools can result in performance problems. Here are my key metrics:

  • Connection Pool Utilization
    • Are the pools properly sized based on the expected load per runtime (JVM, CLR, PHP...)?
    • Are pools constantly exhausted? Do we have a connection leak?
  • Connection Acquisition Time
    • Are we perfectly configured, needing exactly the number of connections in the pool?
    • Or do we see increasing acquisition time (the time it takes to get a connection from the pool), which tells us we need more connections to fulfill the demand?

The following screenshot shows a custom dashboard showing JDBC Connection Pool Metrics captured from WebLogic via JMX:

Are connection pools correctly sized in relation to incoming transactions? Do we have connection leaks?

The following screenshot shows a Database Dashboard automatically calculating key metrics per connection pool:

Acquisition Time tells us how long a transaction needs to wait to acquire the next connection from the pool. This should be close to zero.
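If your pool does not expose these numbers (for example via JMX), they are simple to measure yourself. The following Python sketch shows the idea with a toy pool built on the standard library's `queue.Queue`; the `InstrumentedPool` class and its method names are hypothetical and not part of any real driver or application server.

```python
import queue
import time


class InstrumentedPool:
    """Toy fixed-size connection pool that records the two metrics above."""

    def __init__(self, connections):
        self._size = len(connections)
        self._idle = queue.Queue()
        for conn in connections:
            self._idle.put(conn)
        self.acquisition_times = []  # seconds waited per successful acquire

    def acquire(self, timeout=None):
        """Block until a connection is free; raises queue.Empty on timeout.
        Only successful acquisitions are recorded."""
        start = time.perf_counter()
        conn = self._idle.get(timeout=timeout)
        self.acquisition_times.append(time.perf_counter() - start)
        return conn

    def release(self, conn):
        """Return a connection to the pool. Forgetting this call is a leak."""
        self._idle.put(conn)

    def utilization(self):
        """Fraction of connections currently checked out (0.0 to 1.0)."""
        if self._size == 0:
            return 0.0
        return (self._size - self._idle.qsize()) / self._size


pool = InstrumentedPool(["conn-1", "conn-2"])
first = pool.acquire()
half_used = pool.utilization()
second = pool.acquire()
fully_used = pool.utilization()
try:
    pool.acquire(timeout=0.01)   # pool exhausted: the caller has to wait
    exhausted = False
except queue.Empty:
    exhausted = True
pool.release(first)
after_release = pool.utilization()
```

Utilization that stays at 1.0 even when load drops suggests a leak (connections that are never released), while acquisition times that grow with load suggest the pool is too small for the demand.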

    Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi
