Don’t Trust Your Log Files: How and Why to Monitor All Exceptions

I would say that only one out of a million exceptions thrown in an application actually makes it to a log file, unless you run your application in verbose logging mode. Do you agree? No? Here is why I think so: most exceptions are handled by your code or by the frameworks your app uses. Here is a chart from an enterprise application showing that about 4000x more custom application exception objects are thrown than important log messages are written:

4000 times more Exceptions than log messages: Can they be ignored? What's their impact?

Why worry about these exceptions that nobody cares to write to a log file? Two reasons:

  1. They are typically thrown for a good reason and therefore indicate a problem, e.g., configuration issues in frameworks or runtime problems
  2. Every Exception object is a potential performance problem because the JVM needs to allocate memory for it, capture the stack trace, and dispose of the object soon after
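
The second point can be made concrete with a small Java sketch. It is not a benchmark and not taken from the application in the chart. It only shows that an exception records its stack trace when it is constructed, even if nobody ever logs or inspects it. The class names `ExceptionCostSketch` and `LightweightException` are hypothetical and chosen purely for illustration.

```java
/** Hypothetical demo class; none of these names come from the article. */
final class ExceptionCostSketch {

    /** Opts out of stack-trace capture via the protected RuntimeException constructor. */
    static final class LightweightException extends RuntimeException {
        LightweightException(String message) {
            // enableSuppression = false, writableStackTrace = false
            super(message, null, false, false);
        }
    }

    /** Number of stack frames the throwable recorded when it was constructed. */
    static int recordedFrames(Throwable t) {
        return t.getStackTrace().length;
    }

    public static void main(String[] args) {
        // Neither exception is thrown; the stack is captured in the constructor.
        Throwable regular = new RuntimeException("handled somewhere, never logged");
        Throwable lightweight = new LightweightException("same message, no stack capture");

        System.out.println("regular exception frames:     " + recordedFrames(regular));
        System.out.println("lightweight exception frames: " + recordedFrames(lightweight));
    }
}
```

The regular exception carries one frame per method on the call stack at the point of construction, so the work grows with stack depth. The lightweight variant reports zero frames, which is one reason some frameworks use it for exceptions that serve as control flow. That reduces the cost but does nothing about the first point: the exception was still thrown for a reason.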

Reason #1: Configuration Problems
The following shows a transaction in which the method getImagePath makes a web service call to a back-end server using HttpClient. getImagePath is configured with an HTTP endpoint URL, but the web service supports only HTTPS (SSL), so the call fails with an SSLException. getImagePath retries three times, then gives up and returns a default value to the caller. No log entry is written and no exception is thrown to the caller. Everything seems okay to the outside world, even though the impact on the end user is severe: they wait longer than necessary for an image that they never get to see:

Exceptions are highlighting configuration problems (wrong URL) but the calling method is not doing anything with that information
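
The source of getImagePath is not shown in the article, so the following Java sketch is only a guess at the pattern the transaction trace describes. The `ImageService` interface stands in for the real HttpClient call, and the default path, the example URLs, and the class name `SilentRetrySketch` are all assumptions.

```java
import java.io.IOException;
import javax.net.ssl.SSLException;

/** Hypothetical reconstruction of the silent retry pattern; not the article's code. */
final class SilentRetrySketch {

    /** Stand-in for the HttpClient-based web service call (assumed interface). */
    interface ImageService {
        String fetchImagePath(String endpointUrl) throws IOException;
    }

    static final String DEFAULT_IMAGE_PATH = "/images/default.png"; // assumed default
    static final int MAX_ATTEMPTS = 3;

    static String getImagePath(ImageService service, String endpointUrl) {
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try {
                return service.fetchImagePath(endpointUrl);
            } catch (IOException e) {
                // Swallowed: no log entry, nothing rethrown, just another attempt.
            }
        }
        // The caller cannot tell this apart from a genuine answer.
        return DEFAULT_IMAGE_PATH;
    }

    public static void main(String[] args) {
        // Simulates a back end that accepts only HTTPS.
        ImageService httpsOnly = url -> {
            if (!url.startsWith("https://")) {
                throw new SSLException("endpoint supports HTTPS only");
            }
            return "/images/user-specific.png";
        };

        // Misconfigured HTTP URL: three failed calls, yet the caller sees a normal return.
        System.out.println(getImagePath(httpsOnly, "http://backend.example/images"));
        // Correct HTTPS URL: the real path comes back on the first attempt.
        System.out.println(getImagePath(httpsOnly, "https://backend.example/images"));
    }
}
```

Since SSLException is a subclass of IOException, a broad catch block like this one hides a configuration error just as easily as a transient network failure.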

Key Takeaways:

  • End Users: This code runs for every user who makes this request, and none of them will get the correct image path. Additionally, each user waits several seconds for it. We all know what users will do if they have to wait too long.
  • Business: If your app delivers dynamic user-specific content, e.g., recommendations for that user, you need to ensure that no configuration problem causes your app to deliver incorrect content. As a business owner you want to be alerted when a problem in the app causes incorrect responses to your users.
  • Operations: When users complain, there is no documented evidence of a problem (nothing in a log file). Make sure to monitor outgoing web requests and the status of these calls; this helps you identify requests that start failing or stop delivering what they are supposed to deliver.
  • Developers: Everything probably worked well when they tested this web service in their own environment, where they used a dummy or mocked web service endpoint. Make sure to add logging for these situations and let Operations know how to configure these endpoints.
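
One way to act on the Operations and Developers takeaways is sketched below in Java, under the same assumptions as the transaction described above. Each failed call is logged with its exception and counted, so the fallback to a default value leaves evidence behind. The `ImageService` interface, the `FAILED_CALLS` counter, and the class name `ObservableRetrySketch` are hypothetical. A real application would more likely report to its existing monitoring solution than to a static counter.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch: the same retry loop, but failures are logged and counted. */
final class ObservableRetrySketch {

    /** Stand-in for the HttpClient-based web service call (assumed interface). */
    interface ImageService {
        String fetchImagePath(String endpointUrl) throws IOException;
    }

    private static final Logger LOG = Logger.getLogger(ObservableRetrySketch.class.getName());

    /** Failed outgoing calls; a monitoring agent or metrics endpoint could read this. */
    static final AtomicLong FAILED_CALLS = new AtomicLong();

    static final String DEFAULT_IMAGE_PATH = "/images/default.png"; // assumed default
    static final int MAX_ATTEMPTS = 3;

    static String getImagePath(ImageService service, String endpointUrl) {
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            try {
                return service.fetchImagePath(endpointUrl);
            } catch (IOException e) {
                FAILED_CALLS.incrementAndGet();
                // Passing the exception keeps its type (e.g. SSLException) and stack trace.
                LOG.log(Level.WARNING, "Image service call failed (attempt " + attempt
                        + " of " + MAX_ATTEMPTS + ") for " + endpointUrl, e);
            }
        }
        LOG.severe("Giving up on " + endpointUrl + "; returning default image path");
        return DEFAULT_IMAGE_PATH;
    }

    public static void main(String[] args) {
        ImageService alwaysFails = url -> {
            throw new IOException("simulated failure for " + url);
        };
        System.out.println(getImagePath(alwaysFails, "http://backend.example/images"));
        System.out.println("failed outgoing calls: " + FAILED_CALLS.get());
    }
}
```

The user still gets the default image, but the log now names the endpoint and the exception type, which is enough for Operations to spot a wrong URL.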

Reason #2 and further insights are covered in the full article.

More Stories By Andreas Grabner

Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi
