Welcome!

DevOps Journal Authors: Yeshim Deniz, Elizabeth White, Mike Kavis, Liz McMillan, Carmen Gonzalez

Related Topics: DevOps Journal, Java, Linux, AJAX & REA, Cloud Expo, Big Data Journal

DevOps Journal: Blog Post

What If It Is the Network? Dive Deep to Find the Root Cause

How do you identify the real cause behind the network problems?

Modern Application Performance Management (APM) solutions can be tremendously helpful in delivering end-to-end visibility into the application delivery chain: across all tiers and network sections, all the way to the end user. In previous blog posts we showed how to narrow down to various root causes of the problems that the end users might experience. Those issues ranged from infrastructure through application and network, and through the end-user client application or inefficient use of the software. When the problem comes from the end user application, e.g., a Web 2.0 Web site, user experience management (UEM) solutions can offer broad analysis of possible root causes. Similarly, when an APM fault domain algorithm points to the application, the DevOps team can go deep into the actually executed code and database queries to identify the root cause of the problem.

But what do you do when your APM tool points to the network as the fault domain? How do you identify the real cause behind the network problems? Most of the APM tools stop there, forcing the network team to use separate solutions to monitor the actual network packets.

In this article we show how an Application-Aware Network Performance Management (AANPM) suite can be used to not only zero in on the network problems as the fault domain, but also dive deeper to show the actual trace of network packets in the selected context, captured back at the time when the problem happened.

Isolating Fault Domain to the Network
In one of our blog posts we wrote how Fonterra used our APM tools to identify the problem with SAP application used in the milk churn scanning process. The operations team could easily isolate the fault domain to network problems (see Figure 1); they required, however, further analysis to identify the root cause behind that network problem.

Figure 1: The performance report indicates network problems as the fault domain

In some cases information about loss rate or zero window events is enough to successfully find and resolve the problem. In general, finding the root cause may require you to analyze more detailed, packet level views in order to see exactly what is causing this network performance problem. These details can not only help to determine why we experienced packet loss or zero window events, but also whether the problem was gradually ramping up or if there was a sudden flow control blockage, which would indicate congestion.

For example, a number of users start to experience performance degradation of the service and APM points to the network as the fault domain. The detailed, packet-level analysis can show that the whole service delivery process was blocked by failed initial name resolution.

What Really Happened in the Network?
Why is detailed packet-level analysis so important when our AANPM points to the network?

Let's first consider what happens when we determine fault domain with one of the application delivery tiers. The engineers responsible for that application can start analyzing logs or, better, drill down to single transaction execution steps and often isolate the problem to the actual line of code that was causing the whole performance degradation of the whole application.

However, when our AANPM tells us it is the network, there are no logs or code execution steps to drill down to. Unless we can deliver conclusive and actionable evidence in the form of detailed, packet-level analysis, the network team might have a problem determining the root cause and may remain skeptical whether the network is at fault at all.

This is exactly what happened to one of our customers. An APM solution had correctly identified that there was a performance problem with the web server. The reports showed who was affected and where the users affected by that problem were located when the problem was occurring. The system also pointed toward the network as the primary fault domain.

The network team tried to determine the root cause of the problem. They needed packet level data for that. But, capturing all traffic with a network protocol analyzer after the incident happened not only overloaded the IT team with unnecessary data, but eventually turned out to be a hit and miss.

What the team needed were the network packets at the time the problem occurred, and only those few packets that related to the actual communication realizing affected transactions.

Figure 2: You can drill down to analyze captured network packets in the context of given user operations

For Figure 3, and further insight, click here for the full article.

More Stories By Sebastian Kruk

Sebastian Kruk is a Technical Product Strategist, Center of Excellence, at Compuware APM Business Unit.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@DevOpsSummit Stories
When an enterprise builds a hybrid IaaS cloud connecting its data center to one or more public clouds, security is often a major topic along with the other challenges involved. Security is closely intertwined with the networking choices made for the hybrid cloud. Traditional networking approaches for building a hybrid cloud try to kludge together the enterprise infrastructure with the public cloud. Consequently this approach requires risky, deep "surgery" including changes to firewalls, subnets and other modifications to the corporate security infrastructure. Connecting a public cloud to the ...
SYS-CON Events announced today that Serena Software will exhibit at DevOps Summit Silicon Valley, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Serena Software supports DevOps and Continuous Delivery by providing application deployment automation and software release management solutions to replace slow and error-prone manual processes. 2,500 enterprises around the world trust Serena to help them develop and deploy better software.
For better or worse, DevOps has gone mainstream. All doubt was removed when IBM and HP threw up their respective DevOps microsites. Where are we on the hype cycle? It's hard to say for sure but there's a feeling we're heading for the "Peak of Inflated Expectations". What does this mean for the Enterprise? Should they avoid DevOps? Definitely not. Should they be cautious though? Absolutely. The truth is that DevOps and the Enterprise are at best strange bedfellows. The movement has its roots in the tech community's elite. Open source projects and methodologies driven by the alumni of companies ...
SYS-CON Events announced today that Grid Dynamics, the leading provider of scalable eCommerce technology solutions, will exhibit at DevOps Summit Silicon Valley, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Grid Dynamics is a leading provider of open, scalable, next-generation commerce technology solutions for Tier 1 retail. Grid Dynamics has in-depth expertise in commerce technologies, wide involvement in the open source community and a modern, global workforce. Great companies, partnered with Grid Dynamics, gain a sustainable business...
Docker offers a new, lightweight approach to application portability. Applications are shipped using a common container format and managed with a high-level API. Their processes run within isolated namespaces that abstract the operating environment independently of the distribution, versions, network setup, and other details of this environment. This "containerization" has often been nicknamed "the new virtualization." But containers are more than lightweight virtual machines. Beyond their smaller footprint, shorter boot times, and higher consolidation factors, they also bring a lot of new fea...
The 4th International DevOps Summit, co-located with16th International Cloud Expo – being held June 9-11, 2015, at the Javits Center in New York City, NY – announces that its Call for Papers is now open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's largest enterprises – and delivering real results. Among the proven benefits, DevOps is correlated with 2...
We are all here because we are sold on the transformative promise of The Cloud. But what good is all of this ephemeral, on-demand infrastructure if your usage doesn't actually improve the agility and speed of your business? How must Operations adapt in order to avoid stifling your Cloud initiative? In his session at DevOps Summit, Damon Edwards, co-founder and managing partner of the DTO Solutions, will highlight the successful organizational, process, and tooling patterns of high-performing companies that have reshaped their Operations to enable the business to get full value from their Clo...
SYS-CON Events announced today that O'Reilly Media has been named “Media Sponsor” of SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. O'Reilly Media spreads the knowledge of innovators through its books, online services, magazines, and conferences. Since 1978, O'Reilly Media has been a chronicler and catalyst of cutting-edge development, homing in on the technology trends that really matter and spurring their adoption by amplifying "faint signals" from the alpha geeks who are creating the future. An...
SYS-CON Events announced today that Gigaom Research has been named "Media Sponsor" of SYS-CON's 15th International Cloud Expo®, which will take place on November 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Ashar Baig, Research Director, Cloud, at Gigaom Research, will also lead a Power Panel on the topic "Choosing the Right Cloud Option." Gigaom Research provides timely, in-depth analysis of emerging technologies for individual and corporate subscribers. Gigaom Research's network of 200+ independent analysts provides new content daily that bridges the gap between break...
Sanjeev Sharma is the latest author to join DevOps Journal. Sanjeev is a solution architect and DevOps Worldwide lead with Rational Software, an IBM brand and the author of 'DevOps for Dummies.' DevOps Journal is focused on this critical enterprise IT topic in the world of cloud computing. SYS-CON Media CEO Carmen Gonzalez is founder and publisher of DevOps Journal, and Roger Strukhoff, long-time SYS-CON editor and the conference chair of Cloud Expo is the editor of the world's leading DevOps resource.
SYS-CON Events announced today that IBM is holding a Bluemix Developer Playground on November 5, 10:30 am to 5:30 pm at 15th Cloud Expo. 15th Cloud Expo, co-located with @ThingsExpo, Big Data Expo, and DevOps Summit is taking place Nov. 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. The labs, for developers of all levels, will highlight the ease of use of Bluemix, its services and functionality and provide short-term introductory projects that developers can complete between sessions. Developers will be able spend as much time as they want working on specific DevOps pro...
When you set off to build an app that will change the world, designing your system architecture to be reliable and scalable is important but the stark reality is that, for your MVP, you probably had a “need for speed” (of development). You didn’t know what all the axes were to scale your application, where your stress points would be, and what weird and wonderful ways your customers would use it down the road. In a world of zero-downtime services, landing the plane to figure it out is not an option. In his session at DevOps Summit, Andrew Miklas, CTO of PagerDuty, will share lessons learned ...
SYS-CON Events announced today that SOASTA, the leader in cloud and mobile testing, will exhibit at DevOps Summit Silicon Valley, which will take place November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. SOASTA is the leader in cloud testing. Its web and mobile test automation and monitoring solutions, CloudTest, TouchTest and mPulse, enable developers, QA professionals and IT operations teams to test and monitor users with unprecedented speed, scale, precision and visibility. The innovative product set streamlines test creation, automates provisioning and execution, ...
Founded in 1997, ActiveState is a global leader providing software application development and management solutions. The Company's products include: Stackato, a commercially supported Platform-as-a-Service (PaaS) that harnesses open source technologies such as Cloud Foundry and Docker; dynamic language distributions ActivePerl, ActivePython and ActiveTcl; and developer tools such as the popular Komodo Edit and Komodo IDE. Headquartered in Vancouver, Canada, ActiveState is trusted by customers and partners worldwide, across many industries including telecommunications, aerospace, software, fina...
SYS-CON Events announced today that ElasticBox is holding a Hackathon at DevOps Summit, November 6 from 12 pm -4 pm at the Santa Clara Convention Center in Santa Clara, CA. You can enter as an individual or team of up to 10 developers. A New Star Is Born Every Month! All completed ElasticBoxes will then be sent to a judging panel - 12 winners will be featured on the ElasticBox website in 2015. All entrants will receive five full enterprise licenses for one year + ElasticBox headphones + ElasticBox T-shirt. Winners can also choose to interview with ElasticBox to join one of the fastest growi...
SYS-CON Events announced today that Calm.io has been named “Bronze Sponsor” of DevOps Summit Silicon Valley, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Calm.io is a cloud orchestration platform for AWS, vCenter, OpenStack, or bare metal, that runs your CL tools puppet, Chef, shell, git, Jenkins, nagios, and will soon support New Relic and Docker. It can run hosted, or on premise and provides VM automation / expiry, self-service portals, audit, approvals, and budgeting.
Blue Box has closed a $10 million Series B financing. The round was led by a strategic investor and included participation from prior investors including Voyager Capital and Founders Collective, as well as the Blue Box executive team. This round follows a $4.3 million Series A closed in December of 2012 and led by Voyager Capital. In May of this year, the company announced general availability of its private cloud as a service offering, Blue Box Cloud. Since that release, the company has demonstrated market validation through customer adoption, positive reviews from industry analysts and k...
The speed of product development has increased massively in the past 10 years. At the same time our formal secure development and SDL methodologies have fallen behind. This forces product developers to choose between rapid release times and security. In his session at DevOps Summit, Michael Murray, Director of Cyber Security Consulting and Assessment at GE Healthcare, will examine the problems and present some solutions for moving security in to the DevOps lifecycle to ensure that we get fast AND secure.
SYS-CON Events announced today that Zentera Systems, an industry visionary delivering hybrid-cloud management solutions, will exhibit at DevOps Summit Silicon Valley, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Zentera Systems, Inc.™ is a Silicon Valley based private company, providing a Cloud Federation Platform (CFP) built on a virtualization architecture with patent-pending technology to address virtual network, cloud firewall, data protection and transport automation within and across cloud domains. Zentera is solving the security ...
Software development, like manufacturing, is a craft that requires the application of creative approaches to solve problems given a wide range of constraints. However, while engineering design may be craftwork, the production of most designed objects relies on a standardized and automated manufacturing process. By contrast, much of moving an application from prototype to production and, indeed, maintaining the application through its lifecycle has often remained craftwork. In his session at DevOps Summit, Gordon Haff, senior cloud strategy marketing and evangelism manager at Red Hat, will di...