Break Down the Silos: Correlate Data Between Vendors

The complexity of modern infrastructure makes it difficult to avoid silos

By Chris Riley

Thanks to the DevOps movement, we now understand why software delivery chains that consist of a series of silos are bad. They complicate communication between different teams, leading to delivery delays, backtracking, and bugs.

When it comes to incident management, there is another type of silo to contend with - the kind that separates one vendor's or product's incident management data from another's. These silos hamper incident resolution, because they make it more difficult to collect and analyze monitoring data from multiple sources.

How do you break down these silos to keep incident management operations flowing efficiently?

Identify the Silos
The first step in working past incident management silos is to understand why silos exist in the first place.

The reason is simple: Modern infrastructure consists of diverse hardware and software. Most components have special monitoring needs. They output information in a certain format, according to a certain rhythm, and they require data to be collected in a certain way. The monitoring information associated with each part of the infrastructure, therefore, lives in a silo, because it is not readily comparable to data from other parts of the infrastructure.

As a basic example, take a datacenter that consists of ten bare-metal servers running Windows and another ten bare-metal servers that run Linux. In this scenario, the company would require different monitoring tools for its Windows and Linux servers. Although some of the monitoring information for each type of operating system (such as whether the host is up) would be the same, other data would not be. And either way, the data would need to be collected by tools that are compatible with the operating system in question. Each context, therefore, becomes a distinct silo, with its own miniature ecosystem of monitoring tools and data.

This is just a simple example, by the way. Things are much more complicated in most real-world settings, where you would have to monitor not just two different types of bare-metal servers, but also virtual servers running on top of one or more types of hypervisors, workstations running different desktop operating systems, and mobile devices powered by a wide array of mobile operating systems and versions.

Break Down Silos
How do you eliminate the silos that separate each monitoring context within your infrastructure so that you get seamless and holistic monitoring visibility? The solution has two parts.

Step 1: Centralize Data Collection
The first step is to implement an incident management solution that can collect information from diverse types of environments, then forward that information to a central location. This way, engineers can monitor the entire infrastructure from a single vantage point. They don't need to go looking inside individual silos to monitor different parts of the infrastructure.

Centralized data collection requires an incident management solution that is smart enough to aggregate monitoring information from multiple sources. This is no trivial task; supporting a wide range of environments and endpoints requires integration with many different types of monitoring systems, sometimes even custom tooling.
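As a rough illustration of this step, the following Python sketch polls several per-environment collectors and forwards their records to one central store. Everything in it is hypothetical: the CentralStore class, the two collector functions, and the sample payloads are stand-ins invented for this example, not the API of any particular product.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A collector is any function that returns raw alert records from one tool.
Collector = Callable[[], List[dict]]


@dataclass
class CentralStore:
    """Hypothetical single vantage point that pulls from every registered silo."""
    collectors: Dict[str, Collector] = field(default_factory=dict)
    events: List[dict] = field(default_factory=list)

    def register(self, source: str, collector: Collector) -> None:
        self.collectors[source] = collector

    def collect_all(self) -> int:
        """Poll each collector once, tag records with their source, return the count added."""
        added = 0
        for source, collect in self.collectors.items():
            for record in collect():
                self.events.append({"source_tool": source, "raw": record})
                added += 1
        return added


# Stand-ins for real integrations; the payloads are invented for illustration.
def linux_collector() -> List[dict]:
    return [{"host": "web-01", "state": 2, "output": "disk / at 97%"}]


def windows_collector() -> List[dict]:
    return [{"Computer": "WIN-DB-01", "Level": "Error", "Message": "Service stopped"}]


store = CentralStore()
store.register("linux-monitor", linux_collector)
store.register("windows-monitor", windows_collector)
store.collect_all()
```

Note that the records arrive in one place but still carry each tool's own field names and values, which is why collection alone is not enough.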

Step 2: Translate the Data
The second step is one that is easy to overlook. In addition to aggregating data from many monitoring tools and exposing it in a central location, incident management teams also need to translate all of the data into a consistent format.

Data translation is the only way to ensure that every engineer is able to interpret and react to alerts from any source. Without translation, engineers would need special expertise in a particular type of monitoring system, or knowledge of a certain vendor's schema, in order to understand data that originated from that system. Making all of the data available in a central location would, therefore, be of little help in breaking down silos, because there would still be tall barriers separating different monitoring contexts.

Consider, for example, the different ways in which Zabbix and Nagios use the term "alias." In Zabbix, an alias basically serves as shorthand for a configuration term. In Nagios, in contrast, an alias is a descriptive name given to a host, so its meaning is more specific. If you don't understand this difference and you see data from both Zabbix and Nagios aggregated in a centralized dashboard, things can easily get confusing.
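One way to picture the fix for the "alias" problem is a per-source field map that renames ambiguous fields before they reach the dashboard. The sketch below is only an assumption about how that could look: the FIELD_MAP table, the target names host_display_name and config_shorthand, and the sample records are all made up for illustration.

```python
# Hypothetical mapping: (source tool, source field) -> unambiguous central field.
FIELD_MAP = {
    ("nagios", "alias"): "host_display_name",
    ("zabbix", "alias"): "config_shorthand",
}


def rename_fields(source: str, record: dict) -> dict:
    """Return a copy of record with source-specific field names made unambiguous."""
    return {FIELD_MAP.get((source, key), key): value for key, value in record.items()}


# Invented sample records; only the shared "alias" key matters here.
nagios_record = {"host_name": "web-01", "alias": "Primary web server"}
zabbix_record = {"host": "web-01", "alias": "disk.free"}

print(rename_fields("nagios", nagios_record))
print(rename_fields("zabbix", zabbix_record))
```

After renaming, the two values no longer share a key, so a responder reading the central view does not have to know which tool a record came from to interpret it.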

For effective incident management, then, you need a solution that can translate vendor- and platform-specific terminology into a single, consistent language. Only with event normalization, such as that enabled by the PagerDuty Common Event Format, can responders easily and accurately interpret data from multiple sources.
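To make event normalization concrete, here is a minimal Python sketch that translates records from two tools into one shared shape. It is not PagerDuty's implementation: the CommonEvent fields are only loosely modeled on the idea of a common event format, the raw record layouts are invented, and the way each tool's numeric codes are mapped onto four shared severity levels is an assumption made for this example.

```python
from dataclasses import dataclass

# Numeric codes as the two tools commonly report them; how they map onto the
# four shared levels is an assumption chosen for this sketch.
NAGIOS_STATE = {0: "info", 1: "warning", 2: "critical", 3: "error"}
ZABBIX_SEVERITY = {0: "info", 1: "info", 2: "warning", 3: "error", 4: "critical", 5: "critical"}


@dataclass(frozen=True)
class CommonEvent:
    summary: str
    source: str
    severity: str  # one of: info, warning, error, critical


def from_nagios(raw: dict) -> CommonEvent:
    return CommonEvent(raw["output"], raw["host_name"], NAGIOS_STATE[raw["state"]])


def from_zabbix(raw: dict) -> CommonEvent:
    return CommonEvent(raw["trigger_name"], raw["host"], ZABBIX_SEVERITY[raw["severity"]])


NORMALIZERS = {"nagios": from_nagios, "zabbix": from_zabbix}


def normalize(tool: str, raw: dict) -> CommonEvent:
    """Translate one vendor-specific record into the shared format."""
    if tool not in NORMALIZERS:
        raise ValueError(f"no normalizer registered for {tool!r}")
    return NORMALIZERS[tool](raw)


# Hypothetical raw records from each tool.
a = normalize("nagios", {"host_name": "web-01", "state": 2, "output": "DISK CRITICAL"})
b = normalize("zabbix", {"host": "db-01", "severity": 5, "trigger_name": "Database down"})
```

Both events end up with the same severity vocabulary, so an on-call engineer can compare them without knowing that one tool counts to 3 and the other to 5.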

The complexity of modern infrastructure makes it difficult to avoid silos. Yet, that does not mean that monitoring information has to live within those silos, as information is only useful when it can be understood and acted upon. By aggregating monitoring information from diverse sources and translating it into a language that anyone on the on-call team can understand, incident management teams can break down the silos that exist within their infrastructure. They will then enjoy seamless communication and agile, real-time response to incidents.



The post Break Down the Silos: Correlate Data Between Vendors appeared first on PagerDuty.
