Welcome!

@DevOpsSummit Authors: Elizabeth White, Liz McMillan, Pat Romanski, Mehdi Daoudi, Angsuman Dutta

Related Topics: @DevOpsSummit, Linux Containers, Containers Expo Blog

@DevOpsSummit: Blog Feed Post

A Coordinated Response Culture for Incident Management | @DevOpsSummit #APM #DevOps

Organizational culture that prioritizes coordinated response to incidents is vital for monitoring & managing IT infrastructure

A Coordinated Response Culture for Incident Management
By Christopher Tozzi

An organizational culture that prioritizes coordinated response to incidents is vital for monitoring and managing an IT infrastructure. Incident management won’t go smoothly if teams don’t want to or know how to coordinate their response to alerts.

What is a Coordinated Response?
To break it down simply, a coordinated response in the IT world generally involves notifying the right people and mobilizing teams immediately, providing access to contextual information for seamless alignment, and getting the team onto the right conference bridge or communication channel of choice. A well coordinated response enables organizations to jump in and resolve incidents quickly and efficiently.

Let’s take a look at some of the challenges to response coordination, then examine strategies and tools that allow organizations to overcome these challenges and optimize incident management.

Coordination Challenges
A coordinated response culture rarely breeds itself within an organization. By default, there are obstacles in place that make coordinated response difficult. The biggest challenges include:

  • RECRUITMENT: The difficulty of recruiting additional people to help manage an incident.This challenge arises when you need to bring others in to help put out a fire, but the additional people you need didn’t receive the original alert. Unless you have a incident management platform like PagerDuty, there is no efficient way to request help from others when an incident occurs. You could email or call, of course, and hope the people respond quickly, but emails and calls are not always the fastest way to get in touch or the best way to get someone’s attention quickly, especially outside of normal business hours
  • COMMUNICATION: Too many communication channels give too many options. Multiple communication channels are available for incident management, from email to video chats to Slack. Depending on the type of incident at hand, one channel may make more sense than another. What you don’t want to do is waste time in the midst of an incident figuring out which channel to use and making sure all your team members are on it.
  • TOOLS: Most organizations have some mode of coordination and collaboration. To create an effective coordination culture, it’s best to work with the tools already on hand, and integrating them with a central incident management platform.

Increasing Coordination
Fortunately, these challenges can be solved easily enough by taking advantage of the features very recently released in PagerDuty. With Response Mobilizer and Response Bridge organizations can:

  • RECRUIT: Recruit additional teammates to help solve a particular incident. The best way to do this is to build the recruitment of additional help into your incident management workflow. That’s better than relying on manual, ad-hoc ways of asking for help in the midst of an incident, and you’ll know you have the experts you need there to help you.
  • COMMUNICATE: Integrate existing and most preferred communication tools with your incident management workflow; whether it be email, SMS, Slack, existing conference bridge from WEBEX, GoToMeeting or Skype. Trying to impose a new communication tool on the team will often disrupt the established workflow and incur resistance from staff. The better approach is to integrate your existing communication tools into your incident management solution.
  • HAVE CONTEXT: Include right contextual information to address business-critical issues in real-time. Providing rich contextual information about the incident and including a brief message to responders detailing why they are needed enables responders to prepare and immediately be aligned.

If you’re seeking to build a better-coordinated response culture for incident management within your organization, take advantage of the new features to make the most of the collaboration and coordination resources available to you, without forcing your team to overhaul its communication workflow.

The post A Coordinated Response Culture for Incident Management appeared first on PagerDuty.

Read the original blog entry...

More Stories By PagerDuty Blog

PagerDuty’s operations performance platform helps companies increase reliability. By connecting people, systems and data in a single view, PagerDuty delivers visibility and actionable intelligence across global operations for effective incident resolution management. PagerDuty has over 100 platform partners, and is trusted by Fortune 500 companies and startups alike, including Microsoft, National Instruments, Electronic Arts, Adobe, Rackspace, Etsy, Square and Github.

@DevOpsSummit Stories
Enterprise architects are increasingly adopting multi-cloud strategies as they seek to utilize existing data center assets, leverage the advantages of cloud computing and avoid cloud vendor lock-in. This requires a globally aware traffic management strategy that can monitor infrastructure health across data centers and end-user experience globally, while responding to control changes and system specification at the speed of today’s DevOps teams. In his session at 20th Cloud Expo, Josh Gray, Chief Architect at Cedexis, covered strategies for orchestrating global traffic achieving the highest-quality end-user experience while spanning multiple clouds and data centers and reacting at the velocity of modern development teams.
In IT, we sometimes coin terms for things before we know exactly what they are and how they’ll be used. The resulting terms may capture a common set of aspirations and goals – as “cloud” did broadly for on-demand, self-service, and flexible computing. But such a term can also lump together diverse and even competing practices, technologies, and priorities to the point where important distinctions are glossed over and lost.
"At the keynote this morning we spoke about the value proposition of Nutanix, of having a DevOps culture and a mindset, and the business outcomes of achieving agility and scale, which everybody here is trying to accomplish," noted Mark Lavi, DevOps Solution Architect at Nutanix, in this SYS-CON.tv interview at @DevOpsSummit at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
In his session at 20th Cloud Expo, Mike Johnston, an infrastructure engineer at Supergiant.io, discussed how to use Kubernetes to set up a SaaS infrastructure for your business. Mike Johnston is an infrastructure engineer at Supergiant.io with over 12 years of experience designing, deploying, and maintaining server and workstation infrastructure at all scales. He has experience with brick and mortar data centers as well as cloud providers like Digital Ocean, Amazon Web Services, and Rackspace. His expertise is in automating deployment, management, and problem resolution in these environments, allowing his teams to run large transactional applications with high availability and the speed the consumer demands.
All organizations that did not originate this moment have a pre-existing culture as well as legacy technology and processes that can be more or less amenable to DevOps implementation. That organizational culture is influenced by the personalities and management styles of Executive Management, the wider culture in which the organization is situated, and the personalities of key team members at all levels of the organization. This culture and entrenched interests usually throw a wrench in the works because of misaligned incentives.
SYS-CON Events announced today that Grape Up will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct. 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Grape Up is a software company specializing in cloud native application development and professional services related to Cloud Foundry PaaS. With five expert teams that operate in various sectors of the market across the U.S. and Europe, Grape Up works with a variety of customers from emerging startups to Fortune 1000 companies.
DevOps is under attack because developers don’t want to mess with infrastructure. They will happily own their code into production, but want to use platforms instead of raw automation. That’s changing the landscape that we understand as DevOps with both architecture concepts (CloudNative) and process redefinition (SRE). Rob Hirschfeld’s recent work in Kubernetes operations has led to the conclusion that containers and related platforms have changed the way we should be thinking about DevOps and controlling infrastructure. The rise of Site Reliability Engineering (SRE) is part of that redefinition of operations vs development roles in organizations.
SYS-CON Events announced today that Massive Networks will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Massive Networks mission is simple. To help your business operate seamlessly with fast, reliable, and secure internet and network solutions. Improve your customer's experience with outstanding connections to your cloud.
When you focus on a journey from up-close, you look at your own technical and cultural history and how you changed it for the benefit of the customer. This was our starting point: too many integration issues, 13 SWP days and very long cycles. It was evident that in this fast-paced industry we could no longer afford this reality. We needed something that would take us beyond reducing the development lifecycles, CI and Agile methodologies. We made a fundamental difference, even changed our culture.
As many know, the first generation of Cloud Management Platform (CMP) solutions were designed for managing virtual infrastructure (IaaS) and traditional applications. But that’s no longer enough to satisfy evolving and complex business requirements. In his session at 21st Cloud Expo, Scott Davis, Embotics CTO, will explore how next-generation CMPs ensure organizations can manage cloud-native and microservice-based application architectures, while also facilitating agile DevOps methodology. He will explain how automation, orchestration and governance are fundamental to managing today’s hybrid cloud environments and are critical for digital businesses to deliver services faster, with better user experience and higher quality, all while saving money.
Most companies are adopting or evaluating container technology - Docker in particular - to speed up application deployment, drive down cost, ease management and make application delivery more flexible overall. As with most new architectures, this dream takes a lot of work to become a reality. Even when you do get your application componentized enough and packaged properly, there are still challenges for DevOps teams to making the shift to continuous delivery and achieving that reduction in cost and increase in speed.
SYS-CON Events announced today that Datera, that offers a radically new data management architecture, has been named "Exhibitor" of SYS-CON's 21st International Cloud Expo ®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Datera is transforming the traditional datacenter model through modern cloud simplicity. The technology industry is at another major inflection point. The rise of mobile, the Internet of Things, data storage and Big Data are challenging the existing design patterns of the Data Center. The increase in complexity of managing legacy systems alongside new systems is beyond the ability of most IT departments. This leads to multiple tiers of storage and high economic costs, during a time in which IT is expected to do more with less.
“Why didn’t testing catch this” must become “How did this make it to testing?” Traditional quality teams are the crutch and excuse keeping organizations from making the necessary investment in people, process, and technology to accelerate test automation. Just like societies that did not build waterways because the labor to keep carrying the water was so cheap, we have created disincentives to automate. In her session at @DevOpsSummit at 20th Cloud Expo, Anne Hungate, President of Daring Systems, discussed how to break the cycle of manual testing as de-facto – empowering development and quality to lead with automation. Learn about the changes in reporting, behaviors, and communication necessary to make automation the first choice.
SYS-CON Events announced today that GrapeUp, the leading provider of rapid product development at the speed of business, will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place October 31-November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Grape Up is a software company, specialized in cloud native application development and professional services related to Cloud Foundry PaaS. With five expert teams that operate in various sectors of the market across the USA and Europe, we work with a variety of customers from emerging startups to Fortune 1000 companies.
SYS-CON Events announced today that CA Technologies has been named "Platinum Sponsor" of SYS-CON's 21st International Cloud Expo®, which will take place October 31-November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. CA Technologies helps customers succeed in a future where every business - from apparel to energy - is being rewritten by software. From planning to development to management to security, CA creates software that fuels transformation for companies in the application economy. With CA software at the center of their IT strategy, organizations can leverage the technology that changes the way we live - from the data center to the mobile device. CA's software and solutions help customers thrive in the new application economy by delivering the means to deploy, monitor and secure their applications and infrastructure.
Given the popularity of the containers, further investment in the telco/cable industry is needed to transition existing VM-based solutions to containerized cloud native deployments. The networking architecture of the solution isolates the network traffic into different network planes (e.g., management, control, and media). This naturally makes support for multiple interfaces in container orchestration engines an indispensable requirement.
Internet of @ThingsExpo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 21st Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The Internet of Things (IoT) is the most profound change in personal and enterprise IT since the creation of the Worldwide Web more than 20 years ago. All major researchers estimate there will be tens of billions devices - computers, smartphones, tablets, and sensors - connected to the Internet by 2020. This number will continue to grow at a rapid pace for the next several decades. With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo in Silicon Valley. Learn what is going on, contribute to the discussions, and ensure that your enterprise...
In his opening keynote at 20th Cloud Expo, Michael Maximilien, Research Scientist, Architect, and Engineer at IBM, discussed the full potential of the cloud and social data requires artificial intelligence. By mixing Cloud Foundry and the rich set of Watson services, IBM's Bluemix is the best cloud operating system for enterprises today, providing rapid development and deployment of applications that can take advantage of the rich catalog of Watson services to help drive insights from the vast trove of private and public data available to enterprises.
SYS-CON Events announced today that Elastifile will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Elastifile Cloud File System (ECFS) is software-defined data infrastructure designed for seamless and efficient management of dynamic workloads across heterogeneous environments. Elastifile provides the architecture needed to optimize your hybrid cloud environment, by facilitating efficient data access across cloud and on-premises boundaries - with all the advantages of public IaaS everywhere.
As DevOps methodologies expand their reach across the enterprise, organizations face the daunting challenge of adapting related cloud strategies to ensure optimal alignment, from managing complexity to ensuring proper governance. How can culture, automation, legacy apps and even budget be reexamined to enable this ongoing shift within the modern software factory?
While some vendors scramble to create and sell you a fancy solution for monitoring your spanking new Amazon Lambdas, hear how you can do it on the cheap using just built-in Java APIs yourself. By exploiting a little-known fact that Lambdas aren’t exactly single-threaded, you can effectively identify hot spots in your serverless code. In his session at @DevOpsSummit at 21st Cloud Expo, Dave Martin, Product owner at CA Technologies, will give a live demonstration and code walkthrough, showing how to overcome the challenges of monitoring S3 and RDS. This presentation will provide an overview of necessary Amazon Lambda concepts and discus how to integrate the monitoring data with other tools.
@DevOpsSummit at Cloud Expo taking place Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center, Santa Clara, CA, is co-located with the 21st International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce software that is obsolete at launch. DevOps may be disruptive, but it is essential.
SYS-CON Events announced today that Golden Gate University will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Since 1901, non-profit Golden Gate University (GGU) has been helping adults achieve their professional goals by providing high quality, practice-based undergraduate and graduate educational programs in law, taxation, business and related professions. Many of its courses are taught by faculty actively working in their field of expertise, providing students with skills that can be applied immediately. The new MS in Business Analytics, like most of its programs, is available fully online or in-person in downtown SF.
DevOps at Cloud Expo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 21st Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce software that is obsolete at launch. DevOps may be disruptive, but it is essential.
SYS-CON Events announced today that DXWorldExpo has been named “Global Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Digital Transformation is the key issue driving the global enterprise IT business. Digital Transformation is most prominent among Global 2000 enterprises and government institutions.