Welcome!

@DevOpsSummit Authors: Yeshim Deniz, Elizabeth White, Pat Romanski, Liz McMillan, Aruna Ravichandran

Related Topics: @DevOpsSummit, Linux Containers, Containers Expo Blog

@DevOpsSummit: Blog Feed Post

A Coordinated Response Culture for Incident Management | @DevOpsSummit #APM #DevOps

Organizational culture that prioritizes coordinated response to incidents is vital for monitoring & managing IT infrastructure

A Coordinated Response Culture for Incident Management
By Christopher Tozzi

An organizational culture that prioritizes coordinated response to incidents is vital for monitoring and managing an IT infrastructure. Incident management won’t go smoothly if teams don’t want to or know how to coordinate their response to alerts.

What is a Coordinated Response?
To break it down simply, a coordinated response in the IT world generally involves notifying the right people and mobilizing teams immediately, providing access to contextual information for seamless alignment, and getting the team onto the right conference bridge or communication channel of choice. A well coordinated response enables organizations to jump in and resolve incidents quickly and efficiently.

Let’s take a look at some of the challenges to response coordination, then examine strategies and tools that allow organizations to overcome these challenges and optimize incident management.

Coordination Challenges
A coordinated response culture rarely breeds itself within an organization. By default, there are obstacles in place that make coordinated response difficult. The biggest challenges include:

  • RECRUITMENT: The difficulty of recruiting additional people to help manage an incident.This challenge arises when you need to bring others in to help put out a fire, but the additional people you need didn’t receive the original alert. Unless you have a incident management platform like PagerDuty, there is no efficient way to request help from others when an incident occurs. You could email or call, of course, and hope the people respond quickly, but emails and calls are not always the fastest way to get in touch or the best way to get someone’s attention quickly, especially outside of normal business hours
  • COMMUNICATION: Too many communication channels give too many options. Multiple communication channels are available for incident management, from email to video chats to Slack. Depending on the type of incident at hand, one channel may make more sense than another. What you don’t want to do is waste time in the midst of an incident figuring out which channel to use and making sure all your team members are on it.
  • TOOLS: Most organizations have some mode of coordination and collaboration. To create an effective coordination culture, it’s best to work with the tools already on hand, and integrating them with a central incident management platform.

Increasing Coordination
Fortunately, these challenges can be solved easily enough by taking advantage of the features very recently released in PagerDuty. With Response Mobilizer and Response Bridge organizations can:

  • RECRUIT: Recruit additional teammates to help solve a particular incident. The best way to do this is to build the recruitment of additional help into your incident management workflow. That’s better than relying on manual, ad-hoc ways of asking for help in the midst of an incident, and you’ll know you have the experts you need there to help you.
  • COMMUNICATE: Integrate existing and most preferred communication tools with your incident management workflow; whether it be email, SMS, Slack, existing conference bridge from WEBEX, GoToMeeting or Skype. Trying to impose a new communication tool on the team will often disrupt the established workflow and incur resistance from staff. The better approach is to integrate your existing communication tools into your incident management solution.
  • HAVE CONTEXT: Include right contextual information to address business-critical issues in real-time. Providing rich contextual information about the incident and including a brief message to responders detailing why they are needed enables responders to prepare and immediately be aligned.

If you’re seeking to build a better-coordinated response culture for incident management within your organization, take advantage of the new features to make the most of the collaboration and coordination resources available to you, without forcing your team to overhaul its communication workflow.

The post A Coordinated Response Culture for Incident Management appeared first on PagerDuty.

Read the original blog entry...

More Stories By PagerDuty Blog

PagerDuty’s operations performance platform helps companies increase reliability. By connecting people, systems and data in a single view, PagerDuty delivers visibility and actionable intelligence across global operations for effective incident resolution management. PagerDuty has over 100 platform partners, and is trusted by Fortune 500 companies and startups alike, including Microsoft, National Instruments, Electronic Arts, Adobe, Rackspace, Etsy, Square and Github.

@DevOpsSummit Stories
We all know that end users experience the Internet primarily with mobile devices. From an app development perspective, we know that successfully responding to the needs of mobile customers depends on rapid DevOps – failing fast, in short, until the right solution evolves in your customers' relationship to your business. Whether you’re decomposing an SOA monolith, or developing a new application cloud natively, it’s not a question of using microservices – not doing so will be a path to eventual business failure.
With major technology companies and startups seriously embracing Cloud strategies, now is the perfect time to attend 21st Cloud Expo October 31 - November 2, 2017, at the Santa Clara Convention Center, CA, and June 12-14, 2018, at the Javits Center in New York City, NY, and learn what is going on, contribute to the discussions, and ensure that your enterprise is on the right path to Digital Transformation.
Digital transformation is changing the face of business. The IDC predicts that enterprises will commit to a massive new scale of digital transformation, to stake out leadership positions in the "digital transformation economy." Accordingly, attendees at the upcoming Cloud Expo | @ThingsExpo at the Santa Clara Convention Center in Santa Clara, CA, Oct 31-Nov 2, will find fresh new content in a new track called Enterprise Cloud & Digital Transformation.
SYS-CON Events announced today that mruby Forum will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. mruby is the lightweight implementation of the Ruby language. We introduce mruby and the mruby IoT framework that enhances development productivity. For more information, visit http://forum.mruby.org/.
SYS-CON Events announced today that NetApp has been named “Bronze Sponsor” of SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. NetApp is the data authority for hybrid cloud. NetApp provides a full range of hybrid cloud data services that simplify management of applications and data across cloud and on-premises environments to accelerate digital transformation. Together with their partners, NetApp empowers global organizations to unleash the full potential of their data to expand customer touchpoints, foster greater innovation and optimize their operations.
SYS-CON Events announced today that Avere Systems, a leading provider of enterprise storage for the hybrid cloud, will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Avere delivers a more modern architectural approach to storage that doesn't require the overprovisioning of storage capacity to achieve performance, overspending on expensive storage media for inactive data or the overbuilding of data centers to house increasing amounts of storage infrastructure.
The dynamic nature of the cloud means that change is a constant when it comes to modern cloud-based infrastructure. Delivering modern applications to end users, therefore, is a constantly shifting challenge. Delivery automation helps IT Ops teams ensure that apps are providing an optimal end user experience over hybrid-cloud and multi-cloud environments, no matter what the current state of the infrastructure is. To employ a delivery automation strategy that reflects your business rules, making real-time decisions based on a combination of real user monitoring, synthetic testing, APM, NGINX / local load balancers, and other data sources, is critical.
Most technology leaders, contemporary and from the hardware era, are reshaping their businesses to do software. They hope to capture value from emerging technologies such as IoT, SDN, and AI. Ultimately, irrespective of the vertical, it is about deriving value from independent software applications participating in an ecosystem as one comprehensive solution. In his session at @ThingsExpo, Kausik Sridhar, founder and CTO of Pulzze Systems, will discuss how given the magnitude of today's application ecosystem, tweaking existing software to stitch various components together leads to sub-optimal solutions. This definitely deserves a re-think, and paves the way for a new breed of lightweight application servers that are micro-services and DevOps ready!
In a recent survey, Sumo Logic surveyed 1,500 customers who employ cloud services such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). According to the survey, a quarter of the respondents have already deployed Docker containers and nearly as many (23 percent) are employing the AWS Lambda serverless computing framework. It’s clear: serverless is here to stay. The adoption does come with some needed changes, within both application development and operations. That means serverless is also changing the way we leverage public clouds. Truth-be-told, many enterprise IT shops were so happy to get out of the management of physical servers within a data center that many limitations of the existing public IaaS clouds were forgiven. However, now that we’ve lived a few years with public IaaS clouds, developers and CloudOps pros are giving a huge thumbs down to the ...
In his Opening Keynote at 21st Cloud Expo, John Considine, General Manager of IBM Cloud Infrastructure, will lead you through the exciting evolution of the cloud. He'll look at this major disruption from the perspective of technology, business models, and what this means for enterprises of all sizes. John Considine is General Manager of Cloud Infrastructure Services at IBM. In that role he is responsible for leading IBM’s public cloud infrastructure including strategy, development, and offering management. To date, IBM has launched more than 50 cloud data centers that span the globe. He has been building advanced technology, delivering “as a service” solutions, and managing infrastructure services for the past 20 years.
Enterprises are adopting Kubernetes to accelerate the development and the delivery of cloud-native applications. However, sharing a Kubernetes cluster between members of the same team can be challenging. And, sharing clusters across multiple teams is even harder. Kubernetes offers several constructs to help implement segmentation and isolation. However, these primitives can be complex to understand and apply. As a result, it’s becoming common for enterprises to end up with several clusters. This leads to a waste of cloud resources and increased operational overhead.
SYS-CON Events announced today that Taica will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. TAZMO technology and development capabilities in the semiconductor and LCD-related manufacturing fields are among the best worldwide. For more information, visit https://www.tazmo.co.jp/en/.
SYS-CON Events announced today that Avere Systems, a leading provider of hybrid cloud enablement solutions, will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Avere Systems was created by file systems experts determined to reinvent storage by changing the way enterprises thought about and bought storage resources. With decades of experience behind the company’s founders, Avere got its start in 2008 with a mission to use fast, flash-based storage in the most efficient, effective manner possible. What the team had discovered was a technology that optimized storage resources and reduced dependencies on sprawling storage installations. Launched as the Avere OS, this advanced file system not only boosted performance within standard, on-premises, network-attached storage systems but ...
SYS-CON Events announced today that TidalScale will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. TidalScale is the leading provider of Software-Defined Servers that bring flexibility to modern data centers by right-sizing servers on the fly to fit any data set or workload. TidalScale’s award-winning inverse hypervisor technology combines multiple commodity servers (including their associated CPUs, memory storage and network) into one or more large servers capable of handling the biggest Big Data problems and most unpredictable workloads.
Microsoft Azure Container Services can be used for container deployment in a variety of ways including support for Orchestrators like Kubernetes, Docker Swarm and Mesos. However, the abstraction for app development that support application self-healing, scaling and so on may not be at the right level. Helm and Draft makes this a lot easier. In this primarily demo-driven session at @DevOpsSummit at 21st Cloud Expo, Raghavan "Rags" Srinivas, a Cloud Solutions Architect/Evangelist at Microsoft, will cover Docker Swarm and Kubernetes deployments on Azure with some simple examples. He will look at Helm and Draft and how they can simplify app development significantly, like app scaling, rollback, etc. Helm is a tool that streamlines installing and managing Kubernetes applications, like the apt/yum/homebrew for Kubernetes. Draft works with pre-provided charts to deploy the apps via Helm.
The next XaaS is CICDaaS. Why? Because CICD saves developers a huge amount of time. CD is an especially great option for projects that require multiple and frequent contributions to be integrated. But… securing CICD best practices is an emerging, essential, yet little understood practice for DevOps teams and their Cloud Service Providers. The only way to get CICD to work in a highly secure environment takes collaboration, patience and persistence. Building CICD in the cloud requires rigorous architectural and coordination work to minimize the volatility of the cloud environment and leverage the security features of the cloud to the benefit of the CICD pipeline.
Containers are rapidly finding their way into enterprise data centers, but change is difficult. How do enterprises transform their architecture with technologies like containers without losing the reliable components of their current solutions? In his session at @DevOpsSummit at 21st Cloud Expo, Tony Campbell, Director, Educational Services at CoreOS, will explore the challenges organizations are facing today as they move to containers and go over how Kubernetes applications can deploy with legacy components, and also go over automated capabilities provided by operators to auto-update Kubernetes with zero downtime for current and secure deployments.
Today most companies are adopting or evaluating container technology - Docker in particular - to speed up application deployment, drive down cost, ease management and make application delivery more flexible overall. As with most new architectures, this dream takes significant work to become a reality. Even when you do get your application componentized enough and packaged properly, there are still challenges for DevOps teams to making the shift to continuous delivery and achieving that reduction in cost and increase in speed. Sometimes in order to reduce complexity teams compromise features or change requirements
SYS-CON Events announced today that Ryobi Systems will exhibit at the Japan External Trade Organization (JETRO) Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Ryobi Systems Co., Ltd., as an information service company, specialized in business support for local governments and medical industry. We are challenging to achive the precision farming with AI. For more information, visit http://www.ryobi-sol.co.jp/en/.
As you move to the cloud, your network should be efficient, secure, and easy to manage. An enterprise adopting a hybrid or public cloud needs systems and tools that provide: Agility: ability to deliver applications and services faster, even in complex hybrid environments Easier manageability: enable reliable connectivity with complete oversight as the data center network evolves Greater efficiency: eliminate wasted effort while reducing errors and optimize asset utilization Security: implement always-vigilant DNS security
High-velocity engineering teams are applying not only continuous delivery processes, but also lessons in experimentation from established leaders like Amazon, Netflix, and Facebook. These companies have made experimentation a foundation for their release processes, allowing them to try out major feature releases and redesigns within smaller groups before making them broadly available. In his session at 21st Cloud Expo, Brian Lucas, Senior Staff Engineer at Optimizely, will discuss how by using new techniques such as feature flagging, rollouts, and traffic splitting, experimentation is no longer just the future for marketing teams, it’s quickly becoming an essential practice for high-performing development teams as well.
DevOps at Cloud Expo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 21st Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce software that is obsolete at launch. DevOps may be disruptive, but it is essential.
SYS-CON Events announced today that Daiya Industry will exhibit at the Japanese Pavilion at SYS-CON's 21st International Cloud Expo®, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Ruby Development Inc. builds new services in short period of time and provides a continuous support of those services based on Ruby on Rails. For more information, please visit https://github.com/RubyDevInc.
When it comes to cloud computing, the ability to turn massive amounts of compute cores on and off on demand sounds attractive to IT staff, who need to manage peaks and valleys in user activity. With cloud bursting, the majority of the data can stay on premises while tapping into compute from public cloud providers, reducing risk and minimizing need to move large files. In his session at 18th Cloud Expo, Scott Jeschonek, Director of Product Management at Avere Systems, discussed the IT and business benefits that cloud bursting provides, including increased compute capacity, lower IT investment, financial agility, and, ultimately, faster time-to-market.
Is advanced scheduling in Kubernetes achievable? Yes, however, how do you properly accommodate every real-life scenario that a Kubernetes user might encounter? How do you leverage advanced scheduling techniques to shape and describe each scenario in easy-to-use rules and configurations? In his session at @DevOpsSummit at 21st Cloud Expo, Oleg Chunikhin, CTO at Kublr, will answer these questions and demonstrate techniques for implementing advanced scheduling. For example, using spot instances and cost-effective resources on AWS, coupled with the ability to deliver a minimum set of functionalities that cover the majority of needs – without configuration complexity.