The Dos and Don’ts of SLA Management

By Craig Lowell

The past few years have seen a huge increase in the number of critical IT services that companies outsource to SaaS/IaaS/PaaS providers, be it security, storage, monitoring, or operations. Along with any outsourcing to a service provider comes a Service Level Agreement (SLA), which holds the vendor financially responsible for any lapses in service that affect the customer’s end users and, ultimately, the customer’s bottom line.

SLAs can be very tricky to manage for a number of reasons. Discrepancies over the time period being addressed, the source of the performance metrics, and the accuracy of the data can all lead to legal disputes between vendor and customer. However, there are several things that both sides can do to get accurate and verifiable performance data for their SLAs.

The first and most critical step is to define the parameters under which the data will be collected and used: the method of data collection (often an agreed-upon neutral third party), and the times and locations from which performance will be measured. The collection method matters most. If the vendor and the customer use different monitoring tools to measure the Service Level Indicators (SLIs), they will inevitably disagree about the validity of the data and about whether the Service Level Objective (SLO) was met.
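To make the relationship between SLI and SLO concrete, here is a minimal Python sketch. It assumes an availability-style SLI (the share of successful checks) computed only from measurements that fall inside the agreed time window and agreed locations. The `Measurement` type, the function names, the location names, and the 99.9% target are all hypothetical and are not taken from any particular monitoring tool or contract.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Measurement:
    """One check result from the agreed monitoring source (hypothetical shape)."""
    timestamp: datetime
    location: str
    success: bool


def availability_sli(measurements, window_start, window_end, locations):
    """Fraction of successful checks inside the agreed window and locations."""
    in_scope = [
        m for m in measurements
        if window_start <= m.timestamp < window_end and m.location in locations
    ]
    if not in_scope:
        raise ValueError("no measurements in the agreed window and locations")
    return sum(m.success for m in in_scope) / len(in_scope)


def slo_met(sli, slo_target):
    """True when the measured SLI reaches the agreed objective."""
    return sli >= slo_target


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Illustrative data only: the window, locations, and target are assumptions.
window_start, window_end = utc(2024, 1, 1), utc(2024, 2, 1)
agreed_locations = {"new_york", "london"}
measurements = [
    Measurement(utc(2024, 1, 5), "new_york", True),
    Measurement(utc(2024, 1, 6), "london", True),
    Measurement(utc(2024, 1, 7), "london", False),
    Measurement(utc(2024, 1, 8), "new_york", True),
    Measurement(utc(2024, 1, 9), "tokyo", False),   # location not in the agreement
    Measurement(utc(2024, 2, 3), "london", False),  # outside the agreed window
]

sli = availability_sli(measurements, window_start, window_end, agreed_locations)
print(f"SLI = {sli:.2%}, SLO met: {slo_met(sli, 0.999)}")
```

Because both parties run the same calculation on the same agreed data source, the only thing left to negotiate is the contract itself, not whose numbers are right.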

Selecting that third-party monitoring vendor depends a great deal on the number of users being served and where they are located. For a company such as Flashtalking, an ad serving, measuring, and technology company delivering ad impressions throughout the US, Europe, and other international markets, a monitoring tool that can accurately measure performance and user experience in many different areas around the world is critical to its SLA management efforts.

Flashtalking agrees upon the external monitoring tool with every one of their clients as part of their SLAs, using Catchpoint as the unbiased third party due to the number of monitoring locations and the accuracy of the data. Their customers obviously want the most accurate view of the customer experience and the impressions garnered, so monitoring from as close to the end user as possible is the best way to achieve that. In that sense, the more locations from which to test the product, the more accurate the data from an end user’s perspective.

Those measurement locations should include backbone and last mile, as well as any cloud provider from which the ads are being served. This diversity of locations ensures that they will still have visibility and reporting capabilities should the cloud provider itself experience an outage; the backbone tests eliminate noise and are therefore the cleanest for validating the SLO, and the last mile tests best replicate the end user experience.
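One way to keep the different vantage types separate when reporting is sketched below in Python. It assumes response-time samples tagged with a vantage type and summarizes each type with a median, so that backbone results can be used for SLO validation while last mile results are read as a proxy for end user experience. The tag names and the sample values are made up for illustration.

```python
from collections import defaultdict
from statistics import median


def median_by_vantage(samples):
    """Group (vantage_type, response_ms) pairs and return the median per type."""
    grouped = defaultdict(list)
    for vantage_type, response_ms in samples:
        grouped[vantage_type].append(response_ms)
    return {vantage: median(values) for vantage, values in grouped.items()}


# Illustrative samples only; the tags and timings are assumptions.
samples = [
    ("backbone", 120), ("backbone", 130), ("backbone", 125),
    ("last_mile", 300), ("last_mile", 180), ("last_mile", 950),
    ("cloud", 90),
]

medians = median_by_vantage(samples)
for vantage, value in sorted(medians.items()):
    print(f"{vantage}: median {value} ms")
```

The median is used here because a single slow last mile sample (950 ms above) would otherwise dominate an average; which statistic actually counts toward the SLO is something the agreement itself has to specify.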

Once the SLA and its parameters are agreed upon by both sides, each one of Flashtalking’s products is then set up with a single test that captures the performance of their clients’ ads through every stage of the IT architecture, whether it’s a single site, single server, or encompasses multiple databases/networks/etc.

Of course, establishing criteria and setting up the tests is only part of the SLA management battle. For a cloud provider to stay on top of its SLAs, it must also be able to rely on alerting to notify it when it is in danger of a breach, and on accurate, in-depth reporting to help identify the root cause of the issue. In many cases, an ad serving company such as Flashtalking relies on other third parties such as DNS resolvers, cloud providers, and content delivery networks to deliver the ads to the end users, which means that a disruption in service is not necessarily its fault. Still, it must be able to share its performance data with its own vendors in order to resolve the issue as quickly as possible for its own customers. In cases such as these, it must be able to easily separate its first- and third-party architecture components to show when a service disruption is not its fault and hold its own vendors accountable instead.
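The two ideas in this paragraph, alerting before a breach and separating first- from third-party causes, can be sketched together in Python. The sketch assumes a downtime-based SLO over a fixed window, and it assumes all downtime counts against the provider's budget regardless of who caused it, which is a contract-dependent choice. The `Incident` type, the component names, the 99.9% target, and the 50% alert threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    """A service disruption attributed to one component (hypothetical shape)."""
    component: str           # e.g. "dns", "cdn", "ad_server"
    owner: str               # "first_party" or "third_party"
    downtime_minutes: float


def allowed_downtime_minutes(slo_target, window_minutes):
    """Downtime the SLO tolerates over the window (the error budget)."""
    return (1.0 - slo_target) * window_minutes


def downtime_by_owner(incidents):
    """Total downtime split by who operates the failing component."""
    totals = {}
    for incident in incidents:
        totals[incident.owner] = totals.get(incident.owner, 0.0) + incident.downtime_minutes
    return totals


def budget_consumed(incidents, slo_target, window_minutes):
    """Share of the error budget used so far (1.0 means the SLO is breached)."""
    total = sum(incident.downtime_minutes for incident in incidents)
    return total / allowed_downtime_minutes(slo_target, window_minutes)


# Illustrative values only: a 30-day window and a 99.9% availability target.
WINDOW_MINUTES = 30 * 24 * 60
SLO_TARGET = 0.999
ALERT_THRESHOLD = 0.5  # assumed: warn once half the budget is gone

incidents = [
    Incident("dns", "third_party", 20.0),
    Incident("ad_server", "first_party", 10.0),
]

consumed = budget_consumed(incidents, SLO_TARGET, WINDOW_MINUTES)
split = downtime_by_owner(incidents)
if consumed >= ALERT_THRESHOLD:
    print(f"Warning: {consumed:.0%} of the error budget is used; downtime by owner: {split}")
```

The per-owner split is what a provider would take to its own vendors: in this made-up data, two thirds of the downtime came from a third-party DNS component rather than from the provider's own servers.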

To learn more about SLA management and how both customers and vendors can ensure continuous service delivery, check out our SLA handbook.

The post The Dos and Don’ts of SLA Management appeared first on Catchpoint's Blog - Web Performance Monitoring.

