
@DevOpsSummit Authors: Yeshim Deniz, Liz McMillan, Carmen Gonzalez, Samuel Scott, Pat Romanski

Blog Feed Post

Black Friday Horror Story Averted with Alerting and Monitoring


AppDynamics recently announced the launch of our Application Intelligence Platform, which is the underlying infrastructure that delivers our portfolio of products to customers. A key component of the Application Intelligence Platform is extensibility – we can integrate with many of the tools you already have in place, so you can use AppDynamics analytics in the tools and dashboards your team knows and loves with minimal effort. These extensions, as we call them, are available for download on the AppDynamics eXchange section of our Community, and customers can even submit extensions they’ve written themselves to be included in the eXchange.

To illustrate the power of the 75+ extensions we’ve published in our community, I’ll walk you through two scenarios that involve several technologies that are common across our customer base.

____

Before AppDynamics:

Jerry has been tossing and turning all night long. In fact, he’s had difficulty sleeping the past three weeks. His sub-optimal sleep patterns are in large part a result of the production application environment he is responsible for. “Things always seem to break in the middle of the night,” Jerry complained to his wife earlier that day.

As the DevOps lead for his company’s mission-critical ecommerce app, Jerry is copied on most urgent application-related alerts so that he can manually forward the details from his current monitoring tools to whichever admin on his team happens to be on call. Tonight he received only 5 such notifications, which is fewer than usual but still enough to wake him up throughout the night. As he squints in the darkness and his eyes adjust to the bright screen, he sees a new notification that troubles him… “SSL Certificate Expired?” he mumbles to himself. “How is that possible?”

He checks the clock – 5:30AM. The person who handles the SSL Certificate isn’t going to be awake for a few more hours. Jerry’s heart sinks because he knows that every hour his ecommerce application is down costs his company about $10,000 in revenue. “Why wasn’t this on my radar?” Jerry says. “We could’ve planned for this.”

Jerry gets to work early and starts sending emails and calling stakeholders to schedule an 8:30AM conference call. By 9:15AM the action items and deliverables are clear. By 10:30AM the SSL Certificate is renewed and the ecom store is back online servicing customers. Whew. “That could’ve been a lot worse than just 5 hours of downtime and $50K of revenue impact,” Jerry reasons with a colleague.

Back at his desk, Jerry looks at his calendar. His next meeting is ‘testing & capacity planning’, a weekly recurring meeting with his team.

Jerry’s company is preparing for the holiday season (Black Friday, Cyber Monday, etc.), which is still a few months away, but for ecommerce stores these peak seasons are huge operational and business challenges. Remember that $10K-per-hour revenue figure? During the peak days of the holiday season it quadruples to $40K of revenue per hour. The ecommerce store can’t have any hiccups during that time or the impact would be massive, which is why these recurring meetings leading up to the code freeze are so important.
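The downtime figures in Jerry’s story are simple arithmetic, and it can help to see them written out. The sketch below is only an illustration: the function name and constants are hypothetical, and the hourly rates are the ones quoted in the story.

```python
# Hypothetical helper: revenue lost during an outage, using the story's numbers.
BASE_REVENUE_PER_HOUR = 10_000   # normal revenue per hour, from the story
PEAK_MULTIPLIER = 4              # holiday peak is four times the normal rate
PEAK_REVENUE_PER_HOUR = BASE_REVENUE_PER_HOUR * PEAK_MULTIPLIER


def downtime_cost(hours_down: float, revenue_per_hour: float) -> float:
    """Return the revenue lost while the store is offline."""
    return hours_down * revenue_per_hour


# The expired certificate took the store down from 5:30AM to 10:30AM: 5 hours.
print(downtime_cost(5, BASE_REVENUE_PER_HOUR))   # 50000
# The same 5-hour outage on a peak holiday day would cost four times as much.
print(downtime_cost(5, PEAK_REVENUE_PER_HOUR))   # 200000
```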

Jerry greets his team and looks over the shoulder of one of his sys admins. She has drawn the application infrastructure diagram on the whiteboard and finished the first load test, and now the team is analyzing the results. Most of the synthetic tests they’ve run completed with relatively few errors, and utilization stayed within the acceptable range even as the load increased over the duration of the test. So far so good.

Jerry moves on to peek over the shoulder of his DBA, who is analyzing the Cassandra cluster metrics after the load test. Disk I/O looked good and memory looked OK. Over the course of the next hour Jerry’s team runs 6 different load-testing and failover scenarios. Today’s tests are done – until next week.

“Everything looks good… a little too good,” Jerry says to himself. “My team and I understand things like utilization and throughput but how does that translate to things my boss and the rest of the business care about?”

If only there were another approach to monitoring that would save Jerry from the fire drills, cut down on the constant testing and debugging, and give him a real-time view into how customers were engaging with his ecommerce application…

Luckily for Jerry, AppDynamics does just that…  Let’s look at this same situation one year later.

After AppDynamics:

Jerry wakes up from a great night’s sleep and checks his email for the daily AppDynamics events digest, which lists all of the application events from the last 24 hours. Only one event in the digest. Ever since Jerry’s organization invested in AppDynamics’ products that are delivered on the Application Intelligence Platform, his dev team has had code-level visibility into the root cause of performance issues inside his ecommerce application and has substantially cut down the number of bugs in the software. That means fewer production issues for his team to deal with downstream.

Using the PagerDuty alerting extension, the one issue that was sent in Jerry’s digest triggered the creation of a help ticket and was automatically assigned to the technician on duty with no manual intervention on Jerry’s part.
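The post doesn’t show how the PagerDuty alerting extension talks to PagerDuty, so the following is only a rough sketch of the general idea: turning a monitoring event into a PagerDuty incident with no human in the loop. It assumes PagerDuty’s Events API v2 rather than whatever the extension actually uses, and the function names, the routing key and the example summary are all hypothetical.

```python
import json
import urllib.request
from typing import Optional

# Assumption: PagerDuty Events API v2 endpoint, not necessarily what the extension calls.
EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_trigger_event(routing_key: str, summary: str, source: str,
                        severity: str = "critical",
                        dedup_key: Optional[str] = None) -> dict:
    """Build the JSON body for a 'trigger' event (hypothetical helper)."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity}")
    event = {
        "routing_key": routing_key,      # integration key of the PagerDuty service
        "event_action": "trigger",       # open a new incident
        "payload": {"summary": summary, "source": source, "severity": severity},
    }
    if dedup_key:
        # Repeated events with the same key are grouped into one incident.
        event["dedup_key"] = dedup_key
    return event


def send_event(event: dict, timeout: float = 10.0) -> int:
    """POST the event to PagerDuty and return the HTTP status code."""
    request = urllib.request.Request(
        EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder values; a real routing key comes from the PagerDuty service settings.
    event = build_trigger_event("YOUR_ROUTING_KEY", "Checkout response time is slow",
                                "ecommerce-app", severity="warning")
    print(json.dumps(event, indent=2))
```

PagerDuty then assigns the incident to whoever is on call according to the service’s escalation policy, which is the step Jerry used to do by hand.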

By the time Jerry checked on the status, the issue was already resolved. Nice.

On his way to work, Jerry smiles and thinks about last year’s SSL Certificate debacle. Since installing the SSL Certificate Monitoring extension from AppDynamics, his team has been able to build a dashboard that shows the number of days left until the SSL Certificate expires. No more SSL Certs expiring without anyone knowing ahead of time.
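A “days left until expiry” number like the one on Jerry’s dashboard is straightforward to compute. The sketch below is not the SSL Certificate Monitoring extension itself; it is a minimal illustration using Python’s standard `ssl` module, and the function names and the 30-day warning threshold are assumptions made for the example.

```python
import socket
import ssl
import time
from typing import Optional

SECONDS_PER_DAY = 86_400


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the server certificate's notAfter field, e.g. 'Jan  5 09:34:43 2018 GMT'.

    With the default context, an already expired certificate fails the TLS
    handshake with an ssl.SSLError, which is itself worth alerting on.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: Optional[float] = None) -> int:
    """Whole days between 'now' (epoch seconds) and the certificate's expiry."""
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return int((expires_at - current) // SECONDS_PER_DAY)


def needs_renewal(not_after: str, warn_days: int = 30,
                  now: Optional[float] = None) -> bool:
    """True once the certificate is within warn_days of expiring (assumed threshold)."""
    return days_until_expiry(not_after, now) <= warn_days
```

Calling `days_until_expiry(fetch_not_after("example.com"))` on a schedule and charting the result gives the team weeks of warning instead of a 5:30AM surprise.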

Jerry arrives at work and goes to his recurring ‘testing and capacity planning’ meeting that his team sets up every year around this time. Since deploying AppDynamics and installing two additional AppDynamics extensions – the Cassandra monitoring extension and the Amazon Web Services (AWS) cloud connector extension – his testing and capacity planning work for the holiday season has gotten a lot easier.

First, AppDynamics has given him and his team a great topology view that has relieved them of the need for Visio diagrams and whiteboarded architectures. Having a real-time view of how the different components of an application interact with each other, and having that map update automatically as new code is released, has been hugely valuable for Jerry’s team.


Second, during Cassandra testing, in addition to getting basic metrics like disk I/O and memory, the Cassandra extension provides configurable metrics like:

  • Cache size, capacity, hit count, hit rate, request count

  • Total latency, statistics, timeout requests, unavailable requests

  • Bloom filter disk space used, false positives, false ratio

  • SSTables compression ratio, live tables, disk space, compacted row size

  • Row size histogram

  • Column count histogram

  • Memtable columns, data size, switch count

  • Pending tasks

  • Read latency

  • Write latency

  • Pending and completed tasks

  • Compaction tasks pending and completed

  • Timeouts

  • Dropped messages

  • Streams

  • Total disk space used

  • Thread pool tasks: active, completed, blocked, pending

By leveraging these metrics, Jerry’s team is able to get granular visibility into Cassandra performance and see exactly where performance bottlenecks occur. This visibility has drastically cut down the time needed to test their Cassandra implementation. Pinpointing exactly where the performance issues are and what caused them enables Jerry’s team to proactively address Cassandra performance issues before they affect end users.

Finally, while capacity planning, Jerry now leverages the Amazon Web Services (AWS) cloud connector extension which allows his team to easily scale up and scale down in the cloud automatically based on policies that can involve a number of rules including:

  • Overall application health (load, response time, number of slow calls, etc.)

  • Business transaction health (load, response time, number of slow calls, etc.)

  • End User Experience health (pages / iFrames / AJAX requests per minute, first byte time, DOM ready time, etc.)

  • Databases & Remote Services health (calls per minute, errors per minute, etc.)

  • Error rates (exceptions, return codes, etc.)

This year, Jerry’s team is putting a few different health rules in place that will automatically scale up the AWS EC2 resources when certain load and response time thresholds are breached and scale them back down when those metrics return to a normal level. Jerry has also added an authorization step to these workflows that will alert him and ask for permission before spinning instances up or down. That way, they only pay for the EC2 resources they need and Jerry still has full control.
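To make the logic of such a policy concrete, here is a small sketch of a threshold-based scaling decision with an approval gate. It is not AppDynamics or AWS code: the class, the function names and the threshold values are all hypothetical, and it assumes that breaching either threshold triggers a scale-up while both metrics must be back to normal before scaling down.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class HealthRule:
    """Hypothetical thresholds for one scaling policy."""
    max_calls_per_min: float      # load above this is a breach
    max_response_ms: float        # response time above this is a breach
    normal_calls_per_min: float   # load at or below this counts as normal
    normal_response_ms: float     # response time at or below this counts as normal


def evaluate(rule: HealthRule, calls_per_min: float, response_ms: float) -> str:
    """Return 'scale_up', 'scale_down' or 'hold' for the current metrics."""
    if calls_per_min > rule.max_calls_per_min or response_ms > rule.max_response_ms:
        return "scale_up"
    if (calls_per_min <= rule.normal_calls_per_min
            and response_ms <= rule.normal_response_ms):
        return "scale_down"
    return "hold"  # between normal and breached: leave capacity alone


def next_instance_count(rule: HealthRule, current: int, calls_per_min: float,
                        response_ms: float, approve: Callable[[str], bool],
                        min_instances: int = 1) -> int:
    """Apply the rule, but only after the approver (Jerry) says yes."""
    action = evaluate(rule, calls_per_min, response_ms)
    if action == "hold" or not approve(action):
        return current
    if action == "scale_up":
        return current + 1
    return max(min_instances, current - 1)


if __name__ == "__main__":
    rule = HealthRule(max_calls_per_min=1000, max_response_ms=500,
                      normal_calls_per_min=400, normal_response_ms=200)
    # Load has breached its threshold and the approver accepts the change.
    print(next_instance_count(rule, 4, 1200, 100, approve=lambda action: True))  # 5
```

Keeping the approval step as a callback means the same decision logic works whether the answer comes from an email prompt, a chat message or an always-approve policy in a test environment.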


Jerry leaves the testing meeting with full confidence that his team has a good grasp on the upcoming peak season and has the visibility in place that will allow his team to quickly deal with any performance issues as they arise.

_____

As you can see, Jerry is in a much better spot this year than he was a year ago. By leveraging AppDynamics, he has one platform that easily connects to the rest of the technologies he already uses and gives him a single UI in which to manage the performance of his environment.

If you’d like to try AppDynamics for free and test drive some of the extensions we’ve highlighted in this blog post, click here.

The post Black Friday Horror Story Averted with Alerting and Monitoring appeared first on Application Performance Monitoring Blog from AppDynamics.


