
"New DevOps Plug-in" By @TrevParsons | @DevOpsSummit [#DevOps]

Logs are the most fine-grained data source for understanding today's system

As co-founder of Logentries I am often asked, "Why logs?" And I have to admit, on first impression, 'log management and analytics' does not seem like the sexiest space :). However, at Logentries we are here to redefine that space: to provide a solution for accessing, managing and understanding your log data that is easy to use, cost-effective and intelligent (i.e., it does the hard work so you don't have to). That said, the question remains: "Why logs?"

Logs are the most fine-grained data source for understanding today's system. Unlike traditional monitoring and analytics tools which provide an aggregate view of what is happening in your system (such as server monitoring, application performance monitoring, web analytics etc.), logs capture every single event so that you can understand not only the general trends, but EXACTLY what happened, in what order, and by whom. Logs allow you to view this level of detail in real-time or to review it in a post-mortem fashion. At the same time, they can be rolled up into dashboards to give you a high level view of what is happening across your system. So in effect they can provide the best of both worlds: the low level detail of exactly what has happened as well as the high level trends across your systems.
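To make that contrast concrete, here is a minimal Python sketch. The key=value log format and the sample lines are invented for illustration; it shows how the very same log events support both the fine-grained view (every event, in order, by whom) and a rolled-up, dashboard-style aggregate:

```python
from collections import Counter

# Hypothetical access-log lines; the field layout is an assumption for illustration.
LOG_LINES = [
    '2014-06-02T10:00:01Z user=alice action=login status=200',
    '2014-06-02T10:00:03Z user=bob action=upload status=500',
    '2014-06-02T10:00:07Z user=alice action=upload status=200',
]

def parse(line):
    """Split one log event into a dict of key=value fields plus its timestamp."""
    ts, rest = line.split(' ', 1)
    fields = dict(kv.split('=', 1) for kv in rest.split())
    fields['ts'] = ts
    return fields

events = [parse(line) for line in LOG_LINES]

# Fine-grained view: exactly what happened, in what order, and by whom.
for e in events:
    print(e['ts'], e['user'], e['action'], e['status'])

# Rolled-up view: the same events aggregated, as a dashboard would show them.
status_counts = Counter(e['status'] for e in events)
print(status_counts)
```

The point is that the aggregate is derived from the events, not the other way around — an aggregate-only tool could never recover the individual rows.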

However, the biggest issues with many logging solutions today are:

  • Too expensive: Keeping all that log data around for more than 30 days has been prohibitively expensive, so deep historical system understanding has been difficult to achieve with logs. People have instead turned to traditional monitoring tools, whose summary views can span back indefinitely because aggregate data can be stored far more cost-effectively than raw logs.
  • Too difficult to use: Logging providers expect you to learn their query language, requiring deep technical skills and a lot of time on your hands to get value from them.
  • Too difficult to maintain: Open source or in-house solutions in particular are difficult and costly to maintain, and organizations quickly get frustrated with their in-house logging solution.

At Logentries we address (and continue to address) each of the above points. We want you to send us all your data, and we make it available in an easy-to-use, accessible and cost-effective manner.

And sending us all of your data has just become even easier with our new Shinken/Nagios and Diamond integrations:

  • Nagios Plug-in via Shinken: Shinken is an open source monitoring framework that is compatible with your Nagios plugins but improves on some of the traditional issues with the Nagios framework (e.g., scalability). The Logentries Nagios Plug-in/Shinken module allows you to send the results of your Nagios or Shinken health checks to Logentries, so you can get a real-time view of the health of your infrastructure, correlated with your traditional log data. You can also easily maintain a history of your health checks, which has always been difficult with tools like Nagios, so it's now easier to look back at any major issues and identify any recurring themes.
  • Diamond: Diamond is a Python daemon for collecting metrics. It ships with a wide range of collectors that gather detailed performance metrics from your OS as well as from common components like Hadoop, Mongo, Kafka, MySQL, NetApp, RabbitMQ, Redis, AWS S3 and more. The new Logentries Diamond handler allows you to stream all of these metrics into your Logentries account in real time, so you can easily visualize them in dashboards and, again, correlate them with any traditional logs from your systems or apps.

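As a rough illustration of the metrics-as-log-lines idea behind the Diamond integration, here is a hedged Python sketch. The `format_metric` function and the line format are assumptions for illustration, not the actual Logentries Diamond handler's wire format; a real handler would stream each line over a token-authenticated connection rather than appending it to a list:

```python
import time

def format_metric(path, value, timestamp=None):
    """Render a Diamond-style metric as a single plain-text log line.

    Diamond metrics are (dotted path, numeric value, unix timestamp)
    triples; a line-per-metric format like this is something a log-based
    backend can ingest, index, and chart alongside ordinary log events.
    """
    ts = int(timestamp if timestamp is not None else time.time())
    return "%d %s %s" % (ts, path, value)

def send_metrics(metrics, transport):
    """Stream each formatted metric line via a caller-supplied transport
    (in a real deployment, a socket write; here, any callable)."""
    for path, value, ts in metrics:
        transport(format_metric(path, value, ts))

# Collect the lines in memory instead of opening a network connection.
sent = []
send_metrics(
    [("servers.web01.cpu.total.user", 12.5, 1401703200),
     ("servers.web01.loadavg.01", 0.42, 1401703200)],
    sent.append,
)
```

Because each metric lands as an ordinary timestamped log line, it can be searched, graphed, and correlated with application logs using the same tooling.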
Check out these new IT and DevOps plug-ins, designed to continue to provide the deepest, most fine-grained view of your system-wide operational data.

More Stories By Trevor Parsons

Trevor Parsons is Chief Scientist and Co-founder of Logentries. Trevor has over 10 years experience in enterprise software and, in particular, has specialized in developing enterprise monitoring and performance tools for distributed systems. He is also a research fellow at the Performance Engineering Lab Research Group and was formerly a Scientist at the IBM Center for Advanced Studies. Trevor holds a PhD from University College Dublin, Ireland.
