Log Analysis for Software-Defined Data Centers

Log data provides the most granular view into what is happening across your systems, applications, and end users

by Chris Riley

Modern infrastructure constantly generates log data faster than humans can analyze it. And now that data centers can be built and torn down with scripts, the volume of activity and data grows exponentially.

The traditional log analysis practice of manually reviewing log files on a weekly or daily basis is inadequate for Software-Defined Data Centers (SDDC). The modern SDDC architecture, with its highly automated and dynamic deployment of multi-tier applications, necessitates real-time log analytics: analytics that are key to complex troubleshooting, dynamic provisioning, high performance and strong security.

In the software-defined data center you are looking at many more variables than servers. You want to see provisioning volume and time. You want to know the performance and IOPS of bare metal machines. You want to know how the data center's network, and all of its individual virtual networks, are performing, how secure they are, and where the weak spots lie. And in the case of IaaS and hosting providers, you might be managing many of these virtual data centers at one time.

Identifying root-cause performance bottlenecks and security vulnerabilities, and optimizing the provisioning of SDDC resources, is only possible with a comprehensive log management solution - one that takes log data from individual components and presents a consolidated view of the infrastructure's system log data. The resulting operational intelligence enables deep, enterprise-wide visibility to ensure optimized utilization of SDDC resources, and advanced alerting that calls attention to pertinent and urgent issues.

Without these capabilities, IT administrators have to rely exclusively on system metrics, limiting them to decisions based on performance alone - and possibly only performance at the data center level. Metrics such as memory consumption, CPU utilization and storage overlook the valuable diagnostic information stored in log files.

Here are some of the categories of information that log analysis in SDDC can provide.

  • Machine Provisioning, De-Provisioning, and Moves: In the modern data center, VMs move from physical machine to physical machine, sometimes even while running, with technologies like vMotion. Historical reporting on VM moves, provisioning, and de-provisioning can help teams understand where to optimize the processes for moving VMs to accommodate load, or where to add and remove bare metal machines.
  • Data Center to Bare Metal Utilization: Combine the advantages of cloud technologies, such as elasticity, on-demand availability and flexibility, with the performance, consistency and predictability of bare metal servers. Log analysis allows IT decision makers to incorporate accurate information on machine efficiency when planning the overall provisioning, scaling and utilization of SDDC environments.
  • Intrusion Monitoring and Management: Log data can be used to identify anomalous activities and to create automated alerts that point out areas of concern in real time. With traditional, manual log analysis practices, IT administrators fail to extract the insights in log data that point to possible performance and security issues. A log analysis based management solution automates these processes, frees IT administrators from tedious manual log analysis tasks and provides enhanced visibility into infrastructure operations to prevent data breaches.
  • Audit Trails for Forensics Analysis and Compliance: Correlate log data to trace suspected intrusions or data loss, and maintain compliance to strict security regulations.
  • Incident Containment: Identify and isolate compromised or underperforming components to prevent infrastructure-wide damage with real-time alert configurations. Users can also analyze log data to identify causal links between independent outages and performance issues, spotting them before they grow.
  • Infrastructure Optimization: Active network log management allows IT decision makers to shape the infrastructure to meet diverse and evolving business demands. DevOps can also use log data in integrated test environments to correlate test results with the log data generated by SDDC infrastructure and applications.
  • Reduced Cost: Fewer tools and IT expertise are required to maintain and manage complex SDDC infrastructure.
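To make the first category concrete, here is a minimal sketch of aggregating provisioning activity from a consolidated log stream. The log format, host names and field names are hypothetical, not from any specific product:

```python
import re
from collections import Counter

# Hypothetical SDDC event log format, e.g.:
# "2015-03-02T10:15:00Z host=esx-07 event=provision vm=web-42"
EVENT_RE = re.compile(
    r"(?P<ts>\S+) host=(?P<host>\S+) event=(?P<event>provision|deprovision|move) vm=(?P<vm>\S+)"
)

def summarize(log_lines):
    """Count provisioning events per host to spot churn hot spots."""
    per_host = Counter()
    for line in log_lines:
        m = EVENT_RE.match(line)
        if m:
            per_host[(m.group("host"), m.group("event"))] += 1
    return per_host

lines = [
    "2015-03-02T10:15:00Z host=esx-07 event=provision vm=web-42",
    "2015-03-02T10:17:31Z host=esx-07 event=move vm=web-42",
    "2015-03-02T11:02:12Z host=esx-03 event=deprovision vm=db-11",
]
print(summarize(lines))
```

Run over weeks of history, a per-host summary like this shows which bare metal machines absorb the most churn and which sit idle.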

And implementation is easy. Just as with server monitoring, pulling logs from servers is as easy as installing an agent. In the SDDC, the agent must already be part of the script or gold-master VM used for all provisioning. But beyond the VMs, the agent also needs to be installed on every instance of your bare metal hypervisor - for example, on each VMware ESX server. The only step beyond straight server logging is making sure the division between the hypervisor machines and their provisioned VMs is clear.
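One way to keep the hypervisor/VM division clear is to tag each log source with its tier at ingestion time. A toy sketch, assuming a naming convention where hypervisor hosts share a prefix (the `esx-` convention here is purely illustrative):

```python
# Hypothetical: label each log source as hypervisor or guest VM by a
# hostname convention, so dashboards and alerts can be scoped per tier.
def classify_source(hostname):
    """Return 'hypervisor' for bare metal hosts, 'vm' for everything else."""
    return "hypervisor" if hostname.startswith("esx-") else "vm"

for host in ["esx-07", "web-42", "db-11"]:
    print(host, classify_source(host))
```

In practice you would attach this label as metadata on every log event, rather than inferring it later at query time.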

Extending log analysis beyond the monitoring of individual components to management of the entire SDDC requires users to set up cloud-based log analysis solutions completely independent of the SDDC infrastructure in question. While IT professionals are accustomed to the traditional practice of monitoring errors in log data, DevOps teams running an SDDC must identify the underlying network components where a shift in system behavior occurs. And with advanced machine learning-based log management solutions, DevOps can resolve issues and optimize performance with greater effectiveness.
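The "shift in system behavior" idea can be illustrated with a deliberately simple stand-in for the machine-learning approaches mentioned above: flag any minute whose event count deviates sharply from a rolling baseline. The window size, threshold and sample data are all illustrative assumptions:

```python
from statistics import mean, stdev

def detect_shift(counts, window=5, z=3.0):
    """Flag indices whose event count deviates more than z standard
    deviations from the mean of the preceding `window` samples."""
    alerts = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma > 0 and abs(counts[i] - mu) > z * sigma:
            alerts.append(i)
    return alerts

per_minute = [12, 11, 13, 12, 11, 12, 95, 12]  # a sudden burst at index 6
print(detect_shift(per_minute))
```

A real solution would learn seasonality and correlate shifts across components; the point is that the baseline comes from the log stream itself, not from a hand-set threshold.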

