Welcome!

@DevOpsSummit Authors: Carmen Gonzalez, Mehdi Daoudi, Yeshim Deniz, Pat Romanski, Elizabeth White

Related Topics: @DevOpsSummit, Linux Containers, Containers Expo Blog

@DevOpsSummit: Blog Feed Post

Docker Swarm | @DevOpsSummit #DevOps #API #Docker #Microservices

Docker Swarm distributes containers to multiple nodes using various deployment strategies in the cluster scheduler

Docker Swarm: Collecting Metrics, Events and Logs
By Stefan Thies

Docker Swarm is a cluster manager for Docker.  When accessed via the Docker API by Docker API Clients or Docker command line tools, a Docker Swarm cluster looks just like a single Docker Host.  Docker Swarm distributes containers to multiple nodes using various deployment strategies in the cluster scheduler.

Having in mind that a Swarm cluster looks like a single Docker Host from the API point of view, it should be very easy to monitor Docker Swarm with existing Docker monitoring tools!  Connecting a monitoring agent to the Swarm Master API endpoint should do the job, right? The Sematext Docker Agent could simply collect all container metrics, events and all logs from the Swarm Master - should be a piece of cake. Hmm, but could there a gotcha?  It turns out there is more than one:

  • If we deploy a single monitoring agent to the master node, it would miss host metrics for all other nodes because the Docker API doesn't provide any host metrics. We could also not see how much memory, disk space or CPU the Docker Swarm node itself consumes. Solution: deploy the monitoring agents to each node for collecting the metrics locally.
  • Assuming a larger cluster with a high volume of logs, events and metrics to collect, a single monitoring agent connected to the the master node would need to handle all operational data of the cluster.  This would work for a small cluster but such an architecture would obviously be destined for failure on larger clusters.  Guess what the solution is? It's much better having an agent running on each node and distributing the monitoring and logging work over all nodes. If you do it right from the beginning, there is no need to change the deployment strategy later, when the cluster scales out.

DockerSwarmMonitoring

Monitoring container running on each Docker node

In the following example we assume that the master and agent nodes have the UNIX socket enabled in Docker daemon settings. This can be achieved by using -engine-env ‘DOCKER_OPTS="-H unix:///var/run/docker.sock"‘ in the docker-machine create command. Use this Github Gist to create a Docker-Swarm Cluster with with enabled UNIX sockets. Later, we will see this helps simplify the deployment of any tool that needs to connect to the local Docker daemon - including monitoring and logging containers.

Let's see how to deploy Sematext Agent to each node in a Docker Swarm Cluster with UNIX socket enabled in Docker-Daemon as just described.

When we started to work on Swarm Monitoring our first question was "Does Docker Swarm provide a deployment strategy for running exactly one instance of a service on each node?" We checked the documentation, but no dice.  We found strategies like "spread, binpack, and random" (see https://docs.docker.com/swarm/scheduler/strategy/), but none of them would guarantee exactly one instance of a service on each node. The "spread" strategy spreads the containers evenly over all hosts. The "binpack" strategy fills up one node after another with containers, while "random" spreads containers randomly to nodes. There was seemingly no strategy suitable for monitoring services running only once on each node.

So how can we distribute the monitoring container to each host using Docker Swarm instead of bash script iterating over all nodes?  It turns out it's possible to define an affinity to ensure that containers that should run on the same host are scheduled together. In our case we use "anti-affinity" in the deployment strategy, which instructs Swarm not to deploy the container with Sematext Agent to hosts that already have that container running. In other words, it tells Docker Swarm to run no more than one Sematext Agent container on each Docker host.  To do that we define a docker-compose.yml file with the "anti-affinity" specified in the container environment section:

sematext-agent:
image: 'sematext/sematext-agent-docker:latest' environment:
- LOGSENE_TOKEN=3b549a2c-653a-4832-xxx
- SPM_TOKEN=fe31fc3a-4660-47c6-xxx
- affinity:container!=sematext-agent*
privileged: true
restart: always
volumes:
- '/var/run/docker.sock:/var/run/docker.sock'

Finally, we use the docker-compose command to scale out the Sematext Docker Agent and deploy it to all Swarm cluster nodes.  To do that we run:

eval $(docker-machine env swarm-master --swarm)
docker-compose up -d
# scale is == num nodes
docker-compose scale sematext-agent=$(docker-machine ls | grep swarm | grep Running | wc -l)

After running the above commands, Sematext Docker Agent will be running on each node and within a minute you will receive Host and Container Metrics for all containers, all their Logs and all Docker events from all nodes in your Docker Swarm cluster.  Complete visibility!

Bildschirmfoto 2016-01-12 um 15.36.01

Aggregated Metrics from all Docker Swarm nodes

Please note there are many ways to create a Swarm cluster and you might have another setup, such as:

  • TLS secured Docker daemon and no possibility to activate the unix socket: In this situation you have to deal with the existing Docker daemon setup, which typically uses TLS and authentication via certificates (for example, if you followed Docker's instructions to create Swarm clusters using Docker-Machine). When the Docker socket is secured with TLS, each client - including Sematext Docker Agent - needs the certificates for authentication. This involves a bunch of parameters such as "DOCKER_HOST", "DOCKER_CERT_PATH", "DOCKER_TLS_VERIFY" and mounting of the certificate into the container. In addition we should know to which Docker daemon the agent should be connected (typically port 2375 for TCP, 2376 for TLS on each node and port 3376 on Swarm Master nodes for the Swarm API). We made this scenario easy with a deployment script for the Sematext Agent with TLS options provided by Docker-Machine.
  • You use CoreOS to run Docker Swarm: In this case you could use fleet and systemd to distribute the agent to each node (simply install Sematext Agent with these instructions)

The deployment methods above should work for other monitoring tools or logging containers as well because most of such tools need to run on each node to collect the metrics locally.

If you have questions or special needs for monitoring more complex setups feel free to contact us. The Sematext Docker Agent is a turnkey-solution for Docker Logs, Metrics and Events - sign up here and give it a try (30-days free trial, no credit card needed).

Filed under: Logging, Monitoring Tagged: Containers, devops, docker, docker swarm, log management, logging, performance monitoring

More Stories By Sematext Blog

Sematext is a globally distributed organization that builds innovative Cloud and On Premises solutions for performance monitoring, alerting and anomaly detection (SPM), log management and analytics (Logsene), and search analytics (SSA). We also provide Search and Big Data consulting services and offer 24/7 production support for Solr and Elasticsearch.

@DevOpsSummit Stories
In his keynote at 19th Cloud Expo, Sheng Liang, co-founder and CEO of Rancher Labs, discussed the technological advances and new business opportunities created by the rapid adoption of containers. With the success of Amazon Web Services (AWS) and various open source technologies used to build private clouds, cloud computing has become an essential component of IT strategy. However, users continue to face challenges in implementing clouds, as older technologies evolve and newer ones like Docker containers gain prominence. He explored these challenges and how to address them, while considering how containers will influence the direction of cloud computing.
20th Cloud Expo, taking place June 6-8, 2017, at the Javits Center in New York City, NY, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy.
SYS-CON Events announced today that SoftLayer, an IBM Company, has been named “Gold Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York, New York. SoftLayer, an IBM Company, provides cloud infrastructure as a service from a growing number of data centers and network points of presence around the world. SoftLayer’s customers range from Web startups to global enterprises.
SYS-CON Events announced today that Hitachi, the leading provider the Internet of Things and Digital Transformation, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Hitachi Data Systems, a wholly owned subsidiary of Hitachi, Ltd., offers an integrated portfolio of services and solutions that enable digital transformation through enhanced data management, governance, mobility and analytics. We help global organizations open new revenue streams, increase efficiencies, improve customer experience and ensure rapid time to market in the digital age. Only Hitachi Data Systems powers the digital enterprise by integrating the best information technology and operational technology from across the Hitachi family of companies. We combine this experience with Hitachi expertise in the internet of things to d...
Cloud promises the agility required by today’s digital businesses. As organizations adopt cloud based infrastructures and services, their IT resources become increasingly dynamic and hybrid in nature. Managing these require modern IT operations and tools. In his session at 20th Cloud Expo, Raj Sundaram, Senior Principal Product Manager at CA Technologies, will discuss how to modernize your IT operations in order to proactively manage your hybrid cloud and IT environments. He will be sharing best practices around collaboration, monitoring, configuration and analytics that will help you boost experience and optimize utilization of your modern IT Infrastructures.
Quickly find the root cause of complex database problems slowing down your applications. Up to 88% of all application performance issues are related to the database. DPA’s unique response time analysis shows you exactly what needs fixing - in four clicks or less. Optimize performance anywhere. Database Performance Analyzer monitors on-premises, on VMware®, and in the Cloud, including Amazon® AWS and Azure™ virtual machines.
Did you know that you can develop for mainframes in Java? Or that the testing and deployment can be automated across mobile to mainframe? In his session at @DevOpsSummit at 20th Cloud Expo, Vaughn Marshall, Sr. Principal Product Owner at CA Technologies, will discuss and demo how increasingly teams are developing with agile methodologies using modern development environments and automating testing and deployments, mobile to mainframe.
The goal of Continuous Testing is to shift testing left to find defects earlier and release software faster. This can be achieved by integrating a set of open source functional and performance testing tools in the early stages of your software delivery lifecycle. There is one process that binds all application delivery stages together into one well-orchestrated machine: Continuous Testing. Continuous Testing is the conveyor belt between the Software Factory and production stages. Artifacts are moved from one stage to the next only after they have been tested and approved to continue. New code submitted to the repository is tested upon commit. When tests fail, the code is rejected. Subsystems are approved as part of periodic builds on their way to the delivery stage, where the system is being tested as production ready. The release process stops when tests fail. The key is to shift test ...
Five years ago development was seen as a dead-end career, now it’s anything but – with an explosion in mobile and IoT initiatives increasing the demand for skilled engineers. But apart from having a ready supply of great coders, what constitutes true ‘DevOps Royalty’? It’ll be the ability to craft resilient architectures, supportability, security everywhere across the software lifecycle. In his keynote at @DevOpsSummit at 20th Cloud Expo, Jeffrey Scheaffer, GM and SVP, Continuous Delivery Business Unit at CA Technologies, will share his vision about the true ‘DevOps Royalty’ and how it will take a new breed of digital cloud craftsman, architecting new platforms with a new set of tools to achieve it. He will also present a number of important insights and findings from a recent cloud and DevOps study – outlining the synergies high performance teams are exploiting to gain significant busin...
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm. In his Day 3 Keynote at 20th Cloud Expo, Chris Brown, a Solutions Marketing Manager at Nutanix, will explore the ways that Nutanix technologies empower teams to react faster than ever before and connect teams in ways that were either too complex or simply impossible with traditional infrastructures.
NHK, Japan Broadcasting, will feature the upcoming @ThingsExpo Silicon Valley in a special 'Internet of Things' and smart technology documentary that will be filmed on the expo floor between November 3 to 5, 2015, in Santa Clara. NHK is the sole public TV network in Japan equivalent to the BBC in the UK and the largest in Asia with many award-winning science and technology programs. Japanese TV is producing a documentary about IoT and Smart technology and will be covering @ThingsExpo Silicon Valley. The program, to be aired during the peak viewership season of the year, will have a major impact on the industry in Japan. The film's director is writing a scenario to fit in the story in the next few days will be turned in to the network.
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm.
SYS-CON Events announced today that CollabNet, a global leader in enterprise software development, release automation and DevOps solutions, will be a Bronze Sponsor of SYS-CON's 20th International Cloud Expo®, taking place from June 6-8, 2017, at the Javits Center in New York City, NY. CollabNet offers a broad range of solutions with the mission of helping modern organizations deliver quality software at speed. The company’s latest innovation, the DevOps Lifecycle Manager (DLM), supports Value Stream Mapping for the development and operations tool chain by offering DevOps Tool Chain Integration and Traceability; DevOps Tool Chain Orchestration; and DevOps Insight and Intelligence. CollabNet also offers traditional application lifecycle management, ALM, for the enterprise through its TeamForge product.
SYS-CON Events announced today that Hitachi, the leading provider the Internet of Things and Digital Transformation, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Hitachi Data Systems, a wholly owned subsidiary of Hitachi, Ltd., offers an integrated portfolio of services and solutions that enable digital transformation through enhanced data management, governance, mobility and analytics. We help global organizations open new revenue streams, increase efficiencies, improve customer experience and ensure rapid time to market in the digital age. Only Hitachi Data Systems powers the digital enterprise by integrating the best information technology and operational technology from across the Hitachi family of companies. We combine this experience with Hitachi expertise in the internet of things to d...
SYS-CON Events announced today that Super Micro Computer, Inc., a global leader in compute, storage and networking technologies, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Supermicro (NASDAQ: SMCI), the leading innovator in high-performance, high-efficiency server technology, is a premier provider of advanced server Building Block Solutions® for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, HPC and Embedded Systems worldwide. Supermicro is committed to protecting the environment through its “We Keep IT Green®” initiative and provides customers with the most energy-efficient, environmentally friendly solutions available on the market.
SYS-CON Events announced today that Juniper Networks (NYSE: JNPR), an industry leader in automated, scalable and secure networks, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Juniper Networks challenges the status quo with products, solutions and services that transform the economics of networking. The company co-innovates with customers and partners to deliver automated, scalable and secure networks with agility, performance and value.
Developers want to create better apps faster. Static clouds are giving way to scalable systems, with dynamic resource allocation and application monitoring. You won't hear that chant from users on any picket line, but helping developers to create better apps faster is the mission of Lee Atchison, principal cloud architect and advocate at New Relic Inc., based in San Francisco. His singular job is to understand and drive the industry in the areas of cloud architecture, microservices, scalability and availability. In a keynote presentation, he spoke to a standing-room-only crowd at New York's Cloud Expo about how highly available, highly scalable systems can help developers attain the goal of better apps faster.
SYS-CON Events announced today that T-Mobile will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. As America's Un-carrier, T-Mobile US, Inc., is redefining the way consumers and businesses buy wireless services through leading product and service innovation. The Company's advanced nationwide 4G LTE network delivers outstanding wireless experiences to 67.4 million customers who are unwilling to compromise on quality and value.
Everyone wants to use containers, but monitoring containers is hard. New ephemeral architecture introduces new challenges in how monitoring tools need to monitor and visualize containers, so your team can make sense of everything. In his session at @DevOpsSummit, David Gildeh, co-founder and CEO of Outlyer, will go through the challenges and show there is light at the end of the tunnel if you use the right tools and understand what you need to be monitoring to successfully use containers in your environments.
New competitors, disruptive technologies, and growing expectations are pushing every business to both adopt and deliver new digital services. This ‘Digital Transformation’ demands rapid delivery and continuous iteration of new competitive services via multiple channels, which in turn demands new service delivery techniques – including DevOps. In this power panel at @DevOpsSummit 20th Cloud Expo, moderated by DevOps Conference Co-Chair Andi Mann, panelists will examine how DevOps helps to meet the demands of Digital Transformation – including accelerating application delivery, closing feedback loops, enabling multi-channel delivery, empowering collaborative decisions, improving user experience, and ultimately meeting (and exceeding) business goals.
Grape Up is a software company, specialized in cloud native application development and professional services related to Cloud Foundry PaaS. With five expert teams that operate in various sectors of the market across the USA and Europe, we work with a variety of customers from emerging startups to Fortune 1000 companies.
@DevOpsSummit at Cloud taking place June 6-8, 2017, at Javits Center, New York City, is co-located with the 20th International Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce software that is obsolete at launch. DevOps may be disruptive, but it is essential.
SYS-CON Events announced today that Interoute, owner-operator of one of Europe's largest networks and a global cloud services platform, has been named “Bronze Sponsor” of SYS-CON's 20th Cloud Expo, which will take place on June 6-8, 2017 at the Javits Center in New York, New York. Interoute is the owner-operator of one of Europe's largest networks and a global cloud services platform which encompasses 12 data centers, 14 virtual data centers and 31 colocation centers, with connections to 195 additional third-party data centers across Europe. Its full-service Unified ICT platform serves international enterprises and many of the world’s leading service providers, as well as governments and universities.
Today we can collect lots and lots of performance data. We build beautiful dashboards and even have fancy query languages to access and transform the data. Still performance data is a secret language only a couple of people understand. The more business becomes digital the more stakeholders are interested in this data including how it relates to business. Some of these people have never used a monitoring tool before. They have a question on their mind like “How is my application doing” but no idea how to get a proper answer.
Cloud Expo, Inc. has announced today that Aruna Ravichandran, vice president of DevOps Product and Solutions Marketing at CA Technologies, has been named co-conference chair of DevOps at Cloud Expo 2017. The @DevOpsSummit at Cloud Expo New York will take place on June 6-8, 2017, at the Javits Center in New York City, New York, and @DevOpsSummit at Cloud Expo Silicon Valley will take place Oct. 31-Nov. 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA.