Elasticsearch Monitoring By @Sematext | @DevOpsSummit [#DevOps]

While many SPM Performance Monitoring users quickly see the benefits of SPM and adopt it in their organizations for monitoring — not just for Elasticsearch, but for their complete application stack — some Elasticsearch users evaluate SPM and compare it to Marvel from Elasticsearch.  We’ve been asked about SPM vs. Marvel enough times that we decided to put together this focused comparison to show some of the key differences and help individuals and organizations pick the right tool for their needs.

Marvel is a relatively young product that provides a detailed visualization of Elasticsearch metrics in a Kibana-based UI. It installs as an Elasticsearch plug-in and includes ‘Sense’ (a developer console), plus replay functionality for shard allocation history.

SPM, on the other hand, offers multiple agent deployment modes, has both Cloud and On Premises versions, includes alerts and anomaly detection, is not limited to Elasticsearch monitoring, integrates with third party services, etc. The following Venn diagram shows key areas that SPM and Marvel have in common and also the areas where they differ.

[Figure: SPM vs. Marvel Venn diagram]

Looking into the details surfaces many notable differences.  For example:

  • The SPM agent can run independently of the Elasticsearch process, so upgrading the agent does not require a restart of Elasticsearch.
  • Dashboards are defined with different philosophies: Marvel exposes each metric in a separate chart, while SPM groups related metrics together in a single chart or in adjacent charts, so people can see more information in one place without jumping between multiple views.
  • Both have the ability to show metrics from multiple nodes in a single chart: Marvel draws a separate line for each node, while in SPM you can choose to aggregate values or display them separately.
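
To make the last point concrete, here is a minimal Python sketch of the two ways to chart one metric across several nodes. The node names, sample values, and function names are made up for illustration; this is not how either SPM or Marvel is implemented.

```python
from statistics import mean

# Hypothetical heap-usage samples (percent) per node, one value per time step.
samples = {
    "node-1": [40, 42, 45],
    "node-2": [60, 58, 55],
    "node-3": [50, 50, 50],
}

def per_node_series(samples):
    """One line per node: return a copy of each node's series."""
    return {node: list(values) for node, values in samples.items()}

def aggregated_series(samples, agg=mean):
    """One line for the whole cluster: combine all nodes at each time step."""
    return [agg(step) for step in zip(*samples.values())]

separate = per_node_series(samples)          # three lines on the chart
combined = aggregated_series(samples)        # one averaged line
peak = aggregated_series(samples, agg=max)   # one line showing the busiest node
```

An aggregated line keeps a chart readable on large clusters, while separate lines make it easier to spot a single misbehaving node.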

The following “SPM vs. Marvel Comparison Table” is a starting point for evaluating monitoring products against your organization’s individual needs.

SPM vs. Marvel Comparison Table

Supported Applications
  • SPM by Sematext: Elasticsearch, Hadoop, Spark, Kafka, Storm, Cassandra, HBase, Redis, Memcached, NGINX(+), Apache, MySQL, Solr, AWS CloudWatch, JVM, …
  • Marvel by Elasticsearch: Elasticsearch

Agent deployment mode
  • SPM: in- and out-of-process (out-of-process allows for seamless updates without requiring Elasticsearch restarts)
  • Marvel: in-process (as Elasticsearch plug-in; updates require Elasticsearch restarts)

Predefined dashboard graphs organized in groups
  • SPM: YES
  • Marvel: YES

Saving Individual Dashboards
  • SPM: Each user can store multiple dashboards, mixing charts from all applications, including both metrics and logs.
  • Marvel: The current view can be saved, and a reset to defaults is possible. These changes are global.

API for Custom Metrics and Business KPIs
  • SPM: YES
  • Marvel: NO

Extra Elasticsearch Metrics
  • SPM: NO. Metrics are added based on user demand, and users can always graph them as Custom Metrics.
  • Marvel: YES. Circuit Breakers, ID Cache, Lucene memory, ES Threadpools, Percolator.

OS and JVM Metrics
  • SPM: YES (+). JVM pool sizes, JVM pool utilization.
  • Marvel: YES

Correlation of Metrics with Logs, Events, Alerts, and Anomalies
  • SPM: YES. SPM and Logsene integration; ability to ingest and chart arbitrary external Events.
  • Marvel: NO. Cluster Pulse displays only Elasticsearch Events.

Deployment model
  • SPM: SaaS or On Premises
  • Marvel: On Premises

Security/User Roles & Permissions
  • SPM: YES
  • Marvel: NO

Easy & Secure Sharing of Reports with internal and external organizations
  • SPM: YES. Via short links, via embeds / iframe, via email.
  • Marvel: NO

Machine Learning-based Anomaly Detection
  • SPM: YES
  • Marvel: NO

Threshold-based Alerts
  • SPM: YES
  • Marvel: NO

Heartbeat Alerts
  • SPM: YES
  • Marvel: NO

Forwarding Alerts to 3rd parties
  • SPM: YES. E-Mail, PagerDuty, Nagios / Shinken, HipChat, Slack, Webhooks.
  • Marvel: NO

Metrics Aggregation
  • SPM: YES. Pre-aggregation at multiple granularity levels, including 1-minute granularity. Advantage: more efficient storage, better scaling, and faster graphing over longer time periods, at the expense of sub-minute precision.
  • Marvel: YES. Query-time aggregation; no write-time pre-aggregation. Advantage: 10-second precision by default, at the expense of storage size, write and read performance, and memory footprint.
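
The trade-off in the Metrics Aggregation entry can be illustrated with a short Python sketch. It assumes hypothetical samples taken every 10 seconds and uses made-up function names; it shows only the general idea of write-time rollups versus query-time aggregation, not the storage design of either product.

```python
from collections import defaultdict

# Hypothetical raw samples: (timestamp_in_seconds, value), one every 10 seconds.
raw = [(t, float(t % 60)) for t in range(0, 180, 10)]

def preaggregate(samples, bucket_seconds=60):
    """Write-time rollup: store a single average per time bucket."""
    sums = defaultdict(lambda: [0.0, 0])
    for ts, value in samples:
        bucket = ts - ts % bucket_seconds  # start of the bucket this sample falls in
        sums[bucket][0] += value
        sums[bucket][1] += 1
    return {bucket: total / count for bucket, (total, count) in sorted(sums.items())}

def query_time_average(samples, start, end):
    """Query-time aggregation: scan the raw samples for an arbitrary window."""
    window = [value for ts, value in samples if start <= ts < end]
    return sum(window) / len(window) if window else None

rollup = preaggregate(raw)                       # 3 stored points instead of 18
sub_minute = query_time_average(raw, 10, 40)     # only answerable from raw samples
```

The rollup stores far fewer points and is cheap to graph over long periods, but a question about a 30-second window can only be answered when the raw 10-second samples are kept.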

As an aside, most of the features in this comparison table would also apply if we compared SPM to BigDesk, ElasticHQ, Statsd, Graphite, Ganglia, Nagios, Riemann, and other application-specific monitoring or alerting tools out there.

If you have any questions about this comparison or have any feedback, please let us know!

More Stories By Sematext Blog

Sematext is a globally distributed organization that builds innovative Cloud and On Premises solutions for performance monitoring, alerting and anomaly detection (SPM), log management and analytics (Logsene), and search analytics (SSA). We also provide Search and Big Data consulting services and offer 24/7 production support for Solr and Elasticsearch.
