Parsing and Centralizing Elasticsearch Logs By @Sematext | @DevOpsSummit [#DevOps]

How to use Logstash’s file input to tail the main Elasticsearch log and the slowlogs

No, it’s not an endless loop waiting to happen: the plan here is to use Logstash to parse Elasticsearch logs and send them to another Elasticsearch cluster or to a log analytics service like Logsene (which conveniently exposes the Elasticsearch API, so you can use it without having to run and manage your own Elasticsearch cluster).

If you’re looking for some ELK stack intro and you think you’re in the wrong place, try our 5-minute Logstash tutorial. Still, if you have non-trivial amounts of data, you might end up here again, because you’ll probably need to centralize Elasticsearch logs for the same reasons you centralize other logs:

  • to avoid SSH-ing into each server to figure out why something went wrong
  • to better understand issues such as slow indexing or searching (via slowlogs, for instance)
  • to search quickly in big logs

In this post, we’ll describe how to use Logstash’s file input to tail the main Elasticsearch log and the slowlogs. We’ll use grok and other filters to parse different parts of those logs into their own fields and we’ll send the resulting structured events to Logsene/Elasticsearch via the elasticsearch output. In the end, you’ll be able to do things like slowlog slicing and dicing with Kibana:

[Image: logstash_elasticsearch]

TL;DR note: scroll down to the FAQ section for the whole config with comments.

Tailing Files
First, we’ll point the file input to *.log from Elasticsearch’s log directory. This will work nicely with the default rotation, which renames old logs to something like cluster-name.log.SOMEDATE. We’ll use start_position => "beginning" to index existing content as well. We’ll add the multiline codec to parse exceptions nicely, telling it that every line not starting with a [ character belongs to the same event as the previous line.

input {
  file {
    path => "/var/log/elasticsearch/*.log"
    type => "elasticsearch"
    start_position => "beginning"
    codec => multiline {
      pattern => "^\["
      negate => true
      what => "previous"
    }
  }
}
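To make the multiline rule concrete, here is a rough Python sketch of the grouping it describes. It is only an illustration of the idea (lines that do not start with [ are appended to the previous event), not how Logstash implements the codec; the function name and the sample exception lines are made up for the example.

```python
import re

# A new event starts with "[", the same idea as pattern => "^\[" with negate => true.
NEW_EVENT = re.compile(r"^\[")

def group_multiline(lines):
    """Append every line that does not start with "[" to the previous event."""
    events = []
    for line in lines:
        line = line.rstrip("\n")
        if NEW_EVENT.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events

# Hypothetical sample: the second entry is followed by a two-line stack trace.
sample = [
    "[2015-01-13 15:42:24,624][INFO ][node ] [Atleza] starting ...",
    "[2015-01-13 15:42:25,001][WARN ][hypothetical.source] [Atleza] something failed",
    "java.lang.IllegalStateException: hypothetical exception",
    "\tat com.example.Hypothetical.run(Hypothetical.java:42)",
]
events = group_multiline(sample)
for event in events:
    print(repr(event))
```

The four input lines end up as two events, with the stack trace folded into the second one.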

Parsing Generic Content
A typical Elasticsearch log comes in the form of:

[2015-01-13 15:42:24,624][INFO ][node ] [Atleza] starting ...

while a slowlog is a bit more structured, like:

[2015-01-13 15:43:17,160][WARN ][index.search.slowlog.query] [Atleza] [aa][3] took[19.9ms], took_millis[19], types[], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"query":{"term":{"a":2}}}], extra_source[],

But fields from the beginning, like timestamp and severity, are common, so we’ll parse them first:

grok {
  match => [ "message", "\[%{TIMESTAMP_ISO8601:timestamp}\]\[%{DATA:severity}%{SPACE}\]\[%{DATA:log_source}%{SPACE}\]%{SPACE}\[%{DATA:node}\]%{SPACE}(?<message>(.|\r|\n)*)" ]
  overwrite => [ "message" ]
}

For the main Elasticsearch logs, the message field now contains the actual message, without the timestamp, severity, and log source, which are now in their own fields.
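If you want to check what this step extracts without starting Logstash, the same split can be approximated with a plain regular expression. The Python sketch below is an assumption-laden stand-in: the timestamp sub-pattern is a simplified replacement for grok's TIMESTAMP_ISO8601 and only covers the format shown in the sample line.

```python
import re

# Simplified stand-in for the grok expression; named groups mirror the grok field names.
LINE = re.compile(
    r"\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]"
    r"\[(?P<severity>.*?)\s*\]"
    r"\[(?P<log_source>.*?)\s*\]\s*"
    r"\[(?P<node>.*?)\]\s*"
    r"(?P<message>.*)",
    re.DOTALL,  # let the message span several lines, e.g. a stack trace
)

sample = "[2015-01-13 15:42:24,624][INFO ][node ] [Atleza] starting ..."
match = LINE.match(sample)
fields = match.groupdict() if match else {}
print(fields)
```

The padding spaces inside the severity and log source brackets are consumed by `\s*`, which plays the role of %{SPACE} here.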

Parsing Slowlogs
For slowlogs, the message field now looks like this:

[aa][3] took[19.9ms], took_millis[19], types[], stats[], search_type[QUERY_THEN_FETCH], total_shards[5], source[{"query":{"term":{"a":2}}}], extra_source[],

First we’ll parse the index name and the shard number via grok, then the kv filter will take care of the name-value pairs that follow:

if "slowlog" in [path] {
  grok {
    match => [ "message", "\[%{DATA:index}\]\[%{DATA:shard}\]%{GREEDYDATA:kv_pairs}" ]
  }
  kv {
    source => "kv_pairs"
    field_split => " \],"
    value_split => "\["
  }
}
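The two steps can be mimicked in a few lines of Python, which may help when reasoning about what ends up in which field. This is a sketch under assumptions, not the kv filter's actual implementation: the function name is made up, and dropping empty values such as types[] is a choice of this sketch.

```python
import re

# Index name and shard number first, everything else goes to kv_pairs.
SLOWLOG = re.compile(r"\[(?P<index>.*?)\]\[(?P<shard>.*?)\](?P<kv_pairs>.*)", re.DOTALL)

def parse_slowlog(message):
    """Split a slowlog message into index, shard and name[value] pairs."""
    match = SLOWLOG.match(message)
    if match is None:
        return {}
    fields = {"index": match.group("index"), "shard": match.group("shard")}
    # Fields are separated by any of: space, "]" or "," (the field_split characters).
    for token in re.split(r"[ \],]+", match.group("kv_pairs")):
        key, sep, value = token.partition("[")  # "[" separates name from value
        if sep and key and value:  # sketch choice: skip pairs with an empty value
            fields[key] = value
    return fields

message = (
    '[aa][3] took[19.9ms], took_millis[19], types[], stats[], '
    'search_type[QUERY_THEN_FETCH], total_shards[5], '
    'source[{"query":{"term":{"a":2}}}], extra_source[],'
)
fields = parse_slowlog(message)
print(fields)
```

Note that splitting on spaces, commas and ] works for this sample because the JSON query contains none of those characters; a query that does would need more careful handling.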

Some Cleanup
Now our logs are fully parsed, but there are still some niggles to take care of. One is that each log’s timestamp (the time logged by the application) is in the timestamp field, while the standard @timestamp was added by Logstash when it read that event. If you want @timestamp to hold the application-generated timestamp, you can do it with the date filter:

date {
  match => [ "timestamp", "yyyy-MM-dd HH:mm:ss,SSS" ]
  target => "@timestamp"
}
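For comparison, here is roughly the same conversion in Python. It is only a sketch for checking the timestamp format by hand: the function name is hypothetical, and the result is a naive datetime, so time zone handling is left out.

```python
from datetime import datetime

def parse_es_timestamp(value):
    """Parse the timestamp format used at the start of each Elasticsearch log line."""
    # "%f" accepts one to six fractional digits, so ",624" is read as 624 milliseconds.
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S,%f")

parsed = parse_es_timestamp("2015-01-13 15:42:24,624")
print(parsed.isoformat())
```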

Other potentially annoying things:

  • at this point, timestamp contains the same data as @timestamp
  • the content of kv_pairs from slowlogs is already parsed by the kv filter
  • the log type (for example, index.search.slowlog.query) is in a field called log_source, to make room for a field called source which stores other things (the JSON query, in this case). I would rather store index.search.slowlog.query in source, especially if I’m using the Logsene UI, where I can filter on sources by clicking on them
  • the grok and kv filters parse all fields as strings, even when some of them, like took_millis, are numbers

To fix all of the above (remove, rename and convert fields) we’ll use the mutate filter:

mutate {
  remove_field => [ "kv_pairs", "timestamp" ]
  rename => {
    "source" => "source_body"
    "log_source" => "source"
  }
  convert => {
    "took_millis" => "integer"
    "total_shards" => "integer"
    "shard" => "integer"
  }
}
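The effect of this cleanup on a single event can be pictured with a plain dictionary. The Python sketch below only illustrates the intended result (remove, rename, convert); the function name and the sample event are made up, and it does not claim to reproduce the order in which mutate applies its operations.

```python
def tidy(event):
    """Remove helper fields, rename the source fields and convert numeric strings."""
    event = dict(event)  # work on a copy
    for name in ("kv_pairs", "timestamp"):
        event.pop(name, None)
    # Move the JSON query out of the way before log_source takes over the name "source".
    if "source" in event:
        event["source_body"] = event.pop("source")
    if "log_source" in event:
        event["source"] = event.pop("log_source")
    for name in ("took_millis", "total_shards", "shard"):
        if name in event:
            event[name] = int(event[name])
    return event

# Hypothetical event, shaped like a slowlog entry after the grok and kv steps.
raw = {
    "timestamp": "2015-01-13 15:43:17,160",
    "log_source": "index.search.slowlog.query",
    "index": "aa",
    "shard": "3",
    "took_millis": "19",
    "total_shards": "5",
    "source": '{"query":{"term":{"a":2}}}',
    "kv_pairs": " took[19.9ms], took_millis[19], ...",
}
clean = tidy(raw)
print(clean)
```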

Sending Events to Logsene/Elasticsearch
Below is an elasticsearch output configuration that works well with Logsene and Logstash 1.5.0 beta 1. For an external Elasticsearch cluster, you can simply specify the host name and protocol (we recommend HTTP because it’s easier to upgrade both Logstash and Elasticsearch):

output {
  elasticsearch {
    host => "logsene-receiver.sematext.com"
    ssl => true
    port => 443
    index => "LOGSENE-TOKEN-GOES-HERE"
    protocol => "http"
    manage_template => false
  }
}

If you’re using Logstash 1.4.2 or earlier, there’s no SSL support, so you’ll have to remove the ssl line and set port to 80.

FAQ

Q: Cool, this works well for logs. How about monitoring Elasticsearch metrics like how much heap is used or how many cache hits I get?
A: Check out our SPM, which can monitor lots of applications, including Elasticsearch. If you’re a Logsene user, too, you’ll be able to correlate logs and metrics.
Q: I find this logging and parsing stuff really exciting.
A: Me too. If you want to join us, we’re hiring worldwide.
Q: I’m here from the TL;DR note. Can I get the complete config?
A: Here you go (please check the comments for things you might want to change):

input {
  file {
    path => "/var/log/elasticsearch/*.log"  # tail ES log and slowlogs
    type => "elasticsearch"
    start_position => "beginning"  # parse existing logs, too
    codec => multiline {   # put the whole exception in a single event
      pattern => "^\["
      negate => true
      what => "previous"
    }
  }
}

filter {
  if [type] == "elasticsearch" {
    grok {  # parses the common bits
      match => [ "message", "\[%{TIMESTAMP_ISO8601:timestamp}\]\[%{DATA:severity}%{SPACE}\]\[%{DATA:log_source}%{SPACE}\]%{SPACE}\[%{DATA:node}\]%{SPACE}(?<message>(.|\r|\n)*)" ]
      overwrite => [ "message" ]
    }

    if "slowlog" in [path] {  # slowlog-specific parsing
      grok {  # parse the index name and the shard number
        match => [ "message", "\[%{DATA:index}\]\[%{DATA:shard}\]%{GREEDYDATA:kv_pairs}" ]
      }
      kv {    # parses named fields
        source => "kv_pairs"
        field_split => " \],"
        value_split => "\["
      }
    }

    date {  # use timestamp from the log
      match => [ "timestamp", "yyyy-MM-dd HH:mm:ss,SSS" ]
      target => "@timestamp"
    }

    mutate {
      remove_field => [ "kv_pairs", "timestamp" ]  # remove unused stuff
      rename => {  # nicer field names (especially good for Logsene)
        "source" => "source_body"
        "log_source" => "source"
      }
      convert => {  # type numeric fields (they're strings by default)
        "took_millis" => "integer"
        "total_shards" => "integer"
        "shard" => "integer"
      }
    }
  }
}

output {
  elasticsearch {   # send everything to Logsene
    host => "logsene-receiver.sematext.com"
    ssl => true  # works with Logstash 1.5+
    port => 443  # use 80 for plain HTTP
    index => "LOGSENE-APP-TOKEN-GOES-HERE"  # fill in your token (click Integration from your Logsene app)
    protocol => "http"
    manage_template => false
  }
}

Sematext is a globally distributed organization that builds innovative Cloud and On Premises solutions for performance monitoring, alerting and anomaly detection (SPM), log management and analytics (Logsene), and search analytics (SSA). We also provide Search and Big Data consulting services and offer 24/7 production support for Solr and Elasticsearch.
