Welcome!

@DevOpsSummit Authors: Liz McMillan, Pat Romanski, Stackify Blog, Elizabeth White, Jyoti Bansal

Related Topics: @DevOpsSummit, Linux Containers

@DevOpsSummit: Blog Feed Post

Elasticsearch Mapping and JSON Logs By @Sematext | @DevOpsSummit [#DevOps]

By default, Elasticsearch does a good job of figuring the type of data in each field of your logs

Using Elasticsearch Mapping Types to Handle Different JSON Logs
by Radu Gheorghe

By default, Elasticsearch does a good job of figuring the type of data in each field of your logs. But if you like your logs structured like we do, you probably want more control over how they’re indexed: is time_elapsed an integer or a float? Do you want your tags analyzed so you can search for big in big data? Or do you need it not_analyzed, so you can show top tags via the terms aggregation? Or maybe both?

In this post, we’ll look at how to use index templates to manage multiple types of logs across multiple indices. Also, we’ll explain how to use logging tools (such as Logstash and rsyslog) to handle JSON logging and specify types.

Elasticsearch Mapping and Logs
As you may already know, to control these things in Elasticsearch you’ll need to define a mapping. This works similarly in Logsene, our log analytics SaaS, because it uses Elasticsearch and exposes its API.

With logs you’ll probably use time-based indices, because they scale better (in Logsene, for instance, you get daily indices). That said, to make sure the mapping you define today applies to the index you create tomorrow, you need to define it in an index template.

Managing Multiple Types
Mappings provide a nice abstraction when you have to deal with multiple types of structured data. Let’s say you have two apps generating logs of different structures: both have a timestamp field, but one recording logins has a user field, and another one recording purchases has an amount field.

To deal with this, you can define the timestamp field in the _default_ mapping which applies to all types. Then, in each type’s own mapping we’ll define fields unique to that mapping. The following snippet is an example that works with Logsene, provided that aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee is your Logsene app token. If you roll your own Elasticsearch, you can use whichever name you want, and make sure the template applies to your index pattern.

curl -XPUT 'logsene-receiver.sematext.com/_template/aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee_MyTemplate' -d '{
"template" : "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee*",
"order" : 21,
"mappings" : {
"_default_" : {
"properties" : {
"timestamp" : { "type" : "date" }
}
},
"firstapp" : {
"properties" : {
"user" : { "type" : "string" }
}
},
"secondapp" : {
"properties" : {
"amount" : { "type" : "long" }
}
}
}
}'

Sending JSON Logs to Specific Types
When you send a document to Elasticsearch by using the API, you have to provide an index and a type. You can use an Elasticsearch client for your preferred language to log directly to Elasticsearch or Logsene this way. But I wouldn’t recommend this, because then you’d have to manage things like buffering if the destination is unreachable.

Instead, I’d keep my logging simple and use a specialized logging tool, such as Logstash or rsyslog to do the hard work for me. Logging to a file is usually the easiest option. It’s local, and you can have your logging tool tail the file and send contents over the network. I usually prefer sockets (like syslog) because they let me configure Logstash/rsyslog to:

  • Write events in a human format to a local file I can tail if I need to (usually in development)
  • Forward logs without hitting disk if I need to (usually in production)


Whatever you prefer, I think writing to local files or sockets is better than sending logs over the network from your application. Unless you’re willing to do a reliability trade-off and use UDP, which gets rid of most complexities.

Opinions aside, here’s a Logstash configuration for tailing a file with JSON logs separated by a newline. Here’s how you’d send those documents to Logsene via the Elasticsearch API:

input {
file {
path => "/var/log/test"
codec => "json"
}
}

output {
elasticsearch {
host => "logsene-receiver.sematext.com"
port => 80
index => "LOGSENE-APP-TOKEN-GOES-HERE"
index_type => "fileapp"
protocol => "http"
manage_template => false
}
}

Note how the JSON codec does the parsing here, instead of the more expensive and maintenance-heavy approach with grok that we’ve shown in an earlier post on getting started with Logstash. Some applications let you configure the log format, so you can make them write JSON (Apache httpd, for example).

If you want to send JSON over syslog, there’s the JSON-over-syslog (CEE) format that we detailed in a previous post. You can use rsyslog’s JSON parser module to take your structured logs and forward them to Logsene:

module(load="imuxsock") # can listen to local syslog socket
module(load="omelasticsearch") # can forward to Elasticsearch
module(load="mmjsonparse") # can parse JSON

action(type="mmjsonparse") # parse CEE-formatted messages

template(name="syslog-cee" type="list") { # Elasticsearch documents will contain   property(name="$!all-json") # all JSON fields that were parsed
}

action(
type="omelasticsearch"
template="syslog-cee" # use the template defined earlier
server="logsene-receiver.sematext.com"
serverport="80"
searchType="syslogapp"
searchIndex="LOGSENE-APP-TOKEN-GOES-HERE"
bulkmode="on" # send logs in batches
queue.dequeuebatchsize="1000" # of up to 1000
action.resumeretrycount="-1" # retry indefinitely (buffer) if destination is unreachable
)

To send a CEE-formatted syslog, you can run logger [email protected]: {“amount”: 50}’ for example. Rsyslog would forward this JSON to Elasticsearch or Logsene via HTTP. Note that Logsene also supports CEE-formatted JSON over syslog out of the box if you want to use a syslog protocol instead of the Elasticsearch API.

Filtering by Type
Once your logs are in, you can filter them by type (via the _type field) in Kibana:

Type Filtering with Kibana

However, if you want more refined filtering by source, we suggest using a separate field for storing the application name. This can be useful when you have different applications using the same logging format. For example, both crond and postfix use plain syslog.

If you’re looking for a place to send your logs to, check out Logsene!

Filed under: Logging, Search Tagged: elasticsearch, JSON, JSON logging, kibana, logging, logstash, mapping, mmjsonparse, omelasticsearch, rsyslog, structured logging, templates

Read the original blog entry...

More Stories By Sematext Blog

Sematext is a globally distributed organization that builds innovative Cloud and On Premises solutions for performance monitoring, alerting and anomaly detection (SPM), log management and analytics (Logsene), and search analytics (SSA). We also provide Search and Big Data consulting services and offer 24/7 production support for Solr and Elasticsearch.

@DevOpsSummit Stories
SYS-CON Events announced today that Twistlock, the leading provider of cloud container security solutions, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Twistlock is the industry's first enterprise security suite for container security. Twistlock's technology addresses risks on the host and within the application of the container, enabling enterprises to consistently enforce security policies, monitor and audit activity and identify and isolate threats in a container or cluster of containers.
SYS-CON Events announced today that Outscale, a global pure play Infrastructure as a Service provider and strategic partner of Dassault Systèmes, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Founded in 2010, Outscale simplifies infrastructure complexities and boosts the business agility of its customers. Outscale delivers a secure, reliable and industrial strength solution for its customers, which include solopreneurs, fast-growing startups, large multi-national companies and government agencies. The company is a strategic partner to Dassault Systémes, and today powers hundreds of organizations throughout North America, Europe and Asia. Outscale’s U.S. headquarters is located in Jersey City, New Jersey, and its global headquarters is in Saint-Cloud, France.
SYS-CON Events announced today that Peak 10, Inc., a national IT infrastructure and cloud services provider, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Peak 10 provides reliable, tailored data center and network services, cloud and managed services. Its solutions are designed to scale and adapt to customers’ changing business needs, enabling them to lower costs, improve performance and focus internal resources on core competencies.
While some vendors scramble to create and sell you a fancy solution for monitoring your spanking new Amazon Lambdas, hear how you can do it on the cheap using just built-in Java APIs yourself. By exploiting a little-known fact that Lambdas aren’t exactly single threaded, you can effectively identify hot spots in your serverless code. In his session at 20th Cloud Expo, David Martin, Principal Product Owner at CA Technologies, will give a live demonstration and code walkthrough, showing how to overcome the challenges of monitoring S3 and RDS. This presentation will provide an overview of necessary Amazon Lambda concepts and discuss how to integrate the monitoring data with other tools.
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm. In his Day 3 Keynote at 20th Cloud Expo, Chris Brown, a Solutions Marketing Manager at Nutanix, will explore the ways that Nutanix technologies empower teams to react faster than ever before and connect teams in ways that were either too complex or simply impossible with traditional infrastructures.
This talk centers around how to automate best practices in a multi-/hybrid-cloud world based on our work with customers like GE, Discovery Communications and Fannie Mae. Today’s enterprises are reaping the benefits of cloud computing, but also discovering many risks and challenges. In the age of DevOps and the decentralization of IT, it’s easy to over-provision resources, forget that instances are running, or unintentionally expose vulnerabilities.
SYS-CON Events announced today that SoftLayer, an IBM Company, has been named “Gold Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York, New York. SoftLayer, an IBM Company, provides cloud infrastructure as a service from a growing number of data centers and network points of presence around the world. SoftLayer’s customers range from Web startups to global enterprises.
Regardless of what business you’re in, it’s increasingly a software-driven business. Consumers’ rising expectations for connected digital and physical experiences are driving what some are calling the "Customer Experience Challenge.” In his session at @DevOpsSummit at 20th Cloud Expo, Marco Morales, Director of Global Solutions at CollabNet, will discuss how organizations are increasingly adopting a discipline of Value Stream Mapping to ensure that the software they are producing is poised to offer continuous improvements to customers’ experience of products and services.
Five years ago development was seen as a dead-end career, now it’s anything but – with an explosion in mobile and IoT initiatives increasing the demand for skilled engineers. But apart from having a ready supply of great coders, what constitutes true ‘DevOps Royalty’? It’ll be the ability to craft resilient architectures, supportability, security everywhere across the software lifecycle. In his keynote at @DevOpsSummit at 20th Cloud Expo, Jeffrey Scheaffer, GM and SVP, Continuous Delivery Business Unit at CA Technologies, will share his vision about the true ‘DevOps Royalty’ and how it will take a new breed of digital cloud craftsman, architecting new platforms with a new set of tools to achieve it. He will also present a number of important insights and findings from a recent cloud and DevOps study – outlining the synergies high performance teams are exploiting to gain significant busin...
SYS-CON Events announced today that delaPlex will exhibit at SYS-CON's @CloudExpo, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. delaPlex pioneered Software Development as a Service (SDaaS), which provides scalable resources to build, test, and deploy software. It’s a fast and more reliable way to develop a new product or expand your in-house team.
SYS-CON Events announced today that delaPlex will exhibit at SYS-CON's @ThingsExpo, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. delaPlex pioneered Software Development as a Service (SDaaS), which provides scalable resources to build, test, and deploy software. It’s a fast and more reliable way to develop a new product or expand your in-house team.
SYS-CON Events announced today that Carbonite will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Carbonite protects your entire IT footprint with the right level of protection for each workload, ensuring lower costs and dependable solutions with DoubleTake and Evault.
SYS-CON Events announced today that EARP Integration will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. EARP Integration is a passionate software house. Since its inception in 2009 the company successfully delivers smart solutions for cities and factories that start their digital transformation. EARP provides bespoke solutions like, for example, advanced enterprise portals, business intelligence systems and mobile applications for international enterprises across different sectors such as Energy and Utilities, GreenTech, MedTech, FinTech, Facility Management and Housing, Automotive Manufacturing, and Sport. EARP also cooperates with international software houses by providing them with highly qualified and well-selected, multilingual teams for bigger projects.
We build IoT infrastructure products - when you have to integrate different devices, different systems and cloud you have to build an application to do that but we eliminate the need to build an application. Our products can integrate any device, any system, any cloud regardless of protocol," explained Peter Jung, Chief Product Officer at Pulzze Systems, in this SYS-CON.tv interview at @ThingsExpo, held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA
SYS-CON Events announced today that Outscale will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Outscale's technology makes an automated and adaptable Cloud available to businesses, supporting them in the most complex IT projects while controlling their operational aspects. You boost your IT infrastructure's reactivity, with request responses that only take a few seconds.
SYS-CON Events announced today that Progress, a global leader in application development, has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Enterprises today are rapidly adopting the cloud, while continuing to retain business-critical/sensitive data inside the firewall. This is creating two separate data silos – one inside the firewall and the other outside the firewall. Cloud ISVs often get requests to connect these silos using technologies such as VPN; however, these tend to be difficult to manage and are not engineered for accessing business data from the cloud.
One of the biggest challenges with adopting a DevOps mentality is: new applications are easily adapted to cloud-native, microservice-based, or containerized architectures - they can be built for them - but old applications need complex refactoring. On the other hand, these new technologies can require relearning or adapting new, oftentimes more complex, methodologies and tools to be ready for production. In his general session at @DevOpsSummit at 20th Cloud Expo, Chris Brown, Solutions Marketing Manager at Nutanix, will explore how Nutanix is bringing these sides together - agility for pets, governance for cattle - in a single unified platform. With this combined approach, Nutanix removes silos - both technological and human - propelling your applications to a new level.
DevOps at Cloud Expo – being held October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's largest enterprises – and delivering real results. Among the proven benefits, DevOps is correlated with 20% faster time-to-market, 22% improvement in quality, and 18% reduction in dev and ops costs, according to research firm Vanson-Bourne. It is changing the way IT works, how businesses interact with customers, and how organizations are buying, building, and delivering software.
In his opening keynote at 20th Cloud Expo, Michael Maximilien, Research Scientist, Architect, and Engineer at IBM, will motivate why realizing the full potential of the cloud and social data requires artificial intelligence. By mixing Cloud Foundry and the rich set of Watson services, IBM's Bluemix is the best cloud operating system for enterprises today, providing rapid development and deployment of applications that can take advantage of the rich catalog of Watson services to help drive insights from the vast trove of private and public data available to enterprises.
As cloud adoption continues to transform business, today's global enterprises are challenged with managing a growing amount of information living outside of the data center. The rapid adoption of IoT and increasingly mobile workforce are exacerbating the problem. Ensuring secure data sharing and efficient backup poses capacity and bandwidth considerations as well as policy and regulatory compliance issues.
SYS-CON Events announced today that Cloud Academy will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Cloud Academy is the industry’s most innovative, vendor-neutral cloud technology training platform. Cloud Academy provides continuous learning solutions for individuals and enterprise teams for Amazon Web Services, Microsoft Azure, Google Cloud Platform, and the most popular cloud computing technologies. Get certified, manage the full lifecycle of your cloud-based resources, and build your knowledge based using Cloud Academy’s expert-created content, comprehensive Learning Paths, and innovative Hands-on Labs.
The 21st International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Digital Transformation, Machine Learning and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speaking proposal today!
With major technology companies and startups seriously embracing Cloud strategies, now is the perfect time to attend @CloudExpo | @ThingsExpo, June 6-8, 2017, at the Javits Center in New York City, NY and October 31 - November 2, 2017, Santa Clara Convention Center, CA. Learn what is going on, contribute to the discussions, and ensure that your enterprise is on the right path to Digital Transformation.
SYS-CON Events announced today that Progress, a global leader in application development, has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Enterprises today are rapidly adopting the cloud, while continuing to retain business-critical/sensitive data inside the firewall. This is creating two separate data silos – one inside the firewall and the other outside the firewall. Cloud ISVs often get requests to connect these silos using technologies such as VPN; however, these tend to be difficult to manage and are not engineered for accessing business data from the cloud.
SYS-CON Events announced today that Interoute has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Interoute is the owner operator of Europe's largest network and a global cloud services platform, which encompasses over 70,000 km of lit fiber, 15 data centers, 17 virtual data centers and 33 colocation centers, with connections to 195 additional partner data centers. Our full-service Unified ICT platform serves startups and international enterprises, as well as every major European telecommunications service provider and major operators across the world, global internet giants, governments and universities.