MWD0701: Log Management with ELK


In our series around modern web development, I’d like to touch on a vital component in the production pipeline, sitting in the area of debugging and monitoring (the MWD07 chapter), and that is Log management. Too often is this overlooked by most seasoned developers and dev managers, and that’s a real shame, because at all stages of the application life cycle Logs are a goldmine!

Obviously first and foremost for debugging purposes, at development and testing stages. But also later on, and once the application is in production, for performance monitoring, bug fixing purposes, and simply for usage analytics. There are a lot of logs available in a web stack, not to mention those that you will create and populate ad hoc for the verbose logging and overall auditability of your application: System logs, web server logs (access and errors), database logs, default framework-level logs (such as those you’ll get in Zend framework or Symfony for instance in the PHP arena), postfix and other mail logs, etc. All these deserve proper handling, rotation, storage and data-mining.

In my past life in agency-land, I had the opportunity to play with a variety of web log analysers such as AWStats, Webtrends and alike. I also used with reasonable success the community version of Splunk, and back then it seriously helped tracing back a couple of server hacks, but also providing custom stats around web campaigns to hungry marketers.

Now that I am working on one main web application with my current employer, I have been looking for a robust and sustainable solution to manage logs. And while looking along the lines of Logstash, a tool I used previously for a Java platform, I have discovered the new comprehensive solution now known as the ELK platform.

ELK stands for Elastic Search + Logstash + Kibana

Elastic Search has been around for a while, as a real-time search and analytics tool based on Lucene. Recently funded with a $70M C-round (press release), the company has undertaken the ambitious “Mission of Making it Simple for Businesses Worldwide to Obtain Meaningful Insights from Data”. Nothing less.

Logstash is this nice piece of software started 5 years ago, and maintained since then, by Jordan Sissel, a cheerful fellow developer also guilty some other nice nifty little utilities, such as the hand FPM. Logstash helps you take logs and other event data from your systems and store them in a central place. It is now commercially supported by ElasticSearch and Jordan Sissel has also joined the team.

And finally Kibana is a web fronted to visualise logs and time-stamped data. Produced by the vibrant Logstash community, and contributed in particular by early committer Rashid Khan, it is now commercially supported by Elastic Search as well, as the preferred visualisation and washboarding tool for Logstash and Elastic Search.


So how does it work? Well the diagram above will give you the gist of it:

  • Logstash processes log files as inputs, applies codecs and filters to it (note the amazing Grok library used as a middleware for reggae patterns) and spits out output files, including specific support for Elastic Search.
  • Elastic Search consumes Logstash outputs and generates search indexes.
  • Kibana offers the user-friendly interface anyone expects to build business-enabling reports and dashboards.
Sample Dashboard in Kibana 3

Sample Dashboard in Kibana 3

To get the full picture of the solution, there’s probably no better preacher than the creator himself, Jordan Sissel, who has been a faithful contributor at PuppetConf for the last 3 years, check out these Youtube recordings:

Useful links: