In this second of three Logstash posts on how to get logs into a readable format, I will talk about the basic principles.
Elements automatically provided by the file plugin
The interesting ones are message, which holds the raw log line, and @timestamp, which records when the entry was read. There are others, such as host and path.
Thus, just by using the file plugin, you will get your log entry returned in message, and @timestamp will be set to the time the log entry was read into the system by the input plugin.
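A minimal sketch of such a file input (the path here is hypothetical, stand in your own log file):

```
input {
  file {
    path => "/var/log/myapp/app.log"    # hypothetical log file to tail
    start_position => "beginning"       # read from the start rather than only new lines
  }
}
```

Each line read from the file becomes an event with its content in message.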
The date plugin explained
If you need to replay logs then you really want to see them in Kibana using the date when the log entry was generated, not when it was (re)read into the logging system. Thus you need to overwrite @timestamp, and the date plugin does this.
I found its syntax a bit cryptic. It is:

date {
  match => [ "logdate", "ISO8601" ]
  locale => "en"
  remove_field => [ "logdate" ]
}
This is saying that there is a pre-existing field called logdate, already extracted from the log entry, that contains the time the log entry occurred. This date happens to be in ISO8601 format, and the locale variant in use is "en".
This allows Logstash to parse the value, which it then uses to overwrite the standard field @timestamp.
The format of the date must exactly match the format in which the datetime is stored. For example, when parsing an IIS log I had a date of 15-12-10 13:33:20.113 and I tried to match it with "YY-MM-dd HH:mm:ss" and it failed; the format needed to be "YY-MM-dd HH:mm:ss.SSS".
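A sketch of the corrected filter for that IIS timestamp, assuming the extracted field is called logdate:

```
date {
  match => [ "logdate", "YY-MM-dd HH:mm:ss.SSS" ]
}
```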
Since we now have the relevant datetime stored away in @timestamp, we may as well delete logdate (or whatever field you stored the log-creation datetime in), as it now duplicates information and takes up space.
The caps and lower case in date formats explained
I was confused as to why some of the above date format was in upper case and some in lower case. The Java SimpleDateFormat documentation explains it: http://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html
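As a quick reference for the pattern used above (case matters for every letter):

```
YY  - year
MM  - month        (mm would be minutes)
dd  - day of month
HH  - hour of day, 0-23  (hh would be 1-12)
mm  - minutes
ss  - seconds
SSS - milliseconds
```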
Our logs are created in JSON format and so we use the JSON filter plugin to parse the log entry.
This was another filter I found confusing until it was explained, and then it was really obvious:
json {
  source => "message"
  remove_field => [ "message" ]
}
The json plugin needs to be told what to parse, so its source parameter is set to the message field that was created by the file plugin.
It will read that message and break out every JSON element within it into a separate field. Once we have a breakout of every field we no longer need the aggregated message, so we can remove it.
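To illustrate with a hypothetical log line, the field names below are examples only:

```
# Incoming event:
#   message = {"date":"2015-12-10T13:33:20.113Z","level":"INFO","user":"bob"}
#
# After the json filter, each element becomes its own field:
#   date  = "2015-12-10T13:33:20.113Z"
#   level = "INFO"
#   user  = "bob"
```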
What made life confusing for us was that in the json message, the logged date was called “date” and there was an element called “message”.
Thus we had:

date {
  match => [ "date", "ISO8601" ]
  locale => "en"
  remove_field => [ "date" ]
}
The "date" in the match comes from the JSON field in the message and is coincidentally the same name as the date plugin itself.
Similarly, the "message" in the json command is the message created by the file plugin. However, since we had a JSON element also called message, it overrode the file-created message once the json plugin had done its job, so if we had left in remove_field => "message" it would have removed that text as well.
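Putting the pieces together, a minimal sketch of the whole pipeline as described above (the file path is hypothetical):

```
input {
  file {
    path => "/var/log/myapp/app.log"    # hypothetical JSON-formatted log file
  }
}

filter {
  # Break out every JSON element of the log line into its own field.
  json {
    source => "message"
  }
  # Use the log entry's own "date" element as the event timestamp.
  date {
    match => [ "date", "ISO8601" ]
    locale => "en"
    remove_field => [ "date" ]
  }
}
```

Note that remove_field => "message" is deliberately left out of the json filter here, so the message element carried inside the JSON is preserved.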