Spotting problems and fixing them before our customers notice


Just over a year ago we were set the challenge of ‘spotting problems and fixing them before our customers notice’. To spot the problems, we knew we had to change our behavior: instead of only opening our logs when an issue was suspected, we needed to make them available to our engineers at the touch of a button, so that they could easily monitor their applications, spot problems and fix them.

Ash Powell’s blog, ‘What the ELK!? – Log Aggregation’, explains how we did this, and here’s a short film we’ve just completed with Elastic to explain our business and ELK’s role in it:

Keeping trainline on track

Challenges

Implementing ELK hasn’t all been straightforward. Getting up and running was easy enough; the challenge has been to scale and to find the best way to collect the logs from the application servers. We’ve recently moved our main platform into Amazon’s cloud, AWS, and doing so has highlighted that we need a more efficient way to deploy ELK, and in particular Logstash, the agent that collects the logs, so that we keep compute resource use as low as we can.

Elastic are giving us support with this, and are also encouraging us to consider their new managed ‘Cloud’ (aka ‘Found’) service. That would mean subscribing to ELK as a service they manage, leaving us to get on with what we’re good at: helping customers make smarter journeys and selling train tickets.

Beyond spotting problems…

With spotting problems covered, or at least the foundations in place for product teams to build on, I’ve moved on as project manager to work on refreshing our phone system. Personally, though, I’m really excited about using the product teams’ drive to collect logs, along with features in Kibana (the K in ELK), to map our customers’ journeys. I committed to this in my blog, ‘An idea to bring our business alive with maps’.

We’re assembling a new team of data gurus here at trainline who I’m hoping will help me with this, together with our ‘Routemaster’ team, who produce our customers’ journey plans. Our first data scientist joined the company a few weeks ago, and Mark Harwood (@elasticmark), a developer at Elastic who’s done similar work for the Dept for Transport (DoT), has agreed to present at one of our Burritos. Add a tip-off from Danny (@asyncable_ship) about track mapping information in one of the rail APIs, and we’ve got the right ingredients, so I hope to be blogging with news of some success before long.

Kit Reynolds
@kitreno

Further reading – Keeping trainline on track on elastic.co

 
