Deployment Agility with Air-Traffic Control


From Change Control to Assumed Approval: how we first managed the operational visibility of Continuous Delivery and how it’s still in use 2 years later

Trainline has changed in many ways over the last 2½ years and, as a 4-year veteran, I have been ideally placed to watch and help enable that change. One of the big changes was the move from a project-led to a product-led organisation. That brings many things with it, one of which is Continuous Delivery (CD). Its advantages are well known, and one excellent stat produced recently showed that we have achieved a:

122-fold improvement in deployment agility!


Dockerize your WebdriverIO environment to run everywhere

With functional tests being an integral part of a webapp workflow, we should always try to find ways to make them run smoother and make our lives easier.

My dilemma

I’ve been working with Selenium Webdriver/WebdriverIO for years now, but I have always thought: “Wouldn’t it be great not to need a Selenium server running before starting my tests?”

This may seem like a minor problem, but it means keeping another tab open in my terminal and starting and stopping that process, and it all gets far more difficult when you try to automate it in a CI environment. In addition, you need Java installed to run the Selenium server (or you can use the selenium-standalone npm package, which removes the dependency on Java but still needs to be started and stopped).

Trainline Environment Manager – Now Open Source!

Background

Trainline is Europe’s leading independent rail ticket retailer, selling £2.3bn of tickets per year and enabling our customers to make more than 100,000 smarter journeys every day. We have 150 development staff who are constantly improving our user experiences, and our need to innovate means that we cannot allow the underlying infrastructure to be a constraint on time to market.

This desire for infrastructure agility recently led us to migrate 100% of our Development, Staging, UAT and Production environments from a legacy private data centre to Amazon’s public cloud. Simply lifting and shifting components into the cloud would have improved agility somewhat, but for us this was just the starting point.

Supercharging your Production Monitoring

At Trainline, our development teams have come a long way in how we work. Here are just a few of the things that have completely changed in the last couple of years: moving to continuous delivery, a massive increase in automated testing, new infrastructure, blue-green deployments, load balancing, alerting and monitoring.

What has taken my interest in the last year is the extent of the monitoring we now have available, and how we need to choose what to look at and what to ignore.

DevOps when you are not a Dev (@Trainline)

Whenever I read about DevOps (which admittedly should be more often, but my job keeps me challenged in the office and my kids keep me challenged at home, so when I get downtime I prefer to switch off), it’s pretty much always written by those from a Development background. We all know that the DevOps movement has its roots in Development, but how rewarding can it be for the Operations guys? (Hint: very.)

New Relic in action at trainline #futurestack

Toward the end of last year I was invited to present at New Relic’s FutureStack conference in their Hacker Lounge track. I had a great time, and it was a pleasure to introduce trainline to those across the pond and further afield.

All the sessions were filmed and mine has made it onto YouTube. The talk covered how we at the trainline adopted and scaled the use of New Relic across the entire organisation and the value we’re getting out of it, including how team KPIs are judged on some of the metrics New Relic gives us.

Video

Paul Kiddie – New Relic in action at trainline

Keeping New Relic new

Paul (@pkiddie) wrote a great blog post about how we’ve made our customers, and our engineers, happier using New Relic. We were also given the opportunity to present some of these ideas at a recent New Relic User Group meetup in London.

New Relic London Meetup May 2015 trainline slides.

With New Relic ticking along nicely, and dashboards up next to our product teams so they can see the error rate, revenue and response times, we wanted to make sure we kept up to date with the latest features from New Relic. (Those updates have been arriving up to twice a month for the .NET APM agent we’re using.) As a cloud-hosted platform, New Relic is continuously updated and improved, and we want to get the most out of our investment in it by adopting each release promptly. It also makes our product teams happy when they get to play around with the latest features.

Scripting Language and Region settings for current user, default user and welcome screen

At first glance, this seems like a simple and straightforward task: set the language and region for the current user, default user and welcome screen using command-line tools. It’s easily done with the GUI, and Microsoft like to ensure that anything that can be done in the GUI can also be done in PowerShell, so this should be easy, no? Sadly, no.


What the ELK!? – Log Aggregation

Everyone loves logs, right?

No…

Logs are long, complex and full of useless information, and it takes ages to find the one error message you need to fix a problem. So if you’re working with over 100 servers and getting over 200 GB of logs a day, how can you get through your logs to find the real information inside?


Building AMIs with Packer

During the planning stages of our migration to AWS, we identified the need to create custom images (AMIs) as the base for new instances. While we are relatively experienced with Chef, we found that running Chef at instance launch time took much longer than was acceptable. Creating custom AMIs that are preconfigured (known as baking) allowed us to shift the heavy lifting from instance launch time to an earlier, out-of-band time.

In designing this process, we came up with multiple goals – we needed to have a reliable, repeatable, auditable and tested process with a fast spin-up time. This post explores our recent infrastructure automation efforts in this area.
