Whenever I read about DevOps (which admittedly should be more often, but my job keeps me challenged in the office and my kids keep me challenged at home, so when I get downtime I prefer to switch off) it’s pretty much always written by those from a Development background – we all know that the DevOps movement has its roots in Development, but how rewarding can it be for the Operations guys? (hint: very)
On behalf of Trainline I will be presenting at this year’s Continuous LifeCycle Conference in London on ‘When DevOps goes wrong’, so ahead of that I thought it would be a good idea to do a series of posts which give some background to my talk.
Note: DevOps at Trainline is not just about automating operations tasks; it’s about pushing ownership of the code, and of the infrastructure it runs on, to the developers as much as possible. This means the whole range from discovery-development-deployment-live-monitoring/alerting-decommission. Not insignificant when you consider we have 27m web visitors per month, over 4.5m App visits per month and pull in 1.6bn in revenue per annum…
For an operations manager, DevOps is a massive mental shift. It means giving up control (bad word I know) and learning to live with a greater risk profile (cue arguments that Agile, shorter iterations, Continuous Delivery etc. actually reduce risk). That is not easy for those of us who have spent many years cutting our teeth on supporting enterprise systems and have the scars to prove it – mine come from 9 months of being the only person on call for a brand new transactional system that ran 24×7…
When I joined Trainline 3.5 years ago, I started as the Major Incident Manager. I had come from being the Head of Technical Operations at a smaller company, where I was bored and needed to do something new. At the time we were very traditional in that we had project-based development teams with infrastructure and application support outsourced. I will spare you the pain of those years and the ups and downs of transition, but safe to say we now have Product Aligned Development teams with full ownership, an in-house hot-shot Infrastructure team that is more focused on IaaS (through AWS) than Tin, and a scaled-down ‘first responder’ team that has representatives in each Dev team. In our view, this is not yet DevOps but we are a lot closer than many other companies.
So, as an operations and infrastructure guy, how has this been for me? In all honesty – challenging (by which I mean hard, but not always in a bad way), frustrating but ultimately rewarding. As (now) Head of IS Operations, I still have ownership of the overall Availability, Reliability and Performance of the Trainline platform but I have no empire to rule from (quick stats: we used to have around 35 people doing Application Support alone, this is now down to 7, plus 6 others embedded in Dev teams). This means I manage through influence and relationships rather than command and control.
This has worked well and as an engineering department (which includes the Infra and Support teams), we have learnt a lot. We deliver value to our customers much quicker (months to days) and play a large part in the continued growth of the company. However, it has not been all plain sailing, and some of the fears that operations managers have about letting developers loose on production systems have been realised. The purpose of my talk in May is to highlight these as a lesson for others, but also to show those in operations roles that DevOps is not something to be scared of, but something to be embraced. Mistakes happen, but sometimes you have to fix forward rather than roll back.
Next time I will delve into the headline topics with more Clickbait titles like ‘When is an outage not regarded as an outage and other monitoring stories’, ‘The day IIS Reset was found in a deployment script’, ‘Patching, enough said’ and ‘How I annoyed my operations team with procrastination’.
I hope you will enjoy the series and attend the live version in May.