Alexa Vass

Sep 10, 2020


I have been a lifehacker since I was knee-high to a grasshopper. I always wanted to automate things in my life, so I decided to learn IT. Some of my routine work took too much time, and I kept asking myself why there was no machine for those things. Maybe that’s why my favorite topic in my profession is work that could be done smarter, aka toils.

“Toil is the kind of work tied to running a production service that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows.” [Site Reliability Engineering, 2016, p.49]

If you are bored, or if you feel you would rather not do something, that thing could be a toil. If you think it could be a script, a job in your CI, or a step in your review/build job, that’s a hint for automation. My magic number here is 3. If something comes up 3 times, I start to think about universal solutions, automation, self-service, or in the worst case documentation that can lead to a solution.
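The "magic number 3" habit can be sketched as a tiny log that counts how often a manual task comes up. This is only an illustration: the ToilLog class, its method names, and the example task names are hypothetical, and the threshold of 3 is simply the number mentioned above.

```python
from collections import Counter

# Assumption: three occurrences is the trigger, following the "magic number" above.
TOIL_THRESHOLD = 3


class ToilLog:
    """Counts how often each manual task comes up (hypothetical helper)."""

    def __init__(self, threshold=TOIL_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()

    def record(self, task):
        """Record one occurrence; return True once the task has reached the threshold."""
        key = task.strip().lower()  # treat "Rotate API key " and "rotate api key" as the same task
        self.counts[key] += 1
        return self.counts[key] >= self.threshold

    def candidates(self):
        """Tasks seen at least `threshold` times, most frequent first."""
        return [task for task, n in self.counts.most_common() if n >= self.threshold]


if __name__ == "__main__":
    log = ToilLog()
    # Hypothetical task names, only for the demonstration.
    for task in ["rotate API key", "restart worker", "rotate API key", "Rotate API key "]:
        if log.record(task):
            print(f"Time to think about automating: {task.strip()}")
```

Even a plain text file with tally marks does the same job. The point is to notice the third occurrence instead of silently doing the task a fourth time.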

Toils are bad, mkay?

Toils are terrible things for engineers. They are boring, they keep coming back, they waste a lot of time, and when you face them you want to run away. Also, too much routine work can lead to burnout, and burnout leads to giving a fortune to a psychologist.

As the book Seeking SRE specifies, the results of toils can be:

“Discontent and a lack of feeling of accomplishment, Burnout, More errors, leading to time-consuming rework to fix, no time to learn new skills, career stagnation (hurt by lack of opportunity to deliver value-adding projects).” [Seeking SRE, 2018, p.149]

I want to point to the “no time to learn new skills” part too, because it is a loss not just for the engineer but for the company as well. If the engineers do not improve, their solutions will get worse and worse, and they may leave in the hope of gaining experience and a more exciting job at another company.

Moreover, Seeking SRE gathered the disadvantages of toils for a company too:

“Constant shortages of team, excessive operational support costs, inability to make progress on strategic initiatives (the “everybody is busy, but nothing is getting done” syndrome), inability to train top talent (and acquire top talent after word gets out about how the organization functions).” [Seeking SRE, 2018, p.150]

You may think: ‘Yeah, but the company says “I pay you for doing this, why do you complain about it?”’ But think about it. Have you ever been to a pub, drinking beer with your friends? Have they asked you how work is going? And did they apply for a job at your company afterwards?

How toils are made

I have been in several situations where toils were produced, and I noticed some key circumstances in which they are born mindlessly. I think some of the causes are the following:

Constant hurrying: This can happen when work is always urgent. It can be caused by a lack of processes, or by planning too much work for too short a time frame, which in turn comes from a lack of measurement. In these cases there is no time to think about other solutions; the job just has to get done as soon as possible, so you do the pure drudgery.

Doing everything as it used to be done: Typical routine work. Sometimes people do not realize what they are doing. “Well, I just did what I had to do/what I was asked for.” Yet the work could be handed over to the requestor, e.g. with the help of a form and some scripts, or automated with a script.

Minimum viable product requested: Sometimes, if you think a little bit more and put in more engineering work, you can make these types of requests cheaper and also build a better, more stable architecture. Sometimes a stable system is a better choice than a minimal but poorly designed one.

Not thinking ahead or lack of engineering work: Thinking in the long term can lead to better results, like more stable systems, less paging, less clickety-click work, and more time to deal with real problems. Also, looking for up-to-date solutions can sometimes pay off: it can make the work more exciting and you can improve yourself.

Toil that depends on other teams or is caused by other teams: That’s the worst of all, because it may be not just your problem but other teams’ too. This type of toil is also the hardest to eliminate. It can only be solved with a lot of communication between teams, making clear what bugs you, followed by strong cooperation.

All in all,

beware of toils. They are bad for you and bad for the company. And if you find one, eliminate it if you can.