Why Is Your Team Unable to Deliver on Time?
The Problem Statement
“Software never gets delivered on time,” is a complaint that I’ve heard many times from business leaders, product owners and technology leaders. Often this refrain is accompanied by the plaintive cry that “we used to be much better than we are”. Often they have tried many things to make the situation better but to no avail. Sometimes it even feels like the harder they try to improve things, the more resources they put into making deliveries work, the more they get delayed. How does this happen and what can we think about doing to help the situation?
Understanding “On Time”
Before discussing why the development teams are not delivering on time, it is important to discuss what “on time” means.
It is possible that development teams are not being “slow” at all. It could be that unreasonable expectations are being imposed upon them. It could be, for example, that delivery dates for products are mandated by higher management before development teams are even involved. Any organisation should beware of imposing the iron triangle on any development group as it will almost certainly lead to failure of a project or product.
It is tempting for any business to point to a development team and to blame it for late delivery. It is therefore hugely important that for “on time” to be understood, agreed and bought into by all of the interested parties. If a development team is being held accountable for a delivery date that it feels it did not agree to then it will almost certainly not be motivated to meet that date and may well fall back on a blame game because “we never agreed to this anyway!”
Developers and development teams feel more comfortable when asked to predict a timescale for something small whilst business focused people demand to know how long large things will take to deliver as they are often asked to plan and forecast for large periods of time, sometimes in the multiples of years. Resolving this tension is a subject in itself but in general, it should be understood that it isn’t possible, or valuable, for a development group to predict with certainty more than a few weeks into the future. Any definition of “on time” must be flexible enough to take this into account. Rather than hold teams accountable for slow delivery or poor estimates, it is much more valuable to find a way that will work more usefully for all interested parties.
Lack of People
It is possible that a development team could be failing to deliver on time because of a lack of people. Whilst we have known since the 1970s that simply adding extra people to a software development effort will not necessarily reduce the time to delivery it would be wrong to simply dismiss extra resources as a possible solution to slow delivery.
If your delivery team is responsible for different products or its work can be easily parallelised because it doesn’t have hard interdependencies, then it could be that the team simply has too much to do and could benefit from extra people. It should be relatively simple to understand if this is the problem with any given team. Beware though that adding extra people into a team could cause problems with sharing context and it might be better to use the available resources to create two new teams rather than expand the existing one.
Silos and Lack of Alignment
Many organisations naturally evolved around technical competences. This seems perfectly logical and works well up to a certain point. I worked for a startup from 2005 until 2015 during which time we grew from five technology focused people to around three hundred. Originally we had a CTO, two developers (of which I was one), a database person, a designer and three non-technical people (CEO, salesperson, and business manager). As we hired more people they fitted into what was a single cross functional team, although “cross functional” was never a phrase we used.
As we grew larger, it seemed natural for the database person to become the manager of the new database people and then later on the developers were split into two groups, one responsible for the website and one responsible for “everything that isn’t the website”. I was the leader of that second group. So we had by this stage three technology focused teams. This still worked reasonably well up until the point that we grew big enough so that it was no longer obvious and visible to everybody what everybody else was doing at any given point.
This was a strange thing to observe and I can’t say that it was obvious when the moment came. What I now know is that we ultimately started suffering from the malign effects of Conway’s law on our ability to deliver solutions to our users. The value stream was now fragmented between several teams, each with its own goals, priorities and incentives. We now had silos in our technology delivery capability meaning that we had handovers of work between non-aligned teams. The result of handovers is queuing time. Queuing time is waste, pure and simple. It doesn’t matter how “busy” or “utilised” different parts of the value stream are if there are handovers and queues, that queuing time will almost certainly be the biggest factor in your overall delivery time.
Lack of Knowledge, Experience or Expertise
Norm Kerth’s Agile Prime Directive should be read at the beginning of every retrospective meeting. It tells us that:
“Regardless of what we discover, we understand and truly believe that everyone did the best job they could, given what they knew at the time, their skills and abilities, the resources available, and the situation at hand.”
This is an important message and it should always be considered that a team may simply be lacking in “skills and abilities”. Do not discount this as a possible cause of the inability to deliver on time.
General Skills and Experience
Are the developers in our teams capable of delivering the outcomes that we are asking them to deliver? Do we have the necessary experience to understand how to own a delivery process? It is important to understand, articulate and align on the skills that are needed in every team and to have an open and honest conversation around whether the group contains those skills. If you find that your team does not contain the requisite skills or experience then that is the problem that should be tackled.
Do you have a very competent team that has insufficient knowledge of the domain in which it is being asked to work? It could also be possible that an individual in your team is holding all of the domain knowledge in a personal silo. In the former case, you need to examine what kind of product ownership of analysis capability you need to introduce and in the latter case, you will need to examine the mechanisms your teams use to share context internally. For example, are they pairing regularly on all aspects of the delivery, from story creation through development to deployment?
Many organisations, particularly more mature organisations, have control mechanisms in place that was originally designed to support an outcome in an entirely sensible fashion given the prevailing norms of technology and business at the time they were conceived. The problem with many such processes is that they no longer make sense in the modern world of Agile delivery and in some cases they can actively impede value creation. Such processes are sometimes known as Risk Management Theatre.
A good example of this is the traditional change management board or team. In the days when software was delivered on long cycles, perhaps years or more, and changed rarely after initial delivery, it made sense to carefully analyse and record the impact of changes and the potential risks. In the modern Agile world of fast feedback and baking quality in such processes not only do not make sense but actively work against getting value to your customers as quickly as possible.
It is possible that your organisation has many such small processes in place that no longer serve their original purpose and possibly even increase the risks that they were originally intended to mitigate. A good way to identify such processes is to ask people involved in process oriented work what outcome their process supports. If this is a hard question to answer, or that outcome is supported earlier in your delivery cycle, it could be that the process is not needed at all and could be abolished.
A big vector for slow delivery is the accumulation of technical debt. This is a concept that has been talked about since at least 1992 by Ward Cunningham. The problem is that product owners often do not understand why they should care about it. A myth has built up over the decades that developers want to do things right just to please themselves. The truth is that every missing test, every piece of badly written code, every wrongly named method, every abuse of sensible design standards, has a cost. This cost is generally small in each case but you pay a heavy price in time because the cost of each, like real monetary debt, compounds. Code quality should not be regarded as “gold plating”, rather it should be regarded as the oil that makes the engine run smoothly.
It is important to understand if you are suffering from technical debt and if so to do something to tackle it. Tackling technical debt is not for the benefit of software developers, it is a necessary maintenance task that will help us deliver software on time today and in future. There are many tools available that can be used to analyse codebases, identify code smells and bad design and suggest improvements. Such tools can give objective metrics that can be tracked to help you understand if technical debt is an issue that should be tackled.
Ideally, any given product managed by a good delivery team will not be allowed to build up high levels of technical debt. But if it does, you need a strategy to tackle it. Simple strategies such as the boy scout rule or technical debt days can help but ultimately prevention, as with many things in life, is better than cure.
In a modern DevOps mindset, the development team takes responsibility for developing and running the application that they produce. Part of this end to end stream of value is deploying the application into some infrastructure. Slow delivery can often be caused by a lack of automation in the processes associated with deployment. In the worst cases, this could mean that you have on-premise servers hosting your software and the act of deploying software to them involves an individual or group of people following a list of manual tasks to get the latest version of the software running on the servers.
The ideal state for deploying software should be that you have automated pipelines that can deploy your software at the touch of a button. Furthermore, not only should the act of deployment be automated and therefore repeatable but the infrastructure on which it is deployed should be itself created or modified through scripts that are run through such automated pipelines. Your goal at every level should be to automate all the things that can be automated. Any manual, repetitive task is not only a waste of time but engenders a massive risk of failure.
Metrics and Throughput
Finally, and perhaps perversely, the methods by which your organisation measures the productivity of your development teams could cause them to deliver slower. There is a not unnatural desire within most management paradigms to understand how well teams and people are performing. Eli Goldratt once said, “Tell me how you will measure me and I will tell you how I will behave”. It is entirely possible that the measurements you are taking are causing the very problem they should be helping to solve.
The only sensible way to measure a team that makes something is to understand throughput. Throughput is defined as the rate of return of value to the company or its customers. The problem is that it is virtually impossible to understand throughput in a software delivery organisation because it is impossible to assess the value of epics, features, stories or tasks. This leads managers to devise proxies for throughput such as (the abuse of) story points, stories completed, lines of code, bugs fixed or anything else you may think of. The trouble with such proxies is that they are not leading indicators of the value returned and they can be gamed easily.
It could be that you can create a sensible, meaningful, game-proof metric that is a good proxy for throughput. It could even be that you can exactly calculate throughput. In my experience, this is rare to non-existent. The best way of assessing value, and understanding whether teams or the whole organisation is performing well, is to measure the four key metrics and iterate on their improvement. These metrics - Lead Time, Deployment Frequency, Mean Time to Restore and Change Fail Rate - have been shown to be an indicator of high performance in delivery teams and organisations.
There are many different possible reasons why teams could be perceived to be persistently unable to deliver on time, each with its own solution. My golden rules to follow when diagnosing and attempting to fix slow delivery are:
Understand what “on time” means. Make sure all interested parties are aligned on what “on time” means and are bought-in to any agreed dates.
Understand if you have handovers and consider reorganising your business so that you have aligned cross functional teams capable of and empowered to deliver on the entirety of their customer outcomes.
Don’t discount simple explanations.
Do we have enough people?
Do we have the right skills?
Care about technical debt and make sure that everybody knows that it is everybody’s problem if technical debt accumulates.
Automate all the things.
Only measure meaningful things, if it isn’t possible to measure throughput, consider tracking the four key metrics.