Ideal Iteration Length

Ideal Iteration Length – A survey

Recently I put the question of the rationale for a max sprint length of 30 days to one of my LinkedIn groups. Here are the responses:

  • The idea is that anything over 30 days is too large to effectively break down and estimate properly and for everyone to keep that model in their head. It also keeps you focused on the quick evolution of the team to learn through regular small mistakes, instead of remembering what went wrong months ago.
  • The idea is to fail fast and more than 1 month is not fast enough
  • Short sprints and small user stories provides the ability to test early, deliver usable functionality incrementally in small batch sizes, with lower risk.
  • Delivering working software in every sprint over 30 days is called a project. If you keep in mind that the content of the sprint is frozen and that a scrum team should be around 7+-2 people you freeze 7-9 man months or more of work in advance without allowing the customer or product owner to have a say or see a working product in between. This is a huge investment with high risk of error. It also prevents you from re-evaluating your way of working often and to review your failures and learn from them continuously.
  • At the beginning, the requirements for that Sprint have been frozen. So the question of Sprint length has also to do with the ability of the Product Owner and Scrum Master to keep these requirements untouched by external influences and to negotiate new requirements onto the Product Backlog for review on the next planning meeting. If you are getting a lot of ‘this just can’t wait’, then shorter Sprints are better.
  • You lose the pressure if you start a sprint longer than 30 days. I’ve experienced a 10 days sprint as the best. Try a discussion about changes with the team after you’ve estimated all tasks.
  • 10 days is the right amount. In addition, this helps with user exceptions for urgent changes or issues. It’s an easier sell if I tell them this sprint is set (or even the next one). At the most they typically only have to wait 3 weeks for us to get started on something new for them.
  • A shorter sprint duration reduces risk. I would add that shorter sprints force more frequent synchronization and convergence of the different work streams occurring within a project. If there is any incompatibility between, say, modules of code it would be discovered sooner (and presumably be fixed sooner) thus reducing the risk of having a much bigger incompatibility problem that might arise from longer sprints. It’s better to fail fast because big problems discovered later than sooner are a recipe for blowing the schedule.
  • We have tried two week sprints and three weeks sprint and the feeling was that three weeks was the perfect length. When we did two week sprints it felt like we were always starting or ending and the teams stress level was too high.
  • This also comes down to how the human brain works. If you have a term paper due at the end of the semester, you tend to start working on it at the end of the semester. If you had an outline due three weeks into the semester, you’d start working on your Term Paper two weeks into the semester. Slicing into chunks (sprints) helps to prevent all the work piling up at the end. This goes straight to the Agile Principle of “promote sustainable development.”
  • My Agile coach always recommends starting with one week sprints for new Scrum projects. When you are just starting, how you use Scrum is as much or more important than what is being built. With one week sprints you have a shorter time between learning cycles to adjust things like estimating, how many story points the team takes on and so on.
  • There are three reasons. Two are explicit in Scrum the other isn’t mentioned, but is one of the foundations on which Scrum is based. The explicit ones are feedback and enforced view of reality. The second is removing delays.
    • You want quick feedback. Longer than 30 days does not force you to get it. But feedback exists in many forms. There is the time from getting a request until delivering it. Time from starting a story until it is complete. This is where lessons are learned quickly. I personally don’t like to fail fast, I prefer to learn fast. One does not need to write code to discover the customer doesn’t want it, one needs to write a little where the customer is clear, show it to them, and enable the customer to learn a little bit more. But, if you are going to fail, do it quickly.
    • By enforced view of reality, I mean that things will show up to make it difficult to deliver value in 30 days. For example, customers may not want to talk to you that often, build processes may be inefficient, testing and coding may be too separated from each other. By having a time-box in which your work is intended to get completed, those things that work against you will become more visible and painful. These are impediments to your work actually, and instead of avoiding them by making the sprint length longer, you actually need to shorten the sprint to expose these even more. It is also harder to ignore them – if you haven’t tested your code by sprint’s end – it’ll be obvious and irrefutable. Both of these reasons actually tend to call for even shorter than 30 day sprints. Most successful Scrum teams I know of use 1-2 week sprints. In fact, I’d go so far as saying teams with 3-4 week sprints are showing a “smell” of not being engaged enough in solving their problems (not always true, but often true).
    • The third reason, which in many ways is the biggest one, is removal of delays. Since about 2004, I have been claiming that Scrum is a weak implementation of Lean principles. I say “weak” because Scrum does not deal with optimizing the whole and it leaves out a lot of lean-education and management. But one of the key tenets of Lean is to remove delays so value can be delivered quickly. This is critical because delays between steps of work literally create more work. The time-boxing of Scrum requires teams to complete their work quickly. This all reduces delays and therefore avoids incurring work that doesn’t need to happen.
    • The combination of feedback, awareness and removal of delays drives us to have shorter feedback loops until the overhead of the sprint overcomes the value achieved by them. For most teams this will be 1-2 weeks. Some teams that discover they don’t need the discipline of the time box will abandon it completely and move to Kanban. I might add that a simple value stream analysis will show most that “the shorter Sprint the better”. Scrum contains no technique or method for optimizing end-to-end, and it should not. The retrospective might uncover such a problem, but I generally advise to use Lean thinking to address end-to-end optimization explicitly.
  • 30 days is also fairly typical ‘management reporting interval’. A Sprint longer than 30 days means that management may not get a ‘status update’ for two months.
  • With experienced teams and a well-defined product backlog, a 30 day sprint may be fine (not my preference). But when the teams are newly formed, new to Scrum or when the product backlog is very dynamic, it’s better, as someone pointed out, to fail earlier and adapt sooner.
  • A two-week sprint is my preference. Just long enough to develop some rhythm and velocity, but not so long that you risk going down the wrong road for a month.
  • 30 days matched traditional development teams that were new to scrum, or where older technology was not nimble enough for rapid development for all the mentioned reasons especially quick review and feedback. Even with a 30 day sprint cycle, I have usually obtained feedback in shorter cycles before being fully accustomed to scrum. Maybe as technology and teams get more progressive we will see shorter sprint cycles.
  • All above answers are great. I will amend that by having frequent and not too far away reviews that show what was built (the increment) then you are being transparent and providing the visibility to the stakeholders, so everything that was mentioned (risk, done, value) are all observed and demonstrable. Also by repeating the cycle and having a time for inspect and adapt you can become agile. If you do them more than 30 days the people may not remember what happened and will not effectively adapt. BTW 30 days is too much for us, my experience that most teams use a 2 week (10 days) sprints.
  • Start with sprints as short as possible… one week or two week sprints. Do not overpromise the product owners but in first place under-promise as you need to learn on what you can deliver on shippable products. If one week or two week sprints are not possible to deliver shippable products you can extend the dime of a sprint to for example 3 weeks or 4 weeks.
  • A 2 week sprint has effectively just 9 days to build and test deliverables as you also need to reserve time for backlog grooming, sprint planning, sprint review and retrospectives. When you start your first sprint just under-promise.

The Lean Roots of Agile

Agile software development has its roots in the lean manufacturing paradigm developed at Toyota – the Toyota Production System (TPS). Lean manufacturing centers on creating more value with less work. One central principle of lean manufacturing is to relentlessly work on identifying and eliminating all sources of waste from the manufacturing process, and also to eliminate all obstacles that impede the smooth flow of production – usually done using a lean technique known as Value Stream Mapping.  ‘Waste’ can take many forms but it basically refers to anything that does not add value from the perspective of your customer. Taichi Ohno, an executive at Toyota, identified seven categories of waste:

  • Defects
  • Overproduction
  • Inventories (in process, or finished goods)
  • Unnecessary processing
  • Unnecessary movement of people
  • Unnecessary transport of goods
  • Waiting

Central to the lean philosophy is making improvements in real time: identify a problem, measure it, analyze it, fix it and apply it immediately on the factory floor (or in the next iteration) – don’t wait to read about it in a report later.

It is fairly easy to come up with software examples for each of the above categories of waste (example: reduce ‘Waiting’ by eliminating hand-offs, or reduce ‘Overproduction’ by cutting out all unnecessary features). Lean principles will thus sound very familiar to agile practitioners. In their book, Lean Thinking, the authors Womack and Jones, identified five principles of “lean thinking.”

  1. Value (Definition of): Specify value from the customer perspective.
  2. Value Stream: Analyze your processes and eliminate anything that does not add value for the customer.
  3. Flow: Make process steps flow continuously with minimal queues, delays or handoffs between process steps.
  4. Pull: Build only in response to specific requests from customers. In other words organizations should not push products or services to customers. Order from suppliers only the material or services required to supply specific customer requests.
  5. Perfection: Continuously and relentlessly pursue perfection by reducing waste, time, cost and defects.

There is no clear written description of Toyota’s practices, however there is a lot of written material about TPS  – most notably  The Machine That Changed the World, by Womack, Jones and Moos – now a management classic.

Several authors have boiled down the definition of Lean into Two Pillars:

  • The practice of continuous improvement
  • The power of respect for people

Continuous Improvement means the relentless elimination of waste in all it’s forms, and, the identification and removal of anything that disrupts the continuous flow of the process.

“Respect for people” is a pretty vague term, but at Toyota it means something very specific, including providing people with the training and tools of improvement, and motivating them to apply these tools every day. At Toyota they would say: “We build people before we build cars”.

To summarize TPS even further in a single sentence: Toyota practices continuous improvement through people.

Optimum Batch Size

From the late 1940s Toyota was experimenting with batch sizes and in particular with die changes associated with stamping out steel metal parts for cars. During this time the important discovery was made that it actually costs less per part to make small batches of stamped parts than – unlike their American competitors – to produce enormous lots on a large scale. There were two reasons for this: small batches eliminate the need to carry huge inventories of parts required for mass production, and making small batches means that any defects are discovered much earlier. The implications of the latter issue were enormous – and have a direct correlation with agile software development – it made those responsible for stamping much more concerned with quality, and it eliminated re-work and the waste associated with defective parts.

For more on the history of Lean Production, I refer to readers to The Machine That Changed the World, the essential reading reference on this topic. In the book: The Principles of Product Development Flow, Donald Reinertsen sets out a dozen ‘principles’ that promote the merits of smaller batch sizes. In the context of software development, Waterfall executes a software release in a single batch, whereas scrum breaks up the requirements for the release (the Release Backlog) into a set of smaller batches, and implements these in a sequence of Sprints. The question we have is: what is the optimum size for a sprint?

Much research has been done on quantifying the optimum batch size in the context of manufacturing logistics. The calculation of ‘optimum’ batch size – usually called Economic Batch Quantity (EBQ) – is based on the relationship with demand, machine set-up cost, and inventory carrying cost per item.

Specifically:

EBQ = Sqrt(2*Demand*Setup Cost/Inventory Holding Cost per Unit)

Where the inventory holding cost is the overhead cost associated with each item, and generally increases with the size of the batch, and the setup cost per item generally decreases as a function of batch size. This can be illustrated in the following chart:

Optimum Batch Size
Optimum Batch Size

For manufacturing operations the impact of large batch sizes on cost and throughput may be fairly obvious: large batches increase work-in-process inventory levels, materials lead-times, and finished good stock levels. But more importantly, large batches drive more variability, and more variability means more re-work, more cost and delays.

We will look at exactly what this means for software, but before that, what we really care about is maximizing throughput, i.e. how do we maximize the number of User Stories (or Story-Points) delivered per sprint – known as the velocity in scrum? The throughput chart is simply the inverse of the cost chart above, and will be an inverted U-Curve.

Let’s separate the production costs of User Stories into direct costs – those that contribute directly to the definition, design, coding and testing of each story, and indirect costs – those that represent overhead, or non-value adding activities.

Direct Costs

Direct Cost per User Story
Direct Costs per User Story

Indirect Costs

Indirect Costs Per User Story
Indirect Costs Per User Story

The fixed costs are associated with the fundamental development tasks that must be carried out to deliver the set of User Stories in the iteration: design, coding and testing. Building 100 user stories in a batch as opposed to building 1 at a time yields obvious economies of scale. (Think of test automation as an example). However, as the size of an iteration goes up, so too does the amount of indirect cost, examples of which include:

  • Backlog grooming – getting enough stories ready
  • Sprint planning
  • More meetings
  • More documentation
  • More builds
  • More bug-fixing

The simple fact of having to deal with an iteration of larger scope and complexity drives up the amount of overhead one encounters in a project. Thus the overall cost/story vs. iteration size graph will be quite similar to that in Figure 1.

Another way to visualize this is to consider resource utilization as a function of queue (or batch) size. As the batch size increases the overhead costs increase in a non-linear way and we end up in a situation like the following:

Queus Size vs Capacity Utilization
Queue Size vs Capacity Utilization

The larger the batch, the fewer opportunities there are to inspect and adapt, and the more we are exposed to WIP and re-work. It is up to each team to find their own sweet spot for the size of an iteration. In scrum, we talk about getting to an ideal ‘velocity’, or the optimum number of Story Points per Sprint. Teams should relentlessly pursue this objective by searching for and eliminating all sources of waste and delay in their development process.

Another good example of small batches in action is described in Lean Thinking, by Womack and Jones. The book describes an experiment to stuff envelopes with newsletters, where each envelope had to be stuffed with a newsletter, then stamped, addressed and sealed. Intuitively one might have thought that first stuffing all of the envelopes, then adding stamps, and so on, that is, in a large batch, would have been a more efficient approach. However, the experiment demonstrated that processing one envelope at a time – stuffing, stamping, addressing and sealing – was overall more efficient. This was because, the effort associated with the overhead tasks of sorting, stacking and moving around large piles of partially complete envelopes, proved more costly in the end than processing one at a time. The winning approach is an example of the ultimate small batch – one piece flow.

For teams beginning iterative development or scrum I would however very strongly encourage them to start small, making the iterations and number of stories committed as small as possible. The primary focus should be on first getting the process right – deliver production-ready code as the output of an iteration. I know teams who have been practicing scrum for years and are stuck in a mode where each iteration delivers only partially completed stories, not fully tested or defect-free. The issues causing this situation are easier to resolve when the focus is not on committing to the maximum number of stories, but rather on streamlining the process itself. Once that is done, velocity improvements will come much more quickly.

Print Friendly, PDF & Email