“If you can’t explain it simply, you don’t understand it well enough”.
Scrum is one of the most successful frameworks for agile software development by teams of 5-9 people. It provides a structure of roles, practices and artifacts whereby teams can build shippable product increments in short iterations called sprints. However, many enterprises struggle to scale scrum beyond the individual team level to support large, multi-team projects. Since much real-world commercial software requires multiple teams to build, the scaling challenge is both real and pervasive.

A number of frameworks have emerged over recent years to tackle the scaling challenge, and Dean Leffingwell’s Scaled Agile Framework (SAFe) has established itself as one of the most popular and widely adopted. SAFe is not simply scrum on a large scale; it defines additional practices for each of the generic operating levels of an enterprise: Portfolio, Program and Delivery Team. SAFe is not without its critics, not least for its large-scale, up-front planning process, its top-down approach and its perceived emphasis on process over people. Nonetheless, SAFe provides a comprehensive and well-defined solution to the scaling challenge – hence its appeal to large organizations and the risk-averse executives who run them.

SAFe is a large, complex and evolving framework. Although it is well defined in terms of practices, roles and artifacts, there is no single implementation strategy; organizations are instead encouraged to engage SAFe Program Consultants (SPCs – trained and certified by Scaled Agile, Inc.) for implementation guidance and coaching. Frequently, though, organizations lose sight of the fact that SAFe, like scrum, is a framework – not a prescription – and that it absolutely must be tailored to meet the unique needs of every organization.
Additionally, organizations should not attempt a wholesale adoption of the entire framework – this is a recipe for overwhelming complexity, risk and failure. The key questions are: how and where to begin, and how to incrementally add practices until a robust level of adoption is achieved? Is there a way to distill the framework into a simple baseline pattern, or template, that can be used as a starting point but still solves many of the practical challenges of scaling?
Organizations that have adopted scrum may quickly discover that additional practices are required to align delivery teams with business strategy in order to ensure the accurate and efficient flow of knowledge across the organization and to maximize the delivery of value to customers. This is precisely the challenge that SAFe seeks to address, and one of the pillars of the ‘SAFe House of Lean’ is the principle of flow, which is defined as ‘Optimize continuous and sustainable throughput of value’. This is accomplished via a sequence of interconnected Kanbans that operate continuously, and feed delivery teams with a steady flow of ready work. Kanban is the simplest of all agile methods, and can be summarized in a handful of rules:
- Visualize The Work (using a Kanban board)
- Limit Work-in-Progress, Actively Manage Items in Progress
- Make Workflow Policies Explicit
- Improve Continuously and Collaboratively, by removing waste and improving cycle time (flow).
An excellent reference for Kanban is Kniberg and Skarin’s book: Kanban and Scrum, Making The Most of Both.
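The rules above can be sketched in a few lines of code. The following is an illustrative model (the class, column names and limits are my own, not part of any Kanban tooling) showing how a WIP limit makes overload explicit rather than invisible:

```python
# Illustrative sketch of a Kanban board that visualizes work and enforces
# WIP limits. Column names and limits are hypothetical examples.

class KanbanBoard:
    def __init__(self, columns):
        # columns: ordered list of (state name, WIP limit); None = unlimited
        self.columns = {name: {"limit": limit, "items": []} for name, limit in columns}

    def add(self, item, state):
        col = self.columns[state]
        if col["limit"] is not None and len(col["items"]) >= col["limit"]:
            raise ValueError(f"WIP limit reached in '{state}'")
        col["items"].append(item)

    def move(self, item, src, dst):
        # Check the destination limit before removing from the source column
        dst_col = self.columns[dst]
        if dst_col["limit"] is not None and len(dst_col["items"]) >= dst_col["limit"]:
            raise ValueError(f"WIP limit reached in '{dst}'")
        self.columns[src]["items"].remove(item)
        dst_col["items"].append(item)

board = KanbanBoard([("To Do", None), ("In Progress", 2), ("Done", None)])
board.add("Story A", "To Do")
board.add("Story B", "To Do")
board.move("Story A", "To Do", "In Progress")
board.move("Story B", "To Do", "In Progress")
# A third concurrent item would now raise: the WIP limit forces the team
# to finish work before starting more, which is what improves flow.
```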
In large organizations with business portfolios comprising multiple products, multiple levels of requirements abstraction may be needed, with different roles and ceremonies operating at each level. In the SAFe (4.x) model, for example, there is at least a 4-level requirements hierarchy: Epic >> Capability >> Feature >> Story. Capability was introduced in SAFe 4.0 to support cases where a Program Increment (PI) requires multiple release trains to produce – i.e. really large programs. The vast majority of organizations should be able to operate with 3-level SAFe (Epics, Features and Stories). These items are managed as artifacts of solutions >> programs >> iterations, respectively. Within the 3-level operating model, each level has its own set of practices, roles and artifacts. There’s our first simplification – if your organization produces product releases with 5-12 delivery teams (< 125 people), then you only need 3-level SAFe, which can be summarized as follows:
- Portfolio Level: Epic Definition & Refinement
- Program Level
- a) Feature Definition & Refinement
- b) PI Planning
- Team Level: PI Execution
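As a sketch, the 3-level hierarchy can be modeled as a simple containment structure. The classes and field names below are illustrative, not a SAFe artifact schema:

```python
# Illustrative containment model for 3-level SAFe: Epic >> Feature >> Story.
# Field names are hypothetical; they mirror the artifacts described in this article.
from dataclasses import dataclass, field

@dataclass
class Story:
    title: str
    points: int = 0        # estimated during team-level refinement

@dataclass
class Feature:
    title: str
    benefits: str = ""                                   # why do we need this feature?
    acceptance_criteria: list = field(default_factory=list)
    stories: list = field(default_factory=list)

@dataclass
class Epic:
    vision: str                                          # elevator-pitch vision statement
    features: list = field(default_factory=list)

    def total_points(self):
        # Roll up story estimates across all features of the epic
        return sum(s.points for f in self.features for s in f.stories)

epic = Epic(vision="Self-service billing portal")
feature = Feature(title="Invoice download", benefits="Reduce support calls")
feature.stories.append(Story("Download PDF invoice", points=5))
epic.features.append(feature)
```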
- Portfolio Level – The overall goals at this level are to maximize the financial performance of the business portfolio, and to provide a continuous flow of value to the customer. What products/solutions are needed to deliver on the business vision, and what are the relative priorities (and corresponding investment levels) in each case? These items are captured as ‘epics’ – that is, items that may require multiple releases (or program increments) to realize the vision and goals for that product. Epics are basically ‘containers’ for everything needed to provide a solution. Each epic should have a vision, roadmap and MVP definition (minimum feature set). Epics in the business portfolio are refined via an intake kanban (e.g. Funnel >> Review >> Analysis >> Ready).
- Program Level – The primary goal of the program level is to maximize the delivery of value via features and stories. How do we best organize multiple delivery teams to quickly and incrementally deliver meaningful subsets of each epic to the market? To do this, an epic’s MVP feature set is refined further via story mapping and story estimation. Multiple teams operating in a synchronized cadence may be needed to incrementally deliver features related to the epic. Team velocities are used to estimate the scope of what can be delivered in program increments.
- Team Level – A set of practices that support the further refinement of user stories into sprint-sized pieces that can be delivered – typically using scrum – consistently and with production-level quality.
It’s useful to think of this process as a cascading sequence of transformation patterns that operates continuously and delivers progressive refinement of business requirements until discrete items of value can be accepted by delivery teams for implementation.
For this process of continuous refinement to run smoothly, we need a high degree of alignment between an organization’s business and IT functions. (Establishing that may be a bigger challenge than the adoption of a set of mechanics for scaling agile practices). Here is another view that may help explain the relationship between Epics, Features and Stories:
- At the portfolio level, epics are elaborated into features that describe the intended capabilities of the product or value stream. (not detailed feature definitions at this stage).
- At the program level, features are fully defined in terms of benefits and acceptance criteria, and initial user stories are proposed (headline-level detail only).
- At the team level, user stories are refined further (via backlog grooming/refinement) until they are sufficiently well-defined (INVEST) to be ready for implementation via sprints.
In practice, epic and feature refinement for the next PI runs concurrently with current PI execution.
Definitions of Ready, both at the epic and feature levels, have been proposed by the SAFe framework, in terms of templates, here and here. Epics require a vision (in elevator pitch format) and a list of required features/capabilities. For features, it is recommended that a list of benefits (why do we need this feature) and acceptance criteria are required for each one. The output of one set of practices feeds another via backlogs of ready work, often realized as a series of interconnected kanbans.
The preceding has been a high-level summary of the steps required to get work defined, planned and delivered from business portfolio to working software. The question now is – what mechanics or governance mechanisms do we need to support these processes in a simple but consistent fashion? Each of the 3 major operational areas comprises a set of practices, artifacts, and roles. In what follows we will elaborate on each of these 3 parts of the framework.
SAFe at its Simplest
For those who do not have the time (or the patience) to wade through SAFe’s arcana, here is a simple way to think about scaling up to a program level with multiple teams. First, we are all very familiar with scrum, which is frequently represented as follows:
Think of a SAFe program as a higher level abstraction of scrum, following an identical pattern of: Planning – Executing – Inspection/Adaptation – Repeat.
The details (roles, ceremonies and artifacts) are of course different, but the essential process flow and general philosophy behind it are exactly the same.
Portfolio governance refers to the management of business initiatives that require work at the program and team levels. Specifically:
- Deciding which initiatives to undertake and in what order
- Making changes to in-flight initiatives
A portfolio kanban system is an effective way to manage this process and track progress. The portfolio management team (epic owners and the enterprise architect) drives this process via regular portfolio management (or portfolio grooming) meetings. Typical governance tasks include:
- Review epic definitions
- Review epic business cases
- Identify architectural enablers
- Determine priorities, estimates and epic rankings (e.g. using WSJF)
- Update the kanban board
The portfolio kanban has the following states:
- Funnel – initial state for all new ideas/proposals pending review and analysis
- Review – Epic defined in terms of elevator pitch (vision statement) and key features identified
- Analysis – One -page business case, solution alternatives, enablers, relative ranking (WSJF), and Go-No-Go decision
- Portfolio Backlog (Approved Epics)
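The WSJF ranking mentioned above can be sketched as follows. The scores are illustrative; SAFe defines WSJF as Cost of Delay divided by job size, where Cost of Delay is the sum of user/business value, time criticality, and risk reduction/opportunity enablement (each scored on a relative scale, e.g. modified Fibonacci):

```python
# Sketch of WSJF (Weighted Shortest Job First) ranking for portfolio epics.
# All scores below are hypothetical relative values, not real data.

def wsjf(business_value, time_criticality, risk_reduction, job_size):
    # Cost of Delay = business value + time criticality + risk reduction /
    # opportunity enablement; dividing by job size favors short, high-value jobs.
    cost_of_delay = business_value + time_criticality + risk_reduction
    return cost_of_delay / job_size

epics = {
    "Epic A": wsjf(8, 5, 3, 8),    # 16 / 8  = 2.0
    "Epic B": wsjf(13, 8, 5, 5),   # 26 / 5  = 5.2
    "Epic C": wsjf(5, 3, 2, 13),   # 10 / 13 ≈ 0.77
}
ranked = sorted(epics, key=epics.get, reverse=True)
# Highest WSJF first: Epic B, Epic A, Epic C
```

Because the inputs are relative scores, WSJF is useful for ranking epics against each other, not for absolute forecasting.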
Program Governance and Feature Intake Process
The critical activities at the program level include the following:
- Portfolio Alignment: Ensuring the work being defined for the delivery teams aligns closely with the business portfolio, the roadmap and vision.
- Release (PI) Planning Readiness: Features in the program backlog have been defined in sufficient detail to support user story creation and estimation. A feature intake kanban is usually used to manage this process.
- Release (PI) Planning Event: The event successfully achieves alignment between business owners and the program teams on a common, committed set of Program Objectives and Team Objectives for the next release time-box.
- PI Execution: The teams successfully deliver a large percentage of the committed objectives.
- PI Retrospective: An end of PI retrospective is held to give program teams the opportunity to inspect and adapt, and thereby improve their effectiveness over time.
A key practice at the program level is to get features sufficiently elaborated, refined and prioritized that they can be tackled by delivery teams.
Feature intake can be managed using a kanban with a workflow that clearly shows the status of feature requests as they are fleshed out from one-liners into fully defined program backlog items with business benefits and acceptance criteria. For example:
In this model a very simple feature intake workflow is used with only 2 states: Funnel and Committed. Between those 2 states, something that came in as a one-line idea needs to have been fleshed out into a detailed description with testable goals and other constraints. We could of course adopt a more elaborate set of workflow states to identify things like: feature ranking complete (business value reviewed and agreed), feature release assignment complete, architecture review complete, story mapping complete, and story sizing complete. The number of states in the workflow would need to be tailored to the needs of the organization.
Here’s a more detailed intake workflow:
These state transitions could be done in a series of, say, weekly meetings e.g.
- Business review – (attended by product owners)
- Architecture review (to outline solutions and identify which system components need to be worked to support the feature) – attended by architects and SME’s
- Sizing – architects and SME’s
- Ranking – architects and SME’s
By having these discrete states in the intake workflow the readiness of the feature for actual development is clear. The feature intake kanban can be used to visualize the process with appropriate WIP limits set for each state.
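A minimal sketch of such an intake workflow, with the meeting-driven states made explicit (the state names follow the meetings described above, but are illustrative and should be tailored):

```python
# Illustrative feature-intake workflow as an explicit, ordered set of states.
# State names here mirror the weekly review meetings described in the article;
# a real organization would tailor both the states and the transition rules.

INTAKE_STATES = [
    "Funnel",
    "Business Review",
    "Architecture Review",
    "Sizing",
    "Ranking",
    "Committed",
]

def advance(feature):
    """Move a feature to the next intake state; transitions are explicit."""
    i = INTAKE_STATES.index(feature["state"])
    if i == len(INTAKE_STATES) - 1:
        raise ValueError("Feature is already Committed")
    feature["state"] = INTAKE_STATES[i + 1]
    return feature

f = {"title": "Export to CSV", "state": "Funnel"}
for _ in range(5):
    advance(f)
# f["state"] is now "Committed" – ready for PI planning
```

Making the states explicit is what makes readiness visible: a feature's position in the workflow answers "is this ready for development?" without a status meeting.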
Release planning sits mid-way between feature intake and feature delivery, and serves to keep the pipeline filled so that delivery teams are fully supplied with ready work. With the above feature intake model we have simplified release planning by having story mapping and sizing done before the planning event. Planning is now an exercise in reconciling dependencies between stories, sequencing stories into sprints, and identifying the total product scope that can be targeted for the next release timebox. For more details see here. The output of the planning event should look something like the following:
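The use of team velocities to estimate PI scope can be sketched as a simple greedy fill. Real planning must also reconcile dependencies and sequencing; the velocities and story sizes below are illustrative:

```python
# Sketch of PI scope estimation: total capacity from team velocities and
# sprint count, then fill the PI with the highest-ranked stories that fit.
# All numbers are hypothetical examples.

def pi_capacity(team_velocities, sprints_per_pi):
    # Aggregate capacity in story points across all teams for the PI
    return sum(team_velocities) * sprints_per_pi

def plan_pi(ranked_stories, capacity):
    """ranked_stories: list of (name, points) in priority order."""
    planned, remaining = [], capacity
    for name, points in ranked_stories:
        if points <= remaining:          # greedy fill; ignores dependencies
            planned.append(name)
            remaining -= points
    return planned

capacity = pi_capacity([20, 25, 18], sprints_per_pi=5)   # 63 * 5 = 315 points
stories = [("Story 1", 100), ("Story 2", 150), ("Story 3", 120), ("Story 4", 40)]
scope = plan_pi(stories, capacity)
# → ['Story 1', 'Story 2', 'Story 4'] – Story 3 (120 pts) exceeds the 65 points left
```

Note the greedy fill skips Story 3 but still takes the smaller Story 4; a real planning event would decide whether that substitution actually makes business sense.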
Team Practices/Feature Delivery
Once features have been elaborated into user stories with acceptance criteria they are ready for pulling into a sprint for delivery. The de-facto agile delivery framework is scrum (although kanban, or scrumban, is also a popular approach). Scrum comprises a set of roles, ceremonies and artifacts, as summarized in the following diagram:
Many teams will supplement scrum practices with a set of ‘technical practices’, including TDD, BDD, test automation, continuous integration and so on. These are considered an essential prerequisite to getting a production quality product increment out of every iteration.
A typical continuous integration configuration is summarized in the following diagram.
In this system we have set up a CI server such as Hudson – an open source CI tool. Hudson integrates with other CI-related tools from multiple vendors, such as:
- SCM Systems: Perforce, Git
- Build Tools: Maven, Ant
- Testing Frameworks: JUnit, xUnit, Selenium
- Code Coverage Tools: Clover, Cobertura
Hudson orchestrates all of the individual sub-systems of the CI system, and can run any additional tools that have been integrated. Here is a step-by-step summary of how the system works:
- Developers check code changes into the SCM system
- Hudson constantly polls the SCM system, and initiates a build when new check-ins are detected. Automated unit tests, static analysis tests and functional regression tests are run on the new build
- Successful builds are copied to an internal release server, from where they can be tested further by the development team, or
- loaded into the QA test environment, where independent validation of new functionality can be performed at system level.
- Test results are reported back to the team
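The steps above can be sketched as a simple polling loop. This is a toy illustration of the control flow only – real CI servers such as Hudson implement this far more robustly – and every function name here is a placeholder, not a real API:

```python
# Toy sketch of the CI control flow described above: poll SCM, build on new
# check-ins, run tests, publish good builds. All callables are placeholders
# supplied by the caller; this is not how Hudson is actually implemented.
import time

def ci_loop(scm, build, run_tests, publish, poll_interval=60, max_cycles=None):
    last_revision = None
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        rev = scm.latest_revision()
        if rev != last_revision:            # new check-ins detected
            artifact = build(rev)           # compile the new revision
            results = run_tests(artifact)   # unit, static-analysis, regression
            if results["passed"]:
                publish(artifact)           # e.g. copy to internal release server
            last_revision = rev             # report results back to the team here
        cycles += 1
        time.sleep(poll_interval)
```

In practice most CI servers also support push-style triggers (SCM hooks) rather than polling, which removes the latency of the poll interval.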
Knowing that every change made to an evolving code-base results in a correctly built and defect-free image is invaluable to a development team. Inevitably, defects do get created from time to time. However, identifying and correcting these early means that the team will not be confronted with the risk of a large defect backlog near the end of a release cycle, and can be confident in delivering a high quality release on time.
Setting up a continuous integration system is not a huge investment: a rudimentary system can be set up fairly quickly, and then enhanced over time. The payback from early detection and elimination of integration problems and software defects dramatically outweighs the costs. Having the confidence that they are building on a solid foundation frees development teams to devote their energies to adding new features rather than debugging and correcting mistakes in new code.