Agile Software Engineering

SAFe Light - Part 3: Beyond Story Points

Alessandro Guida Season 1 Episode 34

In this episode of The Agile Software Engineering Deep Dive, Alessandro Guida continues the SAFe Light series with a discussion about estimation.

Estimation is often treated as a promise, even though it is by definition an approximation made under uncertainty. This episode explores why false precision creates mistrust, why story points and velocity can hide uncertainty, and why better estimation starts with better understanding.

The episode introduces a practical SAFe Light estimation model: no-estimate for small low-risk work, lightweight breakdown when teams need grounded understanding, WBS-based estimation when predictability matters, and exploration before estimation when the work is still too unclear.
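As a rough illustration only, the four-way model could be written down as a small decision function. The field names, the `Approach` labels, and the order in which the checks run are assumptions made for this sketch; the episode describes the tiers in prose and does not prescribe a formal rule.

```python
from dataclasses import dataclass
from enum import Enum


class Approach(Enum):
    EXPLORE_FIRST = "exploration before estimation"
    NO_ESTIMATE = "no-estimate, track flow"
    LIGHTWEIGHT_BREAKDOWN = "lightweight breakdown"
    WBS = "WBS-based estimation"


@dataclass
class WorkItem:
    # All four flags are hypothetical inputs chosen for this sketch.
    understood_well_enough: bool   # can the team break the work down at all?
    small_and_low_risk: bool       # e.g. minor defects, small UI tweaks
    stable_team: bool              # a single stable team owns the work
    predictability_critical: bool  # e.g. compliance deadline, fixed contract


def choose_approach(item: WorkItem) -> Approach:
    """Pick the lightest approach that still gives enough understanding.

    The ordering of the checks is an assumption of this sketch.
    """
    if not item.understood_well_enough:
        # Too unclear to break down: run a spike or prototype first.
        return Approach.EXPLORE_FIRST
    if item.predictability_critical:
        return Approach.WBS
    if item.small_and_low_risk and item.stable_team:
        return Approach.NO_ESTIMATE
    return Approach.LIGHTWEIGHT_BREAKDOWN


if __name__ == "__main__":
    bug_fix = WorkItem(True, True, True, False)
    print(choose_approach(bug_fix).value)
```

The point of the sketch is only that the choice depends on the nature of the work, not on a single house method.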

The central idea is simple: estimation should not create false confidence. It should create understanding, expose dependencies, reveal architectural risks, and support better decisions.

This podcast is an audio version of the written Agile Software Engineering newsletter. If you want to go deeper, don't forget to subscribe to the newsletter too.

Welcome to the Agile Software Engineering Deep Dive, the podcast where we unpack the ideas shaping modern software engineering. My name is Alessandro Guida, and I've spent most of my career building and leading software engineering teams across several industries, and today I want to continue the discussion about SAFe Light. In the previous episode, I talked about evolutionary architecture and why lightweight scaling only works when the architecture allows teams to move independently. But there is another topic that quietly shapes almost every delivery conversation. Estimation. Few words create more misunderstanding in software development. An estimate is not a promise. It is not a contract. It is not a prophecy. It is an informed approximation made under uncertainty. And yet, in many organizations, estimates are treated as if they should have been precise from the beginning. They become milestones, budgets, dashboards, velocity calculations, and delivery expectations. Then, when reality changes, the team is often measured against a number that was uncertain from day one. That is where estimation becomes dangerous. Not because estimation is useless, but because we ask it to do something it cannot do. In this episode, I want to explore how estimation can work in a SAFe Light approach. Not as estimation theater, not as false precision, but as a practical way to create understanding, expose uncertainty, discover dependencies, and support better decisions. We will look at when no estimate is enough, when lightweight breakdown makes sense, when WBS-based estimation is needed, and why sometimes the most professional answer is simply, "We do not know enough yet." Let's dive in.

Picture this. Exactly. And your boss, or maybe a major stakeholder, leans in and asks that dreaded question. So exactly how long is this gonna take? Oh man, instant stress response, like the fight or flight just immediately kicks in. Right. 
Your heart rate spikes because you know you don't have all the details. You know the terrain is gonna change the minute you actually start. But they're just looking at you, pen in hand, ready to write down a date in wet ink. Yeah. Today we are going on a deep dive into exactly that moment. We're pulling from issue number 34 of the Agile Software Engineering newsletter. Specifically, we're unpacking a piece called SAFe Light Part Three: Beyond Story Points. It is honestly a phenomenal piece of writing because it forces us to confront this deeply uncomfortable truth about how we actually work. Yeah, our mission today is to figure out how we can estimate complex work without falling into the trap of, well, what the author calls false precision. Right. And to discover why hitting a deadline isn't always proof that your estimate was actually a good one in the first place. Now, real quick, before we get into the weeds, I want to take a second to talk directly to you, the listener. You are definitely going to want to check out the show notes and click the link to read and subscribe to the original full newsletter. Yeah, the illustrations in this one are super helpful. They really are. The author includes these incredibly helpful graphics and even more in-depth content that we just, you know, we can't physically transmit through your audio speakers. It's definitely worth having open while you listen. For sure. Also, while this audio deep dive is entirely free, it is a massive help to us if you take just a second to press like on this deep dive and subscribe so you receive all our future issues. It genuinely helps us keep bringing this content to you. And honestly, the framework in this specific issue, it could save you a lot of professional heartache. Okay, let's unpack this. Before we can fix our estimation methods, we have to understand why our current expectations of estimates are just entirely broken. 
I mean, we all know an estimate is just, like, a probabilistic guess based on today's incomplete data. Right. The dictionary literally defines an estimate as an approximation made under uncertainty. It is an informed guess. But experienced engineers know that new understanding emerges the second they start writing code. So why does the PMO treat it like a blood oath? That's the billion-dollar question. Why does it end up on a dashboard as a concrete, unmovable date? Exactly. It really comes down to a fundamental psychological mismatch. It's the mismatch between the people doing the work and the people funding the work. Because management craves certainty. Exactly. They have budgets to balance, marketing campaigns to launch, shareholders to appease. So they take this inherently uncertain approximation, this tentative judgment call. And they just strip away all the context? Yes. They strip away the context and write it into project plans as if it were a concrete fact. Suddenly, that guess is a milestone. It becomes a personal performance indicator for the team, which is wild. It is wild. Managers start tracking velocity against these estimates, effectively building these highly professional-looking forecasts out of incredibly weak data. I mean, the source brings in some heavily researched numbers that honestly made my jaw drop. We aren't just slightly bad at this, we are historically terrible at it. Oh, the Flyvbjerg data. Yeah, Bent Flyvbjerg. He researched around 16,000 major projects across various sectors. The number of projects delivered on time, on budget, and with the expected benefits? It's almost unbelievable. 0.5%, half of 1%. It's staggering. And when you zoom in on large IT projects specifically, the data gets even darker. Geez, worse than half a percent? Well, look at the McKinsey and Oxford study. They looked at over 5,400 IT projects, and they found average budget overruns of 45%. Wow. Schedule overruns of 7%. 
And get this, they delivered 56% less value than predicted. I want to pause on that because I think it's crucial. Why are IT projects so much worse? I mean, if we are building a bridge, we can calculate the load-bearing capacity of steel, right? Right. But software seems to defy gravity. The underlying mechanism of software is invisibility. When you're building a bridge, the physical constraints are obvious. Yeah, you can't put the roof on before the foundation is poured. Exactly. But software is infinitely malleable right up until the exact moment it suddenly isn't. The complexity is completely hidden until a developer actually touches the code. So you think you're just adding a button to a screen. Right. You think it's just a button. But you don't realize that button requires changes to a legacy database, an update to an external supplier's API, and a rigorous security audit. I hear that, but let me challenge the premise here for a second. If the data is this catastrophically bad and everyone secretly knows the numbers are made up, why do we even bother estimating at all? It's a fair question. Like why not just say it's done when it's done? It feels like relying on a GPS that predicts your arrival time to the absolute minute, but completely ignores the possibility of flat tires or, you know, traffic jams or road closures. You're just setting yourself up to be late and angry. That is a completely natural reaction to the failure of estimates. But throwing estimation in the garbage is an overcorrection. Okay, so why keep it? What's fascinating here is estimation itself isn't the enemy. The enemy is false precision. False precision. Yeah. The problem arises when we ask an estimate to provide an illusion of perfect control over an uncertain future. When we strip away the expectation that estimates are contractual promises, their true value actually emerges. Which is what? If they aren't predicting the future, what are they actually doing? 
They are mechanisms for discovery. Discovery. Yes, they force a conversation about cost and risk in the present. Like is feature A worth doing compared to feature B? Oh, I think. If an estimate reveals that feature A will take six months because of massive technical debt, maybe the business decides it's not actually that important. Right. A good estimation process helps us identify when a piece of work is just too large, too risky, or too vague to tackle yet. Exactly. You aren't guessing the future. You are mapping the current reality. Mapping the current reality. Okay, so traditional time estimates fail because they offer false precision. Now, a lot of folks listening are probably thinking, well, duh, that's exactly why Agile software development gave us story points and t-shirt sizing. Oh boy. Yes, the famous t-shirt sizes. Right. We deliberately removed time from the equation. I know developers who love t-shirt sizing specifically because it acts as a, well, a psychological safety net. It removes the pressure of giving a hard date. Aren't we taking away that safety net if we criticize it? We might be taking away a comfort blanket, sure. But we have to be brutally honest here. T-shirt sizing is still estimation. Just with different labels. Exactly. Calling a task a medium or an extra large feels harmless because it uses fuzzier language. It avoids the emotional weight of assigning days or dollars. But it doesn't fix the core issue. Nope. It doesn't actually solve the problem of hidden complexity. It just masks it with a softer label. And what about story points? I mean, I always thought they were supposed to be the silver bullet. They abstract the effort into a relative unit. Doesn't that fix the false precision problem? It does, in a very specific, very narrow context. Like what? The author notes that for a single, stable team working on small, isolated items with no external dependencies, sure, story points can work perfectly fine. 
But that's almost never reality, is it? Exactly. That is rarely the reality in modern enterprise software. The newsletter focuses on SAFe Light, which is a framework for organizations that have scaled up. So they have multiple teams, external dependencies, legacy components, rigid compliance gates. All of that. So how do the mechanics of a scaled environment break the concept of a story point? By introducing two compounding layers of indirection. Okay, what's the first layer? First layer. The team estimates the work in a fictitious unit, the story point. Right. Second layer. The organization then takes that fictitious unit and attempts to convert it into a delivery expectation using historical velocity. Oh, I see the problem. Think about the math there. You are taking a relative, subjective guess and running it through a historical average that fluctuates wildly based on, you know, technical debt, team holidays, new hires, the specific architecture being touched. So each layer just multiplies the uncertainty. Exactly. A five-point story doesn't mean the same thing to team A as it does to team B. It's like, imagine you are asked to paint two identical doors in a hallway. Okay, I like this. From the outside, you look at them and say, okay, painting a standard door is a medium effort, five story points. But behind door number one is a clean, well-documented modern API. Nice and easy. Right. But behind door number two is a crumbling, undocumented legacy database from 2008, guarded by a very angry compliance workflow. I've opened that second door many times. And from the outside, they look identical. But from the inside, they require completely different skill sets and risk profiles. So if we just call them both five points, we aren't estimating. We are, to use the author's phrase, engaging in comparison without understanding. Precisely. 
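A small worked example may make the "two layers of indirection" point concrete. The numbers below are invented for illustration and do not come from the episode: a backlog sized at 100 points give or take 20%, divided by a velocity of 20 points per sprint give or take 20%.

```python
def sprints_needed(points_low: float, points_high: float,
                   velocity_low: float, velocity_high: float) -> tuple:
    """Return (best_case, worst_case) sprint counts.

    Best case: smallest backlog delivered at the highest velocity.
    Worst case: largest backlog delivered at the lowest velocity.
    """
    best = points_low / velocity_high
    worst = points_high / velocity_low
    return best, worst


if __name__ == "__main__":
    # Hypothetical inputs: 100 points +/- 20%, velocity 20 +/- 20%.
    best, worst = sprints_needed(80, 120, 16, 24)
    nominal = 100 / 20  # the single number that ends up on the dashboard
    print(f"nominal: {nominal:.1f} sprints")
    print(f"range:   {best:.1f} to {worst:.1f} sprints")
```

Under these assumed inputs, two modest uncertainties of 20% each turn a nominal "5 sprints" into a range of roughly 3.3 to 7.5 sprints, which is the compounding effect the hosts describe.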
If a team slaps eight points on a feature but cannot explain the underlying technical mechanics that drive that complexity, the estimate is hollow. It's just a number. Yeah. It is fundamentally not mature enough to be used for planning a major release involving multiple departments. Abstract agile sizing fails at scale because it compresses real-world complexity into a single meaningless number. So we're in a bit of a bind here. Single number time estimates give us false precision. Abstract story points are too weak to handle complex scaled projects. Right. What is the pragmatic alternative? Because management is still going to ask, how long is this going to take? The solution is to match the estimation tool to the actual work. Like don't use a sledgehammer to hang a picture frame and don't use a thumbtack to build a deck. Makes sense. The source outlines three distinct tiers of responsible estimation, plus a crucial bonus tier. Okay, walk me through them. Tier one is no estimate, focus on flow. Let me stop you there. Management is not going to accept "we aren't estimating" as an answer. How does that actually work in practice? Well, you deploy this strictly for small, continuous, low-risk work done by a stable team. Think minor defects, small UI tweaks, routine maintenance. Okay, so the small stuff. Exactly. For these tasks, the cost of getting the team in a room to argue about story points is actually higher than the value of the estimate itself. You track it using a Kanban-style approach. So you're measuring throughput instead of guessing up front. Yes. If stakeholders need to see progress, you don't show them abstract points. You show them a simple burn-up chart. Completed tasks versus total known tasks. Oh, that makes sense. This is actually much more transparent for management because it exposes scope creep mechanically. 
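The burn-up idea from the tier-one discussion can be sketched in a few lines. The weekly snapshot data and both function names are made up for this illustration; the sketch only shows how comparing the growth of the two lines makes scope creep visible without any up-front estimate.

```python
from typing import List, Tuple

# Each snapshot is (completed_tasks, total_known_tasks) at the end of a week.
Snapshot = Tuple[int, int]


def remaining_per_week(snapshots: List[Snapshot]) -> List[int]:
    """Gap between the 'total known' line and the 'completed' line."""
    return [total - done for done, total in snapshots]


def scope_outpaces_delivery(snapshots: List[Snapshot]) -> bool:
    """True if total scope grew faster than completed work over the period."""
    first_done, first_total = snapshots[0]
    last_done, last_total = snapshots[-1]
    return (last_total - first_total) > (last_done - first_done)


if __name__ == "__main__":
    # Hypothetical data: the team keeps finishing work, but scope grows faster.
    weeks = [(0, 40), (8, 46), (15, 55), (21, 66)]
    print(remaining_per_week(weeks))
    print(scope_outpaces_delivery(weeks))
```

In this invented data set the team completes 21 tasks while 26 new ones appear, so the remaining work grows even though delivery is steady.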
If the line of total tasks keeps moving up faster than the line of completed tasks, everyone can physically see why the delivery is taking longer. You aren't predicting the future, you are managing the flow of the present. Exactly. Okay, that makes logical sense for routine bugs. But you can't just Kanban a massive multi-department initiative for the upcoming quarter. When leadership wants to know if an entirely new product feature is feasible, no estimate will get you laughed out of the room. Oh, for sure. What is the next level? That brings us to tier two. Lightweight breakdown. This is used for early roadmap direction. Okay, how does that work? The goal here isn't a binding contract. The goal is to figure out if the core intent is mathematically realistic. Extra large isn't good enough. So the team takes the big epic and breaks it down until they reach concrete tasks that take about one to two days. Got it. So you're shifting the conversation from "does this feel like a 13-point story?" to "what actually physically needs to be done to make this happen?" Yes, it's a discovery method. The act of breaking it down exposes massive hidden risks that an abstract point system just entirely obscures. Like finding out you need a whole new database. Exactly. If breaking it down reveals that you need a brand new authentication system that hasn't been built yet, the answer isn't to guess harder. The answer is to investigate that gap. All right, but what happens when predictability is mission critical? Let's say you're dealing with regulatory compliance deadlines, fixed vendor contracts, or hardware that has to be ordered six months in advance. The heavy hitters. Right. You can't just do a lightweight breakdown when millions of dollars are locked into a hard date. That is where we deploy tier three. WBS-based estimation or work breakdown structure. WBS. I have to admit that sounds like a dirty word from the old waterfall days of project management. 
Agile purists usually run screaming from WBS. Oh, they do. It does carry heavy baggage, but the author strongly reclaims it for high-stakes environments. Okay, how so? WBS forces the team to start with the final deliverable and rigorously deconstruct it into major work streams and then into individual components until the work is small enough to be genuinely understood. It makes me think of planning a mission to Mars. You don't just sit in a circle and guess that getting to Mars is an extra large effort. Right, you'd never make it. That is a perfect analogy. You are mapping the precise structural reality of the challenge. And the real power of WBS isn't, like, the format of the spreadsheet, it's the cognitive shift it forces in the team. Which is? Well, remember earlier when we said story points compress complexity into a single number? WBS does the exact opposite. WBS expands complexity into a structured map. So what does this all mean? Let me make sure I've got this. You don't need a formal blueprint to assemble a basic bookshelf, that's no estimate or flow. Yep. You might sketch out a rough plan to build a shed in the backyard, that's lightweight breakdown. But you absolutely need a detailed structural breakdown if you're building a house with plumbing and electricity, and that's WBS. That is exactly it. Let's dig into that cognitive shift you mentioned, because I think this is the deepest insight in the entire newsletter. The idea that estimation isn't just about time, it's actually a diagnostic test for the health of your entire system. If we connect this to the bigger picture, this is where the magic of WBS happens. Okay. Imagine a team is asked to estimate a feature called Integrate New Customer Approval Workflow. In a standard agile refinement session, they might argue for 10 minutes about whether it's eight points or 13 points. And they eventually agree on eight, they write it down, and they move on. Exactly. The output is just an abstract number. Right. 
They guessed the size, but they didn't map the execution. But in a WBS discussion, the output is a brutal set of hard, concrete questions. The team is forced to mentally dry run the integration. Like which specific back-end systems are involved. Yes. Do we need to write a new API contract? Does this new workflow break our existing audit logs? Are there GDPR compliance requirements we have to clear? Exactly. Do we need to migrate 10 years of historical data to make the new workflow function? Who is writing the documentation for customer support? Wow. It forces them to stop abstract guessing and start concrete problem solving. It pulls all the invisible software complexity right out into the open. Exactly. And this leads to the golden rule of the source material. A hard-to-estimate system is often a hard-to-change system. Wait, say that again? A hard-to-estimate system is often a hard-to-change system. Think about what that WBS discussion just revealed. If your team cannot estimate a basic workflow change without realizing they have to involve five other teams, three external suppliers, and a legacy database nobody understands. Then your problem isn't that your developers are bad at estimating. Your problem is that your architecture is broken. It does not support independent change. The estimate is just the canary in the coal mine. It's screaming that your systems are too tangled to move quickly. This raises a critical question about management's psychological reaction to this data. When the team does the WBS and uncovers all this terrifying complexity, how do they communicate it responsibly without management just weaponizing it? Well, the source makes a very explicit distinction here between an estimate and a commitment. An estimate is the data. The commitment is a business decision based on that data. Right. The author gives an example of what a truly responsible estimate looks like. And it's never a single date, it's a range paired with explicitly stated assumptions. 
Yes. A responsible estimate sounds like this. This delivery is estimated at 10 to 14 weeks, assuming the supplier API is available by Sprint 3. The security compliance review takes no more than two weeks, and absolutely no historical data migration is needed. And if they have to migrate the historical data, the estimate is void and has to be recalculated. That is incredibly honest. It is. But why do managers so often strip those assumptions away and just demand 10 weeks? Again, it's the psychological need for control. A single number feels actionable. A range with caveats feels messy and evasive. But single numbers invite catastrophic misunderstanding. Ranges are mathematically and professionally honest. Management's job is to use this transparent information, the dependencies, the assumptions, the structural risks, to make strategic business decisions. Not to convert that systemic uncertainty into high pressure deadlines on the development team. Exactly. Not to convert uncertainty into pressure. If I could put that on a billboard outside every corporate headquarters, I would. Seriously? Oh, and we briefly skipped over earlier, but there was a bonus fourth tier the author mentioned for when even WBS fails. Yes. Exploration before estimation. So what happens there? Sometimes a request is so genuinely unprecedented or unclear that you can't even begin to break it down. When that happens, the correct professional response is not to pull a number out of thin air just to appease a stakeholder. Right. Don't just guess. The correct response is to do a spike or a prototype. Admitting you don't know enough yet to estimate isn't a weakness, it is the ultimate sign of professional maturity. I love that. So to bring it all home, the core takeaway here is that estimation will never ever remove uncertainty from complex work, and we need to stop pretending that it does. Absolutely. The goal isn't to guess faster or to guess with more false precision. 
The goal is to use the estimation process as a diagnostic tool to understand the work and the architecture it lives in better. And crucially, to use the lightest possible method, whether that's flow, a lightweight breakdown, or a full WBS, that still provides the level of understanding you need for the specific stakes at hand. Beautifully said. Before we leave you, please remember to click the link in the show notes to read and subscribe to the full Agile Software Engineering newsletter. Yeah, you really don't want to miss the visuals. We covered a lot of ground today, but the illustrations and the in-depth content we couldn't fit in are incredibly valuable. And if you enjoyed this deep dive, hitting the like button and subscribing to receive future issues is a free, fantastic way to support what we do. It means a lot to us. Now I want to leave you with a final thought to mull over. Okay, let's hear it. We've spent this entire time talking about software teams and enterprise architecture. But think about the tasks in your own personal or professional life that you constantly underestimate or consistently fail to complete on time. Oh boy, here we go. Are you simply bad at managing your time? Or is your personal architecture, your hidden dependencies, your daily interruptions, your undocumented mental load just too tangled and complex to allow for independent change? Oof, that hits hard. Next time someone asks me exactly how long is this gonna take, I'm gonna tell them I need to review my personal architecture first. Thanks for joining us on this deep dive. We'll catch you next time. If you found this episode valuable, feel free to share it with someone who might benefit. A colleague, your team, or your network. You can access all episodes by subscribing to the podcast and find their written counterparts in the Agile Software Engineering newsletter on LinkedIn. And if you have thoughts, ideas, or stories from your own engineering journey, I'd love to hear from you. 
Your input helps shape what we explore next. Thanks again for tuning in, and see you in the next episode.
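As a closing illustration of the "range plus explicit assumptions" idea from the episode, here is a minimal sketch of rolling WBS leaves up into a range estimate. The leaf names echo questions raised in the discussion, but the week figures, the class, and the function names are all invented for this example. Simply summing the low and high ends assumes the leaves run one after another, which is a simplification.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# A WBS leaf as (name, low_weeks, high_weeks). Figures are hypothetical.
Leaf = Tuple[str, int, int]


@dataclass
class RangeEstimate:
    low_weeks: int
    high_weeks: int
    assumptions: List[str] = field(default_factory=list)

    def describe(self) -> str:
        """Render the estimate as a range with its assumptions attached."""
        text = f"{self.low_weeks} to {self.high_weeks} weeks"
        if self.assumptions:
            text += ", assuming " + "; ".join(self.assumptions)
        return text

    def still_valid(self, broken_assumptions: List[str]) -> bool:
        """An estimate is void as soon as any stated assumption breaks."""
        return not any(a in broken_assumptions for a in self.assumptions)


def roll_up(leaves: List[Leaf], assumptions: List[str]) -> RangeEstimate:
    """Sum leaf ranges; assumes sequential work (a deliberate simplification)."""
    low = sum(lo for _, lo, _ in leaves)
    high = sum(hi for _, _, hi in leaves)
    return RangeEstimate(low, high, list(assumptions))


if __name__ == "__main__":
    leaves = [
        ("API contract", 2, 3),
        ("workflow integration", 4, 5),
        ("audit log changes", 2, 3),
        ("security compliance review", 1, 2),
        ("support documentation", 1, 1),
    ]
    assumptions = [
        "the supplier API is available by Sprint 3",
        "the security compliance review takes no more than two weeks",
        "no historical data migration is needed",
    ]
    estimate = roll_up(leaves, assumptions)
    print(estimate.describe())
    print(estimate.still_valid(["no historical data migration is needed"]))
```

Keeping the assumptions in the same object as the numbers is the design choice that matters here: the range cannot be quoted without them, and a broken assumption flags the estimate for recalculation rather than silently turning it into a missed deadline.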
