The May/June 2006 issue of IEEE Software published an interesting article that analyzed the estimation results of an extensive set of projects from Landmark Graphics. The author, Todd Little, analyzed the relationships between estimated outcomes and actual outcomes. Based on his data, he concluded that the 80% confident range of estimates did not reduce as the Cone of Uncertainty implies, but that the estimates continued to vary by about a factor of 3-4 for the remaining work on the project -- regardless of when in the project the estimate was created.
There are some interesting takeaways from the article's data. Some of the article's conclusions are supported by that data, whereas others are not. The basic issue is that the data represents estimation accuracy as estimation commonly occurs in practice rather than estimation accuracy when estimation is done well.
Figure 5 in Little's article is particularly interesting:

Figure 5 from "Schedule Estimation and Uncertainty Surrounding the Cone of Uncertainty."
Figure 5 shows a scatter plot of estimates created at different points in a project's duration. The scatter plot forms a near perfect cone--but only the half of the Cone that represents underestimation! There is only a tiny scattering of points that represent overestimation (those below the 1.0 line). As a view of estimation in practice, this is consistent with data my company has seen from many of our clients. It supports the conclusion that the software industry doesn't have a neutral estimation problem; it has an underestimation problem. (This is my conclusion, not the article's.)
The article's conclusions about the Cone of Uncertainty are less well supported. With reference to Figure 5, Little makes the observation that it forms a visual Cone, but only because the graph plots "estimated remaining duration" vs. "current position in the schedule." He points out that, since the duration remaining decreases as the project progresses, smaller estimation errors later in a project are not necessarily better. For the improved estimates to be accurate (i.e., for the Cone to be true), the estimates would need to be more accurate on a percentage-remaining basis, not just have a smaller absolute error. That analysis is all correct as far as I am concerned.
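The distinction between absolute and relative error can be made concrete with a small sketch. The numbers below are hypothetical (a 400-day project whose remaining work is always underestimated by a factor of 2) and are not taken from the Landmark data; the function name `remaining_errors` is also just an illustration.

```python
def remaining_errors(actual_total_days, fractions_done, underestimation_ratio):
    """Return one row per checkpoint:
    (fraction done, actual remaining, estimated remaining,
     absolute error in days, actual/estimated ratio).

    Assumes, hypothetically, that every estimate of remaining work is
    too low by the same constant factor.
    """
    rows = []
    for done in fractions_done:
        actual_remaining = actual_total_days * (1 - done)
        estimated_remaining = actual_remaining / underestimation_ratio
        absolute_error = actual_remaining - estimated_remaining
        relative_ratio = actual_remaining / estimated_remaining
        rows.append((done, actual_remaining, estimated_remaining,
                     absolute_error, relative_ratio))
    return rows


if __name__ == "__main__":
    # Hypothetical project: 400 days actual duration, estimates off by 2x.
    for row in remaining_errors(400, [0.0, 0.5, 0.75], 2.0):
        done, actual, estimated, abs_err, ratio = row
        print(f"{done:>5.0%} done: actual {actual:5.0f}d, "
              f"estimated {estimated:5.0f}d, "
              f"abs error {abs_err:5.0f}d, ratio {ratio:.1f}")
```

In this sketch the absolute error shrinks from 200 days to 50 days as the project progresses, which plotted against schedule position would look like a narrowing cone, yet the ratio of actual to estimated remaining work stays at 2.0 throughout. A narrowing plot of absolute error, by itself, says nothing about whether estimates are improving.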
The article then goes on to point out that the relative error of the Landmark estimates didn't actually decrease, and concludes:
"While the data supports some aspects of the cone of uncertainty, it doesn’t support the most common conclusion that uncertainty significantly decreases as the project progresses. Instead, I found that relative remaining uncertainty was essentially constant over the project’s life."
There are two reasons that this particular conclusion can't be drawn from Landmark's underlying data.
First, the article misstates the "common conclusion" about the Cone. As I’ve emphasized when I’ve written about it, the Cone represents best-case estimation accuracy; it’s easily possible to do worse—as many organizations have demonstrated for decades. Anyone who's ever worked on a project that got to "3 weeks from completion," and then slipped 6 weeks, and then got to "3 weeks from completion" again, and then slipped another 6 weeks, knows that uncertainty doesn't automatically decrease as a project progresses. The Cone is a hope, but not a promise. Little's data simply says that the estimates in the Landmark data set weren't very accurate. It's interesting to have this data put into the public eye, but it doesn't tell us anything we didn't already know. It tells us that software projects are routinely underestimated by a lot, and that projects aren't necessarily estimated any better at the end than they were at the beginning. That's a useful reminder, as long as we don't stretch the conclusions beyond what the underlying data supports.
The second problem with the conclusion the article draws about the Cone is that it doesn’t account for the effect of iterative development. Although it isn't stated in the published article, an earlier draft of the article, circulated on the Internet in mid 2003, emphasized that the projects in the data set were using agile practices, and in particular that they emphasized responding to change over performing to plan. In other words, the projects in this data set experienced significant requirements churn.
If the projects averaged 329 days as the article says, and if they followed agile practices as Little described in the 2003 version, there could easily be five to 10 iterations within each project. But the Cone applies to single iterations of the requirements-design-build-test process. For an analysis of the Cone of Uncertainty to be meaningful in a highly iterative context, the article would need to account for the effect of iteration on the Cone by looking at each iteration separately -- that is, by looking at 1-2 month iterations rather than looking at 329-day-long projects. The 329-day-long projects are essentially sequences of little projects, so the way the Cone of Uncertainty applies in this case is that there isn't one big 329-day Cone; there are 6-12 1-2 month Cones instead. Unfortunately, the article doesn't present the iteration data; it presents only the rolled-up 329-day data, which is meaningless in terms of drawing any conclusions about how the Cone affects estimation accuracy over the course of a project.
The fact that requirements were treated in a highly iterative way also forces a reexamination of Figure 5. While it makes sense initially to treat Figure 5 as evidence of systemic underestimation, that conclusion can't be drawn either. The requirements changed significantly over the course of the average 329-day project, so whatever was delivered at the end of the project was not the same thing that was estimated at the beginning. That makes comparing the early-project estimates with the late-project estimates an apples-to-oranges comparison, i.e., not meaningful.
Little makes an interesting comment at the end of the article that I think is a good takeaway overall. He points out that some of the variation in estimation accuracy was due to "a corporate culture using targets as estimates." Figure 5 might not provide a meaningful view of estimation accuracy, but it can certainly be interpreted as an indication that projects tend to set aggressive targets and then repeatedly fail to meet those targets. That's something we already knew, too, but it's good to have a reminder, and it's good to see that reminder supported with some data.
Resources