Function Points are a well-known metric for software sizing. If you are unfamiliar with the concept, take a look at Wikipedia’s entry for Function Point. In this post we argue that Function Points are an inadequate tool for estimating modern applications, and contemporary web applications in particular.
The Reference Project
To make the discussion more concrete, we relate it to a specific project, characterized as follows:
Project: a web-based application.
Process: eXtreme Programming (XP) with aggressive one-week iterations (the only practice left out was pair programming).
Development Environment: Ruby on Rails (RoR).
Benchmark Reference: A module required to produce a report generated from calculations performed on data extracted from a database.
Estimation Method: Wide-band Delphi estimation via Story Points on a Fibonacci scale, as described in [COHN-2005]. Story points represented idealised effort, not time.
Estimate: 3 iterations, based on past velocity and release plan.
Actual: 9 iterations.
Retrospective: Delays due to underlying calculations/business rules that the client was not able to verbalise until after seeing the working code delivered at the end of each iteration. The client undervalued the complexity of the business rules; they were just taken for granted due to habitual work processes.
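The gap between the estimate and the actual figure can be expressed with the usual velocity arithmetic. The sketch below is only illustrative: the backlog size (60 story points) and the velocity (20 points per iteration) are hypothetical values chosen to reproduce the 3-iteration estimate, and the function names are ours. Only the 3 and the 9 come from the project.

```python
import math


def iterations_needed(total_story_points, velocity):
    """Iterations required to burn down a backlog at a given velocity."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(total_story_points / velocity)


def overrun_factor(estimated_iterations, actual_iterations):
    """How many times longer the work took than was estimated."""
    return actual_iterations / estimated_iterations


# Hypothetical backlog and velocity, picked only to match the 3-iteration estimate.
planned = iterations_needed(total_story_points=60, velocity=20)

# 9 is the actual number of iterations reported for the reference project.
print(planned, overrun_factor(planned, actual_iterations=9))
```

Under these assumptions the module took three times as long as planned, which is the discrepancy the retrospective attributes to business rules that surfaced late.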
Function Point Analysis
After counting elements from the project’s final source code, a retrospective Function Point Analysis estimate looked as follows:
Since the elements were counted after the fact, the values shown in the table are very different from what would have been expressed at the beginning of the project, despite the fact that Function Points are thought of as a means of complete, up-front estimating. There is no easy way to go back and repeat the estimate under the original conditions. However, the table presented above is sufficient to support the discussion that follows.
The final sum of 1,054 function points is surprisingly high. Either the team was hyper-productive, being approximately 28 times more productive than industry averages, or the above count of function points is completely wrong.
The Chronologist favours the hypothesis of hyper-productivity (because, in general, that is the outcome of projects driven by him). That notwithstanding, we still think that using Function Points is wrong in general, as we explain below.
Counting What? And What Not?
The countable elements of Function Point Analysis (inputs, outputs, queries, files and interfaces) that underlie the determination of Function Points seem straightforward to enumerate. However, things have changed since Function Points were conceived in the late 1970s. As [LAIRD-2006] put it:
Many of the terms and factors are from the 1970s and 1980s, and are difficult to understand in the context of today’s systems.
Even though efforts have been made to keep Function Points up to date, knowing exactly what to count is not straightforward. For instance, in the reference project there were no traditional “files” to be counted; instead we resorted to counting Ruby on Rails “models” (or, equivalently, the underlying SQL tables), which can be treated as an approximation of a traditional data file.
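To show what is at stake in deciding what to count, here is a minimal sketch of an unadjusted function point tally. It assumes the average-complexity weights commonly quoted for IFPUG-style counting (4, 5, 4, 10 and 7) and ignores the low and high complexity bands. The counts are invented for illustration and are not the reference project’s figures; the function and key names are ours.

```python
# Average-complexity weights commonly quoted for IFPUG-style counting.
# A real count would rate each element as low, average or high complexity.
AVERAGE_WEIGHTS = {
    "external_inputs": 4,
    "external_outputs": 5,
    "external_inquiries": 4,
    "internal_logical_files": 10,
    "external_interface_files": 7,
}


def unadjusted_function_points(counts, weights=AVERAGE_WEIGHTS):
    """Sum of (number of elements * weight) over the five element types."""
    unknown = set(counts) - set(weights)
    if unknown:
        raise ValueError(f"unknown element types: {sorted(unknown)}")
    return sum(weights[kind] * number for kind, number in counts.items())


# Hypothetical counts, NOT taken from the reference project.
example_counts = {
    "external_inputs": 12,
    "external_outputs": 8,
    "external_inquiries": 5,
    "internal_logical_files": 6,  # e.g. Rails models standing in for "files"
    "external_interface_files": 2,
}

print(unadjusted_function_points(example_counts))  # 48 + 40 + 20 + 60 + 14 = 182
```

Because each “file” carries the heaviest weight, the decision to treat every Rails model as a file moves the total more than any other counting choice in this sketch.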
The countability problem has puzzled many researchers. Lacking any better guidance, the approach we adopted to count Ruby on Rails models was inspired by [LAVAZZA-2008] and [LEVESQUE-2008], where UML models were used as the basis for counting. The countability problem for web applications has been explored by [COSTAGLIOLA-2006], where the Function Point measures were complemented with other “length” measures (number of pages, number of media, number of client- and server-side scripts). If nothing else, this can be taken as evidence that the Function Point approach is insufficient for measuring modern web applications, and that additional metrics are required. Earlier evidence of this can be found in [REIFER-2000]:
Many professionals would like to use the more traditional processes, metrics, and models for estimating Web projects. […] these traditional approaches do not seem to address the challenges facing the field. The major concern estimators face is in estimating size, because size drives most of their models. In response, they need new size metrics to accurately scope the work involved in projects that currently cannot be accurately estimated using SLOC and FPs.
(In fact, in that same paper, Reifer proposes alternative metrics based on “web objects.”)
The Technology Does NOT Count
One aim of Function Points is to measure functionality independently of technology. This is a fallacy, because it fails in practice: no project investment appraisal can be undertaken without considering the underlying technology and its relative strengths and weaknesses. The authors of the Function Point approach seem to have been aware of this weakness, and tried to “factor in” technology aspects through the 14 environmental factors. [KITCHENHAM-1997] states:
The ordinal measures are added together and used to weight the raw function point count. Not only are the measures treated incorrectly, but there is no evidence that they improve any predictive models involving function points.
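The adjustment that Kitchenham criticises can be sketched as follows. This assumes the commonly cited form of the adjustment, in which each of the 14 factors is rated from 0 to 5 and the multiplier is 0.65 plus 0.01 times the sum of the ratings. The function names and the raw count of 1,000 are hypothetical.

```python
def value_adjustment_factor(ratings):
    """Commonly cited adjustment: 0.65 + 0.01 * (sum of 14 ratings, each 0..5)."""
    if len(ratings) != 14:
        raise ValueError("expected exactly 14 ratings")
    if any(not 0 <= rating <= 5 for rating in ratings):
        raise ValueError("each rating must be between 0 and 5")
    # The ratings are ordinal, yet they are summed as if they were quantities.
    return 0.65 + 0.01 * sum(ratings)


def adjusted_function_points(unadjusted, ratings):
    """Raw count weighted by the adjustment factor."""
    return unadjusted * value_adjustment_factor(ratings)


raw_count = 1000  # hypothetical raw function point count

lowest = adjusted_function_points(raw_count, [0] * 14)
highest = adjusted_function_points(raw_count, [5] * 14)
print(round(lowest), round(highest))
```

With these assumptions the same raw count can land anywhere between 65% and 135% of its value, depending entirely on fourteen subjective ratings that are added together as though they were measurements.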
In a contemporary setting, estimating size in terms of technology-independent functionality is no longer sufficient. Choice of technology can actually determine what is “functionally” possible. Furthermore, if such technology can contribute to achieving hyper-productivity (as in the case of Ruby on Rails), and increase team velocity and organisational throughput, it cannot be ignored. Function Points simply do not take these factors into consideration.
The only real, tangible, value-adding artefact of any software development project is delivered, fully working code, in whatever programming language. Any attempt at measuring a project’s size without considering the underlying raw material (programming language, technology, application stacks, etc.) will therefore not be meaningful, but illusory. While the intent of finding a technology-independent way of sizing a software project does, on the surface, appear compelling, in reality it is a major shortcoming.
The User Experience Does NOT Count
Estimating size in terms of functionality alone is no longer sufficient. Unlike the late 1970s, when green-text terminals were the high-tech user interface, today’s advanced browser-based applications have to function equally well on a mobile device and on a 30-inch plasma display, so the end-user experience has to be accounted for. Function Points come from an era when COBOL and batch processing were mainstream, but contemporary “applications do more than transform inputs to outputs” [REIFER-2000].
Functionality alone does not account for all the effort. How that functionality is packaged, dressed up and presented accounts for a lot of additional effort that is not considered by Function Points. The work of user interface designers, graphic designers, and even sound/video designers contributes to the making of a compelling user experience, which can fundamentally alter the amount of effort needed to deliver the final project and critically influence its final success.
Developers Do NOT Count
At the end of [SPOLSKY-2007] there is a statement generally accepted in contemporary software development: “Only the programmer doing the work can create the estimate.” Function Points deliberately ignore the programmer’s voice in establishing the estimates. Some Function Point proponents go as far as saying that “Anyone can count function points!” This disqualifies Function Points entirely in respect of what is considered a sound practice in contemporary software development.
On the surface, Function Points are similar to Story Points. Story Points also aim to measure the effort needed to deliver functionality, but the very significant difference is that in Agile approaches only the developers are allowed to express the estimates [COHN-2005], and therefore any technology impact is factored into their expert opinions. Deep and knowledgeable expert consideration of the underlying technology is intrinsically part of the estimate in Story Points, while Function Points discount technical considerations and make up for the deficiency with magic numbers: the multipliers, weights and adjustment factors that have to be tweaked to make the estimates better fit the real world.
Math Does NOT Count
In the vitriolic article [JOHNSON-2004], it is made very clear that Function Points are ratings, not measurements, because of the subjective manner in which they are derived.
Hence, they cannot be manipulated mathematically. And yet the software literature is rife with examples of researchers attempting to do just that. […] The notion that FPs can participate in mathematical calculations, and thereby be used for scheduling, effort and productivity measures, is without theoretical or empirical basis.
Function Points, despite the many papers written about them and their status as an approved ISO standard, are not convincing. If you have to make a hard choice between Function Points and any other evidence-based estimation metric, the latter is more likely to work better. Function Points are a relic of the past; they are based on perceptions and opinions; they use many magic numbers and correction factors; they do not consider technology and its consequences on the project delivery life-cycle; they discount user experience and its development; they ignore the expert opinion of programmers; they are not based on math or real metrics. In short: Function Points are Fantasy Points!