




Too Many Metrics and Not Enough Data

Capers Jones

The software industry is unique in having more metric variants than any other engineering discipline in history, combined with an almost total lack of conversion rules from one metric to another. As a result, producing accurate benchmarks of software productivity and quality is much harder than for any other engineering field.

Other engineering disciplines also have multiple metrics. For example, we have nautical miles, statute miles, and kilometers for measuring distance and speed. We have Fahrenheit and Celsius for measuring temperature. We have three methods for measuring the octane ratings of gasoline. However, other engineering disciplines have conversion rules from one metric to another.

The author has identified five distinct variations in methods for counting lines of code, and more than 20 distinct variations in methods for counting function points. There are no standard conversion rules between any of these variants, although there are some conversion rules between COSMIC and IFPUG function points.
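To make the idea of a conversion rule concrete, the following sketch shows what one would look like if it existed. The ratio used here is a hypothetical placeholder, not a published figure; real COSMIC-to-IFPUG studies report ratios that vary by project domain.

```python
# Sketch of a metric conversion rule of the kind the article says is
# mostly missing. ASSUMED_CFP_PER_UFP is a HYPOTHETICAL coefficient
# chosen for illustration only, not a published conversion factor.

ASSUMED_CFP_PER_UFP = 1.1  # assumed COSMIC points per IFPUG function point

def ifpug_to_cosmic(ifpug_fp: float) -> float:
    """Convert IFPUG function points to COSMIC points (illustrative only)."""
    return ifpug_fp * ASSUMED_CFP_PER_UFP

def cosmic_to_ifpug(cosmic_fp: float) -> float:
    """Inverse conversion (illustrative only)."""
    return cosmic_fp / ASSUMED_CFP_PER_UFP

print(round(ifpug_to_cosmic(1000), 1))
```

Even a simple linear rule like this would let benchmark records in one unit be restated in another; the article's point is that, for most of the variants listed below, no such validated rule has been published.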

Why Multiple Metrics Harm the Software Industry

Suppose you are a consultant who has been commissioned by a client to find data on the costs and schedules of producing a certain kind of software, such as a PBX switching system.

You scan benchmark databases and discover that data exists on 75 similar projects. You would like to perform a statistical analysis of the results for the client. But now the problems begin when trying to do statistical analysis of the 75 samples:

  1. Three were measured using lines of code, and counted physical lines.
  2. Three were measured using lines of code, and counted logical statements.
  3. Three were measured using lines of code, and did not state the counting method.
  4. Three were constructed from reusable objects and only counted custom code.
  5. Three were constructed from reusable objects and counted reuse + custom code.
  6. Three were measured using IFPUG function point metrics.
  7. Three were measured using COSMIC function point metrics.
  8. Three were measured using Full function points.
  9. Three were measured using Mark II function point metrics.
  10. Three were measured using FESMA function points.
  11. Three were measured using NESMA function points.
  12. Three were measured using unadjusted function points.
  13. Three were measured using Engineering function points.
  14. Three were measured using legacy data mining tools.
  15. Three were measured using Web-object points.
  16. Three were measured using Function points light.
  17. Three were measured using backfired function point metrics.
  18. Three were measured using Feature points.
  19. Three were measured using Story points.
  20. Three were measured using Use Case points.
  21. Three were measured using MOOSE metrics.
  22. Three were measured using goal-question metrics.
  23. Three were measured using TSP/PSP task hours.
  24. Three were measured using RTF metrics.
  25. Three were measured using pattern-matching function points.

As of 2010 there are no proven and effective conversion rules between any of these metrics. There is no effective way of performing a statistical analysis of results expressed in multiple metrics. Why the software industry has developed so many competing variants of software metrics is an unanswered sociological question.
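The consultant's dilemma can be sketched in a few lines. The records below are hypothetical, mirroring the situation above: many metric variants, with three projects measured under each one.

```python
from collections import Counter

# Hypothetical benchmark records: each project is tagged with the
# metric variant used to measure it (a shortened list for brevity).
metric_variants = [
    "LOC (physical lines)", "LOC (logical statements)", "IFPUG FP",
    "COSMIC FP", "Mark II FP", "Story points", "Use-case points",
]
records = [(variant, f"project-{i}")
           for variant in metric_variants for i in range(3)]

samples_per_metric = Counter(variant for variant, _ in records)
for variant, n in samples_per_metric.items():
    print(f"{variant:26s} {n} projects")

# Each bucket holds only three projects -- far too few for statistical
# significance -- and without conversion rules the buckets cannot be
# pooled into a single sample of 75.
```

The aggregation itself is trivial; the problem is that the buckets are measured in mutually incommensurable units, so no honest pooled analysis is possible.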

Developers of new metrics almost always fail to provide conversion rules between their new version and older standard metrics such as IFPUG function points. In the author’s view it is the responsibility of the developers of new metrics to provide conversion rules to older metrics.

It is not the responsibility of organizations such as IFPUG to provide conversion rules to scores of minor metrics and minor variations in counting practices. As of 2010 the plethora of ambiguous metrics is slowing progress towards a true economic understanding of the software industry.

Function Point Metrics for Activity-Based Analysis

Many metrics are useless for economic comparisons. For example, story points work only for projects that utilize user stories, and use-case points work only for projects driven by use cases.

Function point metrics are general-purpose metrics that span all methods and all phases and activities. This is why the major benchmark groups such as the International Software Benchmarking Standards Group (ISBSG) support only function point metrics. Table 1 illustrates the versatility of function points by showing typical results for 10 activities for the PBX switch discussed at the start of this article.

Table 1. Example of 10 Activities Measured with Function Points

As of 2010 function points are the only available metric that can be used across all activities and across all types of software. Neither lines of code, story points, nor use-case points can be used to create similar tables.

That being said, all of the other forms of function points share the same versatility as IFPUG function points. That is, tables similar to Table 1 could also be created using COSMIC function points, NESMA function points, function points light, and all of the other functional metric variants.
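Activity-based analysis of the kind Table 1 illustrates can be sketched as follows. The project size, the activity list, and the hours-per-function-point rates are assumed values chosen for demonstration, not data from the article.

```python
# Illustrative activity-based analysis in the spirit of Table 1.
# SIZE_FP and the rates in HOURS_PER_FP are ASSUMED values for
# demonstration, not figures from the article or any benchmark.

SIZE_FP = 1500.0        # hypothetical PBX project size in function points

HOURS_PER_FP = {        # assumed work hours per function point, by activity
    "Requirements":  0.75,
    "Design":        1.00,
    "Coding":        4.00,
    "Testing":       3.00,
    "Documentation": 0.50,
}

total_hours = 0.0
for activity, rate in HOURS_PER_FP.items():
    hours = SIZE_FP * rate
    total_hours += hours
    print(f"{activity:14s} {hours:10.0f} hours")
print(f"{'TOTAL':14s} {total_hours:10.0f} hours")
```

Because every activity is expressed in the same normalizing unit (work hours per function point), per-activity results can be compared and summed across the whole life cycle, which is exactly the versatility the article attributes to function points.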


As of 2010 the software industry has far too many metrics combined with a serious shortage of actual data on software project costs, schedules, quality, and other quantitative results. Function point metrics provide a general-purpose and versatile metric that can be used to measure every activity in every kind of software project. No other metrics have this kind of versatility.♦


References

Garmus, David and Herron, David; Function Point Analysis: Measurement Practices for Successful Software Projects; Addison Wesley Longman, Boston, MA; 2001; 363 pages.

International Function Point Users Group (IFPUG); IT Measurement: Practical Advice from the Experts; Addison Wesley Longman, Boston, MA; 2002; 759 pages.

Hill, Peter; "The ISBSG Body of Knowledge and Its Uses"; ISBSG; Hawthorne, Victoria, Australia.

Jones, Capers; Applied Software Measurement, 3rd edition; McGraw Hill, New York; 2008; 668 pages.


About the Author

Capers Jones is currently the chairman of Capers Jones & Associates, LLC. He is also the founder and former chairman of Software Productivity Research, LLC (SPR), where he holds the title of Chief Scientist Emeritus. He is a well-known author and international public speaker, and has authored many books including Software Engineering Best Practices: Lessons from Successful Projects in the Top Companies and Applied Software Measurement, 3rd edition. Jones and his colleagues from SPR have collected historical data from more than 600 corporations and more than 30 government organizations. This historical data is a key resource for judging the effectiveness of software process improvement methods. The total volume of projects studied now exceeds 12,000.
