Testing the Numerical Correctness of Scientific Software
Sample standard deviation and straight-line linear regression functions have been tested in a number of spreadsheet, statistical and scientific software packages used in metrology, revealing differing reliabilities of the various packages.
Two Software Support for Metrology (SSfM) reports (CMSC 07/00 and CMSC 08/00) deal with testing the numerical correctness of scientific software used in metrology. Each report focuses on a particular numerical calculation and presents the results of testing functions for that calculation taken from a number of spreadsheet, statistical and scientific software packages used in metrology. The calculations considered are, respectively, sample standard deviation and straight-line linear regression, and the software packages covered are Excel, MathCAD, S-PLUS, MATLAB, the NAG Fortran library and the IMSL Fortran library.
The figure illustrates an example of the results obtained from testing the functions for sample standard deviation. A sequence of graded reference data sets is generated, where each data set has the same reference sample standard deviation but a larger sample mean than the previous set in the sequence. The sequence therefore mimics measurement data sets with increasing signal-to-noise ratio. Each reference data set is supplied as input to each test function, and the result returned is compared with the reference result. The comparison uses a performance measure: the number of significant figures of accuracy lost by the test function over and above the number that a reference algorithm for the calculation can be expected to lose. The figure shows, in the form of performance profiles, the values of the performance measure for each test function across the sequence of reference data sets. A near-flat performance profile is good: the test function loses little or no more accuracy than the reference algorithm. A strongly increasing profile is bad: the test function loses more and more accuracy as the sample mean grows relative to the standard deviation.
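As a rough illustration of the procedure just described, the following Python sketch builds a graded sequence of data sets and evaluates a performance profile for one test function. Everything specific here is an assumption made for the sketch and is not taken from the reports: the function names, the three-point data sets, the use of mean divided by standard deviation as a stand-in for the condition number, and the exact form of the performance measure.

```python
import math
import statistics
import sys

REFERENCE_STD = 1.0  # every data set below has exactly this sample standard deviation


def graded_data_sets(levels=8):
    """Yield (mean, data) pairs with the same spread but a growing mean.

    The deviations -1, 0, +1 give a sample variance of exactly 2 / (3 - 1) = 1,
    and all values stay exactly representable in double precision.
    """
    for k in range(levels):
        mean = float(10 ** (2 * k))  # 1, 1e2, 1e4, ..., 1e14
        yield mean, [mean - 1.0, mean, mean + 1.0]


def lost_figures(computed, reference, condition):
    """Assumed performance measure: decimal figures lost beyond those a
    stable reference algorithm would be expected to lose."""
    relative_error = abs(computed - reference) / abs(reference)
    return math.log10(1.0 + relative_error / (condition * sys.float_info.epsilon))


def performance_profile(std_function, levels=8):
    """Evaluate the performance measure over the graded sequence."""
    profile = []
    for mean, data in graded_data_sets(levels):
        # Crude proxy for the conditioning of the problem (an assumption).
        condition = max(1.0, abs(mean) / REFERENCE_STD)
        profile.append(lost_figures(std_function(data), REFERENCE_STD, condition))
    return profile


if __name__ == "__main__":
    for level, value in enumerate(performance_profile(statistics.stdev)):
        print(f"data set {level}: {value:.2f} figures lost")
```

Any function that maps a list of numbers to a sample standard deviation can be passed in place of `statistics.stdev`; a flat profile near zero corresponds to the "good" behaviour described above.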
The results illustrated in the figure indicate that the IMSL, NAG, MATLAB, S-PLUS and MathCAD packages provide reliable results for these reference data sets (although MathCAD performs slightly worse than the others), whereas the performance of the function provided by Excel degrades seriously. The results suggest that the Excel function implements an unstable formula for calculating the sample standard deviation, whereas the other packages implement stable formulae for this calculation. An implementation of a common unstable formula for the standard deviation reproduced the Excel results exactly for the cases examined.
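The difference between an unstable and a stable formula can be sketched in a few lines of Python. The source does not say which unstable formula was used, so the one-pass "sum of squares minus square of the sum" formula below is an assumption, chosen because it is a common textbook form; the two-pass formula stands in for a stable alternative. The function names and the test data are hypothetical.

```python
import math


def stdev_one_pass(data):
    """One-pass formula: s^2 = (sum(x^2) - (sum(x))^2 / n) / (n - 1).

    It subtracts two nearly equal large numbers when the mean is much
    larger than the spread, so rounding errors can dominate the result.
    """
    n = len(data)
    sum_x = sum(data)
    sum_sq = sum(x * x for x in data)
    variance = (sum_sq - sum_x * sum_x / n) / (n - 1)
    # Cancellation can even make the computed variance negative; clamping
    # to zero is a choice made for this sketch only.
    return math.sqrt(max(variance, 0.0))


def stdev_two_pass(data):
    """Two-pass formula: compute the mean first, then sum squared deviations."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))


small = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
large = [1e12 - 1.0, 1e12, 1e12 + 1.0]  # exact sample standard deviation is 1

if __name__ == "__main__":
    for name, data in (("small mean", small), ("large mean", large)):
        print(name, stdev_one_pass(data), stdev_two_pass(data))
```

For the data set with a small mean the two formulae agree, while for the data set with a large mean only the two-pass formula recovers the correct value of 1.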
This tutorial is an abridgement of an article that first appeared in Counting on IT Issue 10.