Generalised Regression
Regression problems can be classified according to the complexity of the error structure, and so-called "generalised regression" problems are those where there is correlation between the errors and/or where more than one measured variable is subject to error.
Introduction
Experimental data analysis is a key activity in metrology. The first stage involves building a model of the physical system as a set of mathematical equations that describe all the relevant aspects of the system. The model specifies how the system is expected to respond to input data.
At the second stage, the model-solving (or regression) problem is: given measurement data, determine estimates of the model parameters by solving the equations constructed as part of the model, taking into account the error structure of the measurement data. In general, this requires the development of an algorithm to determine values of, and uncertainties in, the parameters that best explain the data. A common example of regression within metrology is the determination of the relationship between a control variable and a response variable in the calibration of an instrument.
An Example - Natural Gas Analysis
An example of a generalised regression problem occurs in the analysis of natural gas mixtures. Cylinders containing natural gas are prepared gravimetrically to contain known compositions of each of 11 constituent components. Given:
- a number of primary standard natural gas mixtures containing known concentrations of one of the constituent components (e.g., CO2);
- the detector response for each mixture;
- the detector response for a new mixture;
we wish to determine the concentration of CO2 in the new mixture.
Solving the Problem
One approach to solving this problem is, firstly, to use the calibration data (relating to the primary gas mixtures) to calibrate the detector and, secondly, to apply the new measurement to the calibration curve so constructed in order to predict the concentration in the new mixture.
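The two steps of this approach can be sketched as follows. This is a minimal illustration only, assuming a straight-line calibration curve fitted by ordinary least squares, which ignores the error structure of the data. The function names and the numerical values in the usage note are hypothetical and are not taken from the NPL software.

```python
import numpy as np


def fit_calibration_line(concentrations, responses):
    """Ordinary least-squares fit of response = a + b * concentration.

    Assumes (for illustration) that a straight line is an adequate
    calibration curve and that only the responses are subject to error.
    """
    x = np.asarray(concentrations, dtype=float)
    y = np.asarray(responses, dtype=float)
    # Design matrix with a column of ones (intercept) and the concentrations.
    design = np.column_stack([np.ones_like(x), x])
    (a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    return a, b


def predict_concentration(a, b, new_response):
    """Invert the fitted calibration line to estimate a concentration."""
    return (new_response - a) / b
```

For example, with made-up standards at concentrations 1, 2, 3 and 4 and responses 2.5, 4.5, 6.5 and 8.5, `fit_calibration_line` returns an intercept and slope that `predict_concentration` can apply to the response for a new mixture. The sketch does not propagate any uncertainty to the predicted concentration.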
However, the calibration data are known inexactly. The process of preparing the primary standards involves measurement error, and the errors in the standards are correlated: for example, the gravimetric process used to prepare the standard mixtures involves comparing each standard mixture against calibrated masses selected from a common set of masses. The data returned by the detector (which is based on the analytical technique of chromatography) are also subject to measurement error. Consequently, in the data analysis we need to account for the inexactness of the measurement data, and to quantify the resulting uncertainty associated with the final measurement result.
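To make the error structure described here concrete, the sketch below builds a covariance matrix for the standards in which a shared effect (standing in for the common set of masses) correlates every pair of standards, and then fits a straight line allowing for errors in both the concentrations and the responses. It is an illustrative Gauss-Newton scheme written for this page, not the SSfM algorithm. The function names, the straight-line model, the additive form of the shared effect and the assumption that concentration and response errors are mutually uncorrelated are all assumptions. Parameter uncertainties are not computed.

```python
import numpy as np


def standards_covariance(independent_sd, common_sd):
    """Covariance of the standards: independent parts plus one shared effect.

    The shared term adds common_sd**2 to every entry, so all pairs of
    standards are correlated (an assumed, simplified error model).
    """
    s = np.asarray(independent_sd, dtype=float)
    return np.diag(s**2) + common_sd**2 * np.ones((s.size, s.size))


def fit_line_correlated(x, y, cov_x, cov_y, iterations=50, tol=1e-12):
    """Fit y = a + b*x with errors in both x and y (covariances cov_x, cov_y).

    Unknowns are a, b and the 'true' concentrations (latent). The weighted
    sum of squares of the deviations in x and y is minimised by Gauss-Newton.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = x.size
    zeros = np.zeros((m, m))
    # Joint covariance, assumed block diagonal, and its Cholesky factor.
    chol = np.linalg.cholesky(np.block([[cov_x, zeros], [zeros, cov_y]]))
    # Start from the ordinary least-squares line and the measured x values.
    b, a = np.polyfit(x, y, 1)
    latent = x.copy()
    for _ in range(iterations):
        # Unweighted deviations in x (top) and y (bottom).
        raw = np.concatenate([x - latent, y - a - b * latent])
        # Jacobian of the deviations with respect to (a, b, latent).
        jac = np.zeros((2 * m, m + 2))
        jac[:m, 2:] = -np.eye(m)
        jac[m:, 0] = -1.0
        jac[m:, 1] = -latent
        jac[m:, 2:] = -b * np.eye(m)
        # Whiten with the Cholesky factor, then solve the linearised problem.
        step, *_ = np.linalg.lstsq(
            np.linalg.solve(chol, jac), -np.linalg.solve(chol, raw), rcond=None
        )
        a, b, latent = a + step[0], b + step[1], latent + step[2:]
        if np.linalg.norm(step) < tol:
            break
    return a, b, latent
```

A call might pass `cov_x = standards_covariance([0.01, 0.01, 0.01, 0.01], 0.02)` together with a diagonal `cov_y` for the detector responses; these values are placeholders. Forming the joint covariance as a dense matrix is acceptable for a handful of standards, whereas a production routine would exploit its structure.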
Approaches to regression problems with errors in one variable (e.g., the response variable) are well known within metrology, whereas those for generalised regression are not. As part of the SSfM discrete modelling theme, algorithms and software for use by metrologists are being developed to solve such problems. The software will combine structure-exploiting linear algebra and numerically stable components, and it is hoped that metrologists will be able to use these routines with the same confidence and effectiveness that they currently enjoy with standard routines available in numerical libraries.
Find out more
- "The classification and solution of regression problems for calibration", NPL Report CMSC 24/03
