
Model Solving and Estimators

Model solving is the process of determining estimates of the model parameters from the measured data by solving the mathematical equations constructed as part of the model. In general, this involves developing an algorithm that will determine the values for the parameters that best explain the data. These algorithms are often referred to as estimators.

Approximation from a space of models

The space of models attempts to characterise all possible (or probable) behaviour of a particular type of system, e.g., the ways in which a response variable could vary with its covariates. Model solving is the process of determining, from data gathered from a measurement system, a particular model that adequately represents the system behaviour. Constructing the model space is concerned with defining where we should look to explain the behaviour; model solving is concerned with selecting the best candidate from the options defined by the model space.

If the members of the model space are described by parameters a and the measurement data z is regarded as being generated by a system specified by parameters a*, then model solving amounts to providing an estimate of a* from z. A scheme for determining such an estimate from data we term an estimator.
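To make this concrete, the following Python sketch (the straight-line model, data values and names are illustrative assumptions, not part of the original text) treats an estimator as a function A that maps measured data z = (x, y) to an estimate of the parameters a* of the system that generated the data.

    import numpy as np

    # Hypothetical example: the model space is the set of straight lines
    # y = a1 + a2*x, parametrised by a = (a1, a2). The estimator A maps
    # measured data z = (x, y) to an estimate of the true parameters a*.

    def estimator_A(x, y):
        """Least-squares estimator: returns the line parameters that
        best explain the data z = (x, y)."""
        design = np.column_stack([np.ones_like(x), x])  # model basis
        a_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
        return a_hat

    # Data generated by a system with (in practice unknown) parameters a*.
    rng = np.random.default_rng(1)
    a_star = np.array([1.0, 2.0])
    x = np.linspace(0.0, 1.0, 20)
    y = a_star[0] + a_star[1] * x + rng.normal(0.0, 0.05, x.size)

    print(estimator_A(x, y))  # estimate of a* recovered from z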

Error functions and approximation norms

In general, estimators are defined using an error function F(a|z) that provides some measure of how well the data z matches the model behaviour specified by a. The estimate of a* is provided by (the estimate of) the minimiser of F(a|z), i.e., a point at which F takes a minimum value. Different estimators are associated with different error functions. Error functions are usually constructed to provide an aggregate measure of goodness of fit taking into account all the measurement data. These error functions are often related to approximation norms and the least-squares estimator is one of a family of estimators derived from such norms.
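As an illustrative sketch (the line model and the Python code below are assumptions for demonstration, not taken from the original text), two members of this family can be obtained by minimising error functions built from the L2 and L1 norms of the residuals; the minimiser of the L2 error function is the least-squares estimate.

    import numpy as np
    from scipy.optimize import minimize

    # Hypothetical line model y = a1 + a2*x with residuals
    # r_i(a) = y_i - (a1 + a2*x_i).
    def residuals(a, x, y):
        return y - (a[0] + a[1] * x)

    # Error functions derived from approximation norms: the least-squares
    # estimator minimises the L2 norm of the residuals; the L1 norm gives
    # a different (more outlier-resistant) member of the same family.
    def F_L2(a, x, y):
        return np.sum(residuals(a, x, y) ** 2)

    def F_L1(a, x, y):
        return np.sum(np.abs(residuals(a, x, y)))

    rng = np.random.default_rng(2)
    x = np.linspace(0.0, 1.0, 30)
    y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1, x.size)

    # Each estimate is the minimiser of the corresponding F(a|z);
    # Nelder-Mead is used for the non-smooth L1 error function.
    a_l2 = minimize(F_L2, x0=[0.0, 0.0], args=(x, y)).x
    a_l1 = minimize(F_L1, x0=[0.0, 0.0], args=(x, y), method="Nelder-Mead").x
    print(a_l2, a_l1)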

Estimator properties

Suppose that an experimental system is specified by parameters a*, and that measurements z have been gathered, resulting in parameter estimates a = A(z). If the measurements z are regarded as a set of observations of a vector of random variables Z with an associated multivariate probability density function, then a is an observation of the vector of random variables A = A(Z). In principle, the probability density associated with A is determined by that for Z, and has a mean E(A) and variance V(A).

Common measures of how good an estimator is are the mean squared error (MSE) and its square root, the root mean squared error (RMSE). The RMSE is a measure of the likely distance of the estimate from the experimental parameters a*. An estimator A is unbiased if E(A) = a*, in which case MSE(A) = V(A). An unbiased estimator with a small variance is statistically efficient. Efficiency is used in a relative sense to compare estimators with each other. The MSE depends on both the bias E(A) − a* and the variance V(A): for a scalar parameter, MSE(A) = (E(A) − a*)² + V(A). An estimator A is consistent if, as the number of data points in each data set z increases, the estimate a = A(z) approaches a*.
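These properties can be examined numerically. The following Monte Carlo sketch (an assumed illustration using the least-squares line estimator from above, not taken from the original text) draws repeated data sets from a system with known a*, estimates the bias and variance of A, and shows the RMSE shrinking as the number of data points grows, as expected for an unbiased, consistent estimator.

    import numpy as np

    # Monte Carlo check of estimator properties for the least-squares
    # estimator of the (hypothetical) line model y = a1 + a2*x. Repeated
    # data sets z are drawn from a system with known a*, and the spread
    # of the resulting estimates A(z) is examined.
    rng = np.random.default_rng(3)
    a_star = np.array([1.0, 2.0])

    def estimate(n):
        x = np.linspace(0.0, 1.0, n)
        y = a_star[0] + a_star[1] * x + rng.normal(0.0, 0.2, n)
        design = np.column_stack([np.ones(n), x])
        return np.linalg.lstsq(design, y, rcond=None)[0]

    for n in (10, 100, 1000):              # consistency: error shrinks with n
        A = np.array([estimate(n) for _ in range(2000)])
        bias = A.mean(axis=0) - a_star     # E(A) - a*, near zero if unbiased
        var = A.var(axis=0)                # V(A)
        mse = bias**2 + var                # MSE = bias^2 + variance, per component
        print(n, bias, np.sqrt(mse.sum()))  # RMSE over both parameters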

Maximising the likelihood

Maximum likelihood estimation uses the fact that, in a complete statement of a model, the deviations are modelled as belonging to statistical distributions defined in terms of probability density functions. These distributions can be used to define a likelihood function. The probability p(z|a) of observing the data z, given that the model is specified by parameters a, can be regarded as a function of a. In general, if z are observations of random variables Z, the likelihood ℓ(a|z) of a giving rise to data z is the same as the probability p(z|a). The notation ℓ(a|z) is used to indicate that we regard the likelihood as a function of the parameters a with the observed data z fixed, while p(z|a) is a function of z with a regarded as fixed.

The maximum likelihood estimate of a is that which maximises ℓ(a|z), i.e., that which provides the most probable explanation of the data z. Maximum likelihood estimates enjoy favourable properties with respect to bias and statistical efficiency and usually represent an appropriate method for determining parameter estimates.  Many standard parameter estimation methods can be formulated as maximum likelihood estimation for particular statistical models for the random effects.
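For example, least-squares estimation coincides with maximum likelihood when the deviations are modelled as independent Gaussian random effects. The sketch below (an assumed Python illustration, with the line model and sigma chosen for demonstration) maximises ℓ(a|z) by minimising the negative log-likelihood, which for this model reduces to a least-squares error function.

    import numpy as np
    from scipy.optimize import minimize

    # Maximum likelihood for the hypothetical line model y = a1 + a2*x
    # with i.i.d. Gaussian deviations of known standard deviation sigma.
    sigma = 0.1

    def neg_log_likelihood(a, x, y):
        r = y - (a[0] + a[1] * x)
        # -log p(z|a) up to an additive constant independent of a;
        # for Gaussian deviations this is a (weighted) sum of squares.
        return 0.5 * np.sum((r / sigma) ** 2)

    rng = np.random.default_rng(4)
    x = np.linspace(0.0, 1.0, 25)
    y = 1.0 + 2.0 * x + rng.normal(0.0, sigma, x.size)

    # Maximising l(a|z) is the same as minimising -log l(a|z).
    a_ml = minimize(neg_log_likelihood, x0=[0.0, 0.0], args=(x, y)).x
    print(a_ml)  # coincides with the least-squares estimate for this model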
