GP-SPEC:  Specify a Gaussian process model, or display existing spec.

Gp-spec creates a log file containing a specification of a Gaussian
process model and the associated priors over hyperparameters.  When
invoked with just a log file as argument, it displays the
specifications of the Gaussian process model stored in that log file.

Usage:
  
    gp-spec log-file N-inputs N-outputs 
      [ const-part [ linear-part [ jitter-part ] ] ] { flag } [ spread ]
      { / scale-prior relevance-prior [ power ] { flag } }

or:

    gp-spec log-file

N-inputs and N-outputs are the numbers of input variables and output
variables in the model, i.e., the dimensionality of the domain and the
range of the functions that the Gaussian process defines a
distribution over.  For the Gaussian processes allowed here, the
functions from the inputs to the various output variables are
independent, given particular values for the hyperparameters.  The
covariance functions for the outputs all have the same form, and
share the same hyperparameters.

The const-part argument gives the prior for the hyperparameter
controlling the constant part of the covariance function, in the form
described in prior.doc.  A value of c for this hyperparameter adds a
term of c^2 to the covariance between all pairs of input points.  The
prior for this hyperparameter can have only one level.  If this
argument is "-", or missing altogether, the covariance function will
not have a constant part.

The linear-part argument gives the prior for the hyperparameters, s_i,
controlling the linear part of the covariance function.  This part of
the covariance between inputs x and x' is

       SUM_i x_i x'_i s_i^2 

This prior can have up to two levels, allowing for a common
hyperparameter that controls the priors for the coefficients, s_i,
associated with the various inputs.  The "x" option may be used in
order to have the width of this prior scale with the number of inputs,
as described in prior.doc.  The linear part may be "-", or missing, in
which case the covariance function will not have a linear part.
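
As an illustration only (this is not the program's own code), the
constant and linear parts described above can be sketched in Python.
The function name const_linear_cov and its argument conventions are
hypothetical; None stands for a part that was given as "-" or omitted.

```python
def const_linear_cov(x, x2, c=None, s=None):
    """Constant plus linear part of the covariance between inputs x and x2.

    c is the constant-part hyperparameter and s the list of linear-part
    hyperparameters s_i.  Passing None means that part is absent.
    (Hypothetical helper, for illustration.)
    """
    cov = 0.0
    if c is not None:
        cov += c ** 2                      # constant term: c^2
    if s is not None:
        # linear term: SUM_i x_i x'_i s_i^2
        cov += sum(xi * yi * si ** 2 for xi, yi, si in zip(x, x2, s))
    return cov
```

For example, with c=2 and s=(1, 0.5), the covariance between inputs
(1,2) and (3,4) is 4 + 1*3*1 + 2*4*0.25 = 9.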

The jitter-part argument gives the prior for a hyperparameter whose
square is added to the covariance of a training or test case with
itself.  Such a contribution may be desirable for modeling reasons, or
in order to make the numerical computations better conditioned.  If a
regression model with
noise is being used, it is usually not necessary to include such a
contribution to the covariance; if jitter-part is included, it
effectively adds to the noise level (this could be useful if you want
to constrain the noise to be at least some amount).  For a
classification model, a jitter-part should usually be included, as
otherwise the updates of the underlying function values may be very
slow, and numerical problems can arise.  If the jitter-part is "-" or
missing, it is taken to be zero.
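
The effect of the jitter part on a covariance matrix can be sketched
as follows.  This is a minimal illustration, assuming the matrix is
held as a list of rows with one row per case; the name add_jitter is
hypothetical.

```python
def add_jitter(cov, jitter):
    """Return a copy of the square matrix cov with jitter^2 added to
    each diagonal element, i.e. to the covariance of each case with
    itself.  Off-diagonal elements are unchanged."""
    n = len(cov)
    return [[cov[i][j] + (jitter ** 2 if i == j else 0.0)
             for j in range(n)]
            for i in range(n)]
```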

Zero or more additional terms in the covariance function may be
specified using further groups of arguments.  A group for which
"power" is absent or positive contributes a term to the covariance
between the values of a particular output at inputs x and x' that has
the form:

      v^2 * exp( - SUM_i (w_i |x_i - x'_i|)^R )

The power, R, in this expression must be in the interval (0,2].  It is
given by the last argument in a group, with the default being R=2.
The first argument in the group gives the prior for the scale
hyperparameter, v, which must have only one level.  The prior for the
w_i, which determine the relevance of the various inputs, can have up
to two levels, allowing for a common hyperparameter and for
hyperparameters for each input, i.  Here again, the "x" option may be
used to automatically scale the prior.
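
A term of this form could be computed as in the following sketch,
which is only an illustration of the formula above.  The function
name exp_term is hypothetical, and the hyperparameters v and w_i are
passed in directly rather than being drawn from their priors.

```python
import math

def exp_term(x, x2, v, w, R=2.0):
    """Evaluate v^2 * exp( - SUM_i (w_i |x_i - x'_i|)^R ) for input
    vectors x and x2, scale v, and relevance hyperparameters w."""
    if not (0 < R <= 2):
        raise ValueError("power R must be in the interval (0,2]")
    total = sum((wi * abs(xi - yi)) ** R for xi, yi, wi in zip(x, x2, w))
    return v ** 2 * math.exp(-total)
```

When x and x' are identical the term equals v^2, and an input with
w_i = 0 has no effect on the term.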

A group can also have a "power" of -1, in which case it produces a
term in the covariance function of the form

      v^2 / PROD_i (1 + w_i^2 (x_i - x'_i)^2 )

The v and w hyperparameters play the same roles as described above.
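
The form used when "power" is -1 can be sketched in the same way.
Again this is an illustration only, with the hypothetical name
rational_term.

```python
def rational_term(x, x2, v, w):
    """Evaluate v^2 / PROD_i (1 + w_i^2 (x_i - x'_i)^2) for input
    vectors x and x2, scale v, and relevance hyperparameters w."""
    prod = 1.0
    for xi, yi, wi in zip(x, x2, w):
        prod *= 1.0 + wi ** 2 * (xi - yi) ** 2
    return v ** 2 / prod
```

For example, with v=2 and w=(1,1), the term for inputs (1,2) and
(0,0) is 4 / ((1+1)(1+4)) = 0.4.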

Optional flags may be appended to the specifications of the linear
and other parts of the covariance.  They have the form:

      <flag>[:[-]<input>{,<input>}]

where <flag> is one of the flag names below, and <input> is the number
of an input (starting with 1) that the flag applies to.  If the "-" is
present, the flag applies to all inputs EXCEPT those mentioned.  If
only the flag name is given, it applies to all inputs.  The possible
flags are as follows:

    delta     Use a "delta" distance for these inputs, in which the
              distance is 0 if x_i=x'_i and 1 otherwise.  Not allowed
              for the linear part of the covariance.

    omit      Ignore these inputs when computing the covariance
              (i.e., don't include them in the sum above).

    spread    Spread out the relevance parameters for these inputs.
              This is presently allowed only for the linear part, and
              causes this term in the covariance function to become

                 SUM_i x_i x'_i SUM_j s_j^2 
 
              The sum over j includes only i when i is not marked for 
              spreading.  When i is being spread, j includes all 
              indexes from i-spread to i+spread that are marked for 
              spreading.  Here, "spread" is the spreading width, which 
              must be specified after the flags.
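
The linear part with spreading can be sketched as below.  This is an
illustration under stated assumptions, not the program's own code:
inputs are indexed from 0 here rather than from 1, the window from
i-spread to i+spread is assumed to be clipped at the first and last
inputs, and the name spread_linear_cov is hypothetical.

```python
def spread_linear_cov(x, x2, s, spread_inputs, width):
    """Evaluate SUM_i x_i x'_i SUM_j s_j^2 with spreading.

    spread_inputs is the set of 0-based indexes marked for spreading,
    and width is the spreading width.
    """
    n = len(x)
    total = 0.0
    for i in range(n):
        if i in spread_inputs:
            # all marked indexes from i-width to i+width (clipped)
            lo, hi = max(0, i - width), min(n - 1, i + width)
            js = [j for j in range(lo, hi + 1) if j in spread_inputs]
        else:
            js = [i]                       # unmarked input: only j = i
        total += x[i] * x2[i] * sum(s[j] ** 2 for j in js)
    return total
```

With an empty set of spread inputs this reduces to the ordinary
linear part, SUM_i x_i x'_i s_i^2.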

The "delta" flag is useful when the inputs are categorical.  The
"omit" flag is useful in setting up additive models, in which
different covariance parts correspond to additive components, with
each component looking at only a subset of the inputs.  The "spread"
flag is useful for data such as spectra where nearby inputs are
likely to be of similar relevance.
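
The flag syntax given above can be read as in the following sketch,
which resolves a flag argument to the set of inputs it applies to.
The function parse_flag is hypothetical and only illustrates the
syntax; it does not reproduce the program's own argument handling or
error messages.

```python
def parse_flag(arg, n_inputs):
    """Parse '<flag>[:[-]<input>{,<input>}]' and return the flag name
    with the set of input numbers (starting with 1) it applies to."""
    name, sep, rest = arg.partition(":")
    all_inputs = set(range(1, n_inputs + 1))
    if not sep:
        return name, all_inputs            # bare flag: all inputs
    negate = rest.startswith("-")
    if negate:
        rest = rest[1:]
    listed = {int(tok) for tok in rest.split(",")}
    if not listed <= all_inputs:
        raise ValueError("input number out of range in " + arg)
    # with "-", the flag applies to all inputs EXCEPT those listed
    return name, (all_inputs - listed if negate else listed)
```

In an additive model with three inputs, for instance, "omit:-1" on
one covariance term and "omit:1" on another would make the first
term look only at input 1 and the second only at inputs 2 and 3.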

Note that the prior for the const-part corresponds closely to the
prior for output biases in neural network models, and that the prior
for the linear-part corresponds to the prior for input-output weights
in a network.  The priors for v and w_i correspond to the priors in
neural network models on the hidden-output and the input-hidden
weights.  (The neural network priors can go to more levels, however.)

To use a Gaussian process to model data, additional information must
be specified as well, as described in model-spec.doc.  Depending on
the model used, state information in addition to the hyperparameter
values for the Gaussian process may need to be kept (case-specific
function values and/or case-specific noise variances).

            Copyright (c) 1996, 1998 by Radford M. Neal