10.1 The Objective Function
To treat network training as an optimization problem, an
objective function (or cost function) must be defined that provides an
unambiguous numerical rating of system performance. The cost function reduces
all the various good and bad aspects of a possibly complex system down to a
single number, a scalar value, which allows candidate solutions to be ranked and
compared. In short, it provides the working definition of optimal for the search
algorithm, telling it what kinds of solutions to look for. It is important,
therefore, that the function faithfully represent our design goals. If we choose
a poor error function and obtain unsatisfactory results, the fault is ours for
badly specifying the goal of the search.
Selection of the objective function can be a problem in
itself since it is not always easy to develop a function that measures exactly
what we want when goals are vague. It is often necessary to compromise between
what we want, what we can measure, and what we can optimize efficiently. A few
basic functions are very commonly used. The mean squared error is popular for
function approximation (regression) problems because of its convenience in
mathematical analysis. The cross-entropy error function is often used for
classification problems when outputs are interpreted as probabilities of
membership in an indicated class. In real-world applications, it may be
necessary to complicate the function with additional terms to balance
conflicting subgoals or to introduce heuristics favoring preferred classes of
solutions.
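The two common cost functions named above can be sketched concretely. The following is a minimal NumPy illustration, not a complete training setup; the function names, the clipping constant, and the tiny example data are all illustrative choices, and the binary form of cross-entropy is assumed:

```python
import numpy as np

def mean_squared_error(targets, outputs):
    """MSE for regression: the average squared difference
    between desired and actual outputs."""
    targets = np.asarray(targets, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    return np.mean((targets - outputs) ** 2)

def cross_entropy(targets, outputs, eps=1e-12):
    """Binary cross-entropy: outputs are interpreted as
    probabilities of membership in the indicated class.
    Clipping avoids log(0) for saturated outputs."""
    targets = np.asarray(targets, dtype=float)
    outputs = np.clip(np.asarray(outputs, dtype=float), eps, 1 - eps)
    return -np.mean(targets * np.log(outputs)
                    + (1 - targets) * np.log(1 - outputs))

# A single scalar lets candidate solutions be ranked, as the text
# describes: here hypothetical network A scores better than B on
# the same targets under both cost functions.
t   = np.array([1.0, 0.0, 1.0, 1.0])
y_a = np.array([0.9, 0.2, 0.8, 0.7])
y_b = np.array([0.6, 0.4, 0.5, 0.9])
print(mean_squared_error(t, y_a) < mean_squared_error(t, y_b))  # prints True
print(cross_entropy(t, y_a) < cross_entropy(t, y_b))            # prints True
```

The "additional terms" mentioned for real-world applications would simply be added to these scalars, e.g. a weight-decay penalty summed with the MSE, so that the search still ranks candidates by one number.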