The Linear Programming Algorithm for Neural Networks

John S. Shawe-Taylor and Dave A. Cohen
Department of Computer Science
Royal Holloway and Bedford New College
Egham, Surrey TW20 0EX, UK

16th June, 1989

For reprint requests contact John Shawe-Taylor at the above address. Tel: (0784) 439021.

Running title: Linear Programming Algorithm

Abstract

A new learning algorithm for feed-forward neural networks based on linear programming is introduced. This alternative to back-propagation gives faster and more reliable learning on reasonably sized examples. Extensions of the method for efficient (approximate) implementations in large networks are considered.

Keywords: Neural networks, back-propagation, learning, feed-forward, linear programming algorithm, parallel distributed processing, network simulations.

1 Introduction

Back-propagation using the generalised delta rule was introduced as the solution to the credit-assignment problem which existed for threshold neurons [Ru-Hi-W86], thus opening up the possibility of learning in multi-layer perceptron networks. Back-propagation (BP) is an efficient method of calculating the partial derivatives of the error with respect to changes in the weights; a gradient descent algorithm then changes the weights in the direction of fastest error reduction over all stimuli simultaneously.

The algorithm is criticised by Minsky and Papert [Mi-Pa88]: not only is there no guarantee that it will not get trapped in a local minimum, but it is also not clear that it represents a significant speed-up over random weight assignment in realistically sized examples. These criticisms are, perhaps, overstated. Though local minima do exist [He88], in most cases they do not seem to be a problem, and the PDP book [Mc-Ru86] presents many interesting examples where the BP algorithm has successfully learnt certain representations.
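The gradient-descent procedure described above can be sketched in a few lines. The following is a minimal illustration for a single sigmoid unit trained with the delta rule in batch mode; the training set (logical AND), initial weights, learning rate, and epoch count are illustrative choices, not taken from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Stimuli: logical AND, with a third input fixed at 1 acting as a bias.
data = [((0, 0, 1), 0), ((0, 1, 1), 0), ((1, 0, 1), 0), ((1, 1, 1), 1)]
w = [0.1, -0.1, 0.05]   # initial weights (arbitrary)
eta = 0.5               # learning rate (illustrative)

def total_error(w):
    # Sum of squared errors over all stimuli.
    return sum((t - sigmoid(sum(wi * xi for wi, xi in zip(w, x)))) ** 2
               for x, t in data)

e0 = total_error(w)
for epoch in range(1000):
    # Accumulate the error gradient over all stimuli simultaneously,
    # then step the weights in the direction of fastest error reduction.
    grad = [0.0, 0.0, 0.0]
    for x, t in data:
        o = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        delta = (t - o) * o * (1 - o)   # delta rule: error times sigmoid slope
        for i in range(3):
            grad[i] += delta * x[i]
    for i in range(3):
        w[i] += eta * grad[i]
e1 = total_error(w)
```

On this linearly separable problem the error decreases steadily; the difficulties the paper addresses (local minima, slow convergence) arise in multi-layer networks, where the deltas must be propagated backwards through hidden layers.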