Dealing with nonstationary processes requires quick adaptation without catastrophic forgetting. We present a neural learning technique that satisfies both requirements without sacrificing the benefits of distributed representations. It relies on formalizing the problem as the minimization of the error over the previously learned input-output patterns, subject to the constraint that the new pattern be encoded perfectly. This constrained optimization problem is then transformed into an unconstrained one with the hidden-unit activations as variables. The new formulation leads to an algorithm for solving the problem, which we call learning with minimal degradation (LMD). Experimental comparisons of LMD with backpropagation show the advantages of LMD and also reveal that forgetting in backpropagation depends on the learning rate. We also explain why overtraining affects forgetting and fault tolerance, which are seen as related problems.
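
The constrained formulation described in this abstract can be sketched as below. The notation is an assumption introduced here for illustration and is not taken from the paper: $W$ stands for the network weights, $f(\cdot\,; W)$ for the input-output mapping the network computes, $(x_p, y_p)$ with $p = 1, \dots, P$ for the previously learned patterns, and $(x_{\mathrm{new}}, y_{\mathrm{new}})$ for the new pattern. The use of a summed squared error is likewise an assumption.

```latex
\[
  \min_{W} \; E(W) = \sum_{p=1}^{P} \bigl\lVert f(x_p; W) - y_p \bigr\rVert^{2}
  \quad \text{subject to} \quad
  f(x_{\mathrm{new}}; W) = y_{\mathrm{new}}
\]
```

Under this reading, the objective measures degradation on the old patterns, while the equality constraint expresses perfect encoding of the new one.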