Abstract We present a novel canonical automaton model, based on register automata, that can be used to specify protocol or program behavior. Register automata have a finite control structure and a finite number of registers (variables), and process sequences of terms that carry data values from an infinite domain. We consider register automata that compare data values for equality. A major contribution is the definition of a canonical automaton representation of any language recognizable by a deterministic register automaton, by means of a Nerode congruence. This canonical form is well suited for modeling, e.g., protocols or program behavior. Our model can be exponentially more succinct than previous proposals, since it filters out ‘accidental’ relations between data values. This opens the way to new practical applications, e.g., in automata learning.