Abstract. It is shown how two of the most common types of feature mapping used for classification of single-trial electroencephalography (EEG), namely spatial and frequency filtering, can be equivalently performed as linear operations in the space of frequency-specific detector covariance tensors. Thus, by first mapping the data to this space, a simple linear classifier can directly learn optimal spatial and frequency filters. Significantly, if the classifier's loss function is convex, learning these filters is a convex minimisation problem. It is also shown how to pre-process the data so that the resulting decision function is robust to the biases inherent in EEG data. Further, based upon ideas from Max Margin Matrix Factorisation, it is shown how the trace norm can be used to select low-rank solutions. Low-rank solutions are preferred because they reflect prior information about the types of EEG signals we expect to see, i.e. that the classifiable information is contained in only a few spatio-spectral pairs; they are also easier to interpret. This feature-space transformation is compared with Common Spatial Patterns (CSP) on simulated and real imagined-movement Brain Computer Interface (BCI) data and is shown to give state-of-the-art performance.
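The central identity behind the linear-operation claim can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes a single frequency band, a randomly generated trial, and illustrative names (`X`, `C`, `w`, `W`) that do not come from the paper. It shows that the power of a spatially filtered signal equals a linear function of the detector covariance matrix, and that the trace norm of the corresponding rank-one weight matrix is the sum of its singular values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single trial: 8 detectors (channels) x 500 time samples.
n_channels, n_samples = 8, 500
X = rng.standard_normal((n_channels, n_samples))
X -= X.mean(axis=1, keepdims=True)  # remove each channel's mean

# Detector covariance matrix of the trial (one frequency band assumed).
C = X @ X.T / n_samples

# An arbitrary spatial filter w and the power of the filtered signal.
w = rng.standard_normal(n_channels)
power_filtered = np.mean((w @ X) ** 2)

# The same quantity written as a linear function of C:
# w^T C w = <w w^T, C> = trace(w w^T C).
W = np.outer(w, w)
power_linear = np.sum(W * C)

# Trace (nuclear) norm of the weight matrix: the sum of its singular values.
trace_norm = np.linalg.svd(W, compute_uv=False).sum()
rank = np.linalg.matrix_rank(W)

print(np.isclose(power_filtered, power_linear), rank)
```

Because a single spatial filter corresponds to a rank-one weight matrix `W`, a classifier whose weights are a sum of a few such terms has low rank. Under the assumptions of this sketch, that is why penalising the trace norm favours solutions built from only a few filters.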