Most gear fault diagnosis (GFD) approaches suffer from inefficiency when facing with multiple varying working conditions at the same time. In this paper, a non-negative matrix factorization (NMF)-theoretic co-clustering strategy is proposed specially to classify more than one task at the same time using the high dimension matrix, aiming to offer a fast multi-tasking solution. The short-time Fourier transform (STFT) is first used to obtain the time-frequency features from the gear vibration signal. Then, the optimal clustering numbers are estimated using the Bayesian information criterion (BIC) theory, which possesses the simultaneous assessment capability, compared with traditional validity indexes. Subsequently, the classical/modified NMF-based co-clustering methods are carried out to obtain the classification results in both row and column tasks. Finally, the parameters involved in BIC and NMF algorithms are determined using the gradient ascent (GA) strategy in order to achieve reliable diagnostic results. The Spectra Quest’s Drivetrain Dynamics Simulator gear data sets were analyzed to verify the effectiveness of the proposed approach.