Time-Frequency Cepstral Features and Combining Discriminative Training for Phonotactic Language Recognition

Keywords
  • Phonotactic Language Recognition
  • Phone Recognizer
  • Time-Frequency Cepstrum (TFC)
  • Feature Minimum Phone Error (fMPE)
  • Musicology
  • Physics


The performance of a phonotactic language recognition system depends on the quality of its phone recognizers. To improve the recognizers, this paper investigates new acoustic features and discriminative training techniques. The commonly used features are static cepstral coefficients appended with their first- and second-order deltas. This configuration may not be optimal for phone recognition in phonotactic language recognition systems. In this paper, a time-frequency cepstral (TFC) feature is proposed based on our previous work in acoustic language recognition systems. The feature is extracted as follows: first, a temporal discrete cosine transform (DCT) is carried out on the cepstrum matrix; then the transformed elements in a specific area are selected using the variance maximization criterion. Different parameters are tested to obtain the optimal configuration. We also adopt the feature minimum phone error (fMPE) method for discriminative training of the phone models, to obtain better phone recognition results and thereby further improve the system. The effectiveness of the two techniques is demonstrated on the NIST Language Recognition Evaluation (LRE) 2007 database, including the 30-second, 10-second and 3-second closed-set test conditions.
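
The two extraction steps described in the abstract (a temporal DCT over the cepstrum matrix, then variance-based selection of transformed elements) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The sliding-window layout, the window length, the number of kept elements, and the reading of "variance maximization" as "keep the elements with the largest variance over the data" are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis of size (n, n); row k is the k-th basis vector."""
    t = np.arange(n)
    k = t[:, None]
    basis = np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    basis[0] *= np.sqrt(1.0 / n)
    basis[1:] *= np.sqrt(2.0 / n)
    return basis


def temporal_dct(cepstra, window):
    """Apply a DCT along time to each sliding block of cepstral frames.

    cepstra: array of shape (T, C), one row of C cepstral coefficients per frame.
    Returns an array of shape (T - window + 1, C, window).
    The sliding-window layout is an assumption, not taken from the paper.
    """
    num_frames = cepstra.shape[0]
    basis = dct_matrix(window)
    # Each block is a (C, window) cepstrum matrix: coefficient index by time.
    blocks = np.stack(
        [cepstra[s:s + window].T for s in range(num_frames - window + 1)]
    )
    # Multiplying by basis.T transforms the last (time) axis.
    return blocks @ basis.T


def select_by_variance(coeffs, keep):
    """Return indices of the `keep` time-frequency elements with the largest variance.

    Assumed interpretation of the variance maximization criterion.
    """
    variance = coeffs.var(axis=0)                    # shape (C, window)
    order = np.argsort(variance.ravel())[::-1][:keep]
    return np.unravel_index(order, variance.shape)


def tfc_features(coeffs, index):
    """Gather the selected elements into one feature vector per block."""
    return coeffs[:, index[0], index[1]]
```

In use, the selection indices would be estimated once on training data and then reused unchanged on test data, for example `idx = select_by_variance(temporal_dct(train_cepstra, 9), keep=20)` followed by `tfc_features(temporal_dct(test_cepstra, 9), idx)`. The values 9 and 20 are placeholders, not the configuration reported in the paper.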
