
Kernel Methods and Frequency Domain Independent Component Analysis for Robust Speaker Identification

Authors
Publisher
SOKENDAI (総研大) Doctoral Degree, Kō (甲) No. 1333

Abstract

CHAPTER 1 INTRODUCTION

This dissertation is devoted to developing a useful speaker identification system for humanoid robots. In this chapter, we state the motivation and objectives of our work.

1.1 Four Issues in Speaker Identification for Humanoid Robots

Speaker identification is one of the key technologies for person identification in humanoid robots. In particular, when face information is not available, speaker identification is the only way to identify a person. Improving speaker identification performance is therefore an important issue.

There are four major issues in speaker identification for humanoid robots. First, a humanoid robot should identify the speaker in real time with a high identification rate. Second, since speech features vary over time due to session-dependent variation, changes in the recording environment, and the speaker's physical condition and emotions, the speaker identification system must be robust to such feature changes. Third, a humanoid robot should automatically add a speaker to its dictionary when an unknown speaker talks to it. Fourth, since humanoid robots move throughout the world, the surrounding environment, source positions, and source mixtures are constantly changing.

To cope with these issues, we address the following topics in this dissertation:

1. Kernel based real-time speaker identification with acceleration of Mean Operator Sequence Kernel Computation
2. Semi-supervised Speaker Identification under Covariate Shift
3. Direct Importance Estimation with Gaussian Mixture Models and Probabilistic Principal Component Analyzers
4. Noise Adaptive Optimization of Matrix Initialization

In what follows, we present a brief introduction to each of these topics.

1.2 Kernel Based Real-Time Speaker Identification with Acceleration of Mean Operator Sequence Kernel Computation

Humanoid robots should identify the speaker in real time with high identification rates. In these days,
