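Supervising Professor | 이의철 |
---|---|
Paper Title | Speaker Diarization of Telemarketer-Client Recording for Speech Dictation System |
Type | Oral presentation |
First Author | 배민경 |
Corresponding Author | 이의철 |
Co-authors | 정다해, 김윤경, 정진우 |
Domestic/International | International |
Conference | CSA 2014 |
Host Country | USA |
Dates | December 17-19, 2014 |
Organizer | FTRA |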
Financial institutions employ speech dictation systems that convert recordings of conversations between telemarketers and clients into text. Dictation is necessary for checking for incomplete sales, in which a telemarketer fails to provide important sales information to a client. However, manual dictation takes too much time and effort, so automatic speech dictation systems are being adopted as an alternative. We suggest that, in such an automatic speech dictation system, speaker diarization be performed prior to speech recognition. In this paper, we propose a diarization method based on pitch detection, which is well suited to the given condition in which two speakers, a telemarketer and a client, converse in a telephone recording. The method is based on an average short-time spectral feature and an unsupervised learning scheme. In the experiments, actual telephone recordings made for insurance contracts were used. On average, about 5% of the actual telemarketer's speech was classified as the client's, and about 5% of the client's speech was classified as the telemarketer's.
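The general idea of separating two speakers by pitch with an unsupervised scheme can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the pitch detector, the spectral feature, or the learning algorithm, so the sketch assumes an autocorrelation pitch estimator and a one-dimensional 2-means clustering, and it runs on synthetic tones rather than real telephone recordings. All function names (`estimate_pitch`, `two_means`, `diarize`) and parameter values are hypothetical.

```python
import math


def estimate_pitch(frame, sample_rate, fmin=80.0, fmax=400.0):
    """Estimate the pitch (Hz) of one frame from its autocorrelation peak.

    Assumption: a plain autocorrelation search over lags that correspond
    to fmin..fmax; the paper's actual pitch detector is not specified.
    """
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def two_means(values, iterations=20):
    """Cluster 1-D values into two groups; label 0 is the lower-valued group.

    Assumption: 2-means stands in for the unspecified unsupervised scheme.
    """
    low, high = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        if group0:
            low = sum(group0) / len(group0)
        if group1:
            high = sum(group1) / len(group1)
    return labels


def diarize(signal, sample_rate, frame_len):
    """Split the signal into non-overlapping frames and label each by pitch cluster."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    pitches = [estimate_pitch(frame, sample_rate) for frame in frames]
    return two_means(pitches)


# Synthetic two-speaker "recording": pure tones stand in for voiced speech.
SAMPLE_RATE = 8000   # typical telephone sampling rate
FRAME_LEN = 400      # 50 ms frames
segments = [(120.0, 3), (220.0, 3), (120.0, 2)]  # (pitch in Hz, number of frames)

signal = []
for pitch_hz, n_frames in segments:
    for _ in range(n_frames * FRAME_LEN):
        n = len(signal)
        signal.append(math.sin(2 * math.pi * pitch_hz * n / SAMPLE_RATE))

labels = diarize(signal, SAMPLE_RATE, FRAME_LEN)
print(labels)  # one 0/1 speaker label per frame
```

In this sketch, frames labeled 0 belong to the lower-pitched speaker and frames labeled 1 to the higher-pitched one. Real recordings would additionally need silence and unvoiced-frame handling, which is omitted here.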