
Comparison of low-dimension speech segment embeddings: Application to speaker diarization

Chetupalli, SR and Sreenivas, TV and Gopalakrishnan, A (2019) Comparison of low-dimension speech segment embeddings: Application to speaker diarization. In: 25th National Conference on Communications, NCC 2019, 20 February 2019 - 23 February 2019, Bangalore.

Official URL: https://doi.org/10.1109/NCC.2019.8732210

Abstract

Segment clustering is a crucial step in unsupervised speaker diarization. Bottom-up approaches, such as hierarchical agglomerative clustering, are traditionally used for segment clustering. In this paper, we consider a top-down approach to clustering, in which a speaker-sensitive, low-dimensional representation of segments (the speaker space) is obtained first, followed by Gaussian mixture model (GMM) based clustering. We explore three methods of obtaining the low-dimensional segment representation: (i) multi-dimensional scaling (MDS) based on segment-to-segment stochastic distances; (ii) traditional principal component analysis (PCA); and (iii) factor analysis (i-vectors), the latter two applied to GMM mean super-vectors. We found that MDS-based embeddings give a better representation, and hence better diarization performance, than PCA and even i-vector embeddings.
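
The MDS-then-GMM pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract names neither the stochastic distance nor the MDS variant, so this example assumes a symmetric KL divergence between per-segment diagonal Gaussians, classical (Torgerson) MDS, and a small diagonal-covariance GMM fitted by EM. The synthetic two-speaker data and all function names are hypothetical.

```python
import numpy as np

def symmetric_kl(mu1, var1, mu2, var2):
    """Symmetric KL divergence between two diagonal Gaussians.
    Assumed distance; the abstract does not name the one used."""
    d2 = (mu1 - mu2) ** 2
    return 0.5 * np.sum(var1 / var2 + var2 / var1 - 2.0
                        + d2 * (1.0 / var1 + 1.0 / var2))

def classical_mds(D, dim):
    """Classical (Torgerson) MDS of a dissimilarity matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered Gram matrix
    w, V = np.linalg.eigh(B)                 # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:dim]          # keep the largest `dim`
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

def gmm_cluster(X, k, n_iter=30):
    """Diagonal-covariance GMM fitted by EM; returns hard cluster labels."""
    n = X.shape[0]
    # Farthest-point seeding, then hard initial responsibilities.
    seeds = [0]
    for _ in range(1, k):
        d2 = ((X[:, None, :] - X[seeds][None, :, :]) ** 2).sum(axis=2)
        seeds.append(int(d2.min(axis=1).argmax()))
    d2 = ((X[:, None, :] - X[seeds][None, :, :]) ** 2).sum(axis=2)
    resp = np.eye(k)[d2.argmin(axis=1)]
    for _ in range(n_iter):
        # M-step: update weights, means, and (floored) variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        diff = X[:, None, :] - means[None, :, :]
        var = np.einsum("nk,nkd->kd", resp, diff ** 2) / nk[:, None] + 1e-6
        # E-step: posterior responsibilities, computed in the log domain.
        log_p = (-0.5 * np.sum(diff ** 2 / var + np.log(2 * np.pi * var), axis=2)
                 + np.log(weights))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
    return resp.argmax(axis=1)

# Synthetic data (hypothetical): 2 speakers x 10 segments x 200 frames x 4 features.
rng = np.random.default_rng(0)
speaker_means = [np.zeros(4), np.full(4, 3.0)]
segments = [rng.normal(m, 1.0, size=(200, 4))
            for m in speaker_means for _ in range(10)]

# Model each segment by a single diagonal Gaussian, then compare segments pairwise.
stats = [(s.mean(axis=0), s.var(axis=0)) for s in segments]
n = len(stats)
D = np.array([[symmetric_kl(*stats[i], *stats[j]) for j in range(n)]
              for i in range(n)])

embedding = classical_mds(D, dim=2)   # low-dimensional "speaker space"
labels = gmm_cluster(embedding, k=2)  # one cluster per hypothesized speaker
print(labels)
```

The number of speakers is fixed at `k=2` here. How the paper selects the number of clusters and the embedding dimension is not stated in the abstract.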

Item Type: Conference Paper
Publication: 25th National Conference on Communications, NCC 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Additional Information: The copyright for this article belongs to Institute of Electrical and Electronics Engineers Inc.
Keywords: Cluster analysis; Embeddings; Gaussian distribution; Stochastic systems; Based clustering; Bottom up approach; Gaussian Mixture Model; Hierarchical agglomerative clustering; Low-dimensional representation; Multi-dimensional scaling; Speaker diarization; Top down approaches; Principal component analysis
Department/Centre: Division of Electrical Sciences > Electrical Communication Engineering
Date Deposited: 29 Nov 2022 05:38
Last Modified: 29 Nov 2022 05:38
URI: https://eprints.iisc.ac.in/id/eprint/78062
