ePrints@IISc

Browse by Journal / Conference

Number of items: 58.

Roy, A and Belagali, V and Ghosh, PK (2022) Air tissue boundary segmentation using regional loss in real-time Magnetic Resonance Imaging video for speech production. In: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, 18 - 22 September 2022, Incheon, pp. 3113-3117.

Bhattacharya, D and Dutta, D and Sharma, NK and Chetupalli, SR and Mote, P and Ganapathy, S and Chandrakiran, C and Nori, S and Suhail, KK and Gonuguntla, S and Alagesan, M (2022) Analyzing the impact of SARS-CoV-2 variants on respiratory sound signals. In: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, 18 - 22 September 2022, Incheon, pp. 2473-2477.

Bhattacharya, D and Dutta, D and Sharma, NK and Chetupalli, SR and Mote, P and Ganapathy, S and Chandrakiran, C and Nori, S and Suhail, KK and Gonuguntla, S and Alagesan, M (2022) Coswara: A website application enabling COVID-19 screening by analysing respiratory sound samples and health symptoms. In: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, 18 - 22 September 2022, Incheon, pp. 1957-1958.

Bhanushali, A and Bridgman, G and Deekshitha, G and Ghosh, P and Kumar, P and Kumar, S and Kolladath, AR and Ravi, N and Seth, A and Seth, A and Singh, A and Sukhadia, VN and Umesh, S and Udupa, S and Durga Prasad, LVSV (2022) Gram Vaani ASR Challenge on spontaneous telephone speech recordings in regional variations of Hindi. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 18 - 22 September 2022, Incheon, pp. 3548-3552.

Dutta, D and Bhattacharya, D and Ganapathy, S and Poorjam, AH and Mittal, D and Singh, M (2022) Interpretable Acoustic Representation Learning on Breathing and Speech Signals for COVID-19 Detection. In: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, 18 - 22 September 2022, Incheon, pp. 2863-2867.

Agarwal, S and Ganapathy, S and Takahashi, N (2022) Leveraging Symmetrical Convolutional Transformer Networks for Speech to Singing Voice Style Transfer. In: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, 18 - 22 September 2022, Incheon, pp. 3013-3017.

Rath, SP and Bandarupalli, TS and Shah, N and Onoe, N and Ganapathy, S (2022) Semi-supervised Acoustic and Language Modeling for Hindi ASR. In: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, 18 - 22 September 2022, Incheon, pp. 3528-3532.

Chetupalli, SR and Ganapathy, S (2022) Speaker conditioned acoustic modeling for multi-speaker conversational ASR. In: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, 18 - 22 September 2022, Incheon, pp. 3834-3838.

Udupa, S and Illa, A and Ghosh, PK (2022) Streaming model for Acoustic to Articulatory Inversion with transformer networks. In: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, 18 - 22 September 2022, Incheon, pp. 625-629.

Jayesh, MK and Sharma, M and Vonteddu, P and Shaik, MAB and Ganapathy, S (2022) Transformer Networks for Non-Intrusive Speech Quality Prediction. In: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, 18 - 22 September 2022, Incheon, pp. 4078-4082.

Siddarth, C and Udupa, S and Ghosh, PK (2022) Watch Me Speak: 2D Visualization of Human Mouth during Speech. In: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, 18 - 22 September 2022, Incheon, pp. 3667-3668.

Muguli, A and Pinto, L and Nirmala, R and Sharma, N and Krishnan, P and Ghosh, PK and Kumar, R and Bhat, S and Chetupalli, SR and Ganapathy, S and Ramoji, S and Nanda, V (2021) DiCOVA challenge: Dataset, task, and baseline system for COVID-19 diagnosis using acoustics. In: 22nd Annual Conference of the International Speech Communication Association, 30 Aug - 03 Sep 2021, Brno, pp. 4241-4245.

Udupa, S and Roy, A and Singh, A and Illa, A and Ghosh, PK (2021) Estimating articulatory movements in speech production with transformer networks. In: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, 30 Aug - 03 Sep 2021, Brno, pp. 3156-3160.

Avila, F and Poorjam, AH and Mittal, D and Dognin, C and Muguli, A and Kumar, R and Chetupalli, SR and Ganapathy, S and Singh, M (2021) Investigating feature selection and explainability for COVID-19 diagnostics from cough sounds. In: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, 30 Aug - 03 Sep 2021, Brno, pp. 4246-4250.

Singh, P and Varma, R and Krishnamohan, V and Chetupalli, SR and Ganapathy, S (2021) LEAP submission for the third DIHARD diarization challenge. In: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, 30 Aug - 03 Sep 2021, Brno, pp. 2538-2542.

Diwan, A and Vaideeswaran, R and Shah, S and Singh, A and Raghavan, S and Khare, S and Unni, V and Vyas, S and Rajpuria, A and Yarra, C and Mittal, A and Ghosh, PK and Jyothi, P and Bali, K and Seshadri, V and Sitaram, S and Bharadwaj, S and Nanavati, J and Nanavati, R and Sankaranarayanan, K (2021) MUCS 2021: Multilingual and code-switching ASR challenges for low resource Indian languages. In: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, 30 Aug - 03 Sep 2021, Brno, pp. 351-355.

Yarra, C and Ghosh, PK (2021) Noise robust pitch stylization using minimum mean absolute error criterion. In: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, 30 Aug - 03 Sep 2021, Brno, pp. 3121-3125.

Raj, RGP and Kumar, R and Jayesh, MK and Purushothaman, A and Ganapathy, S and Shaik, MAB (2021) SRIB-LEAP submission to far-field multi-channel speech enhancement challenge for video conferencing. In: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, 30 Aug - 03 Sep 2021, Brno, pp. 326-330.

Udupa, S and Roy, A and Singh, A and Illa, A and Ghosh, PK (2021) Web interface for estimating articulatory movements in speech production from acoustics and text. In: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, 30 Aug - 03 Sep 2021, Brno, pp. 2203-2204.

Sharma, M and Gaddam, N and Umesh, T and Murthy, A and Ghosh, PK (2021) A comparative study of different EMG features for acoustics-to-EMG mapping. In: 22nd Annual Conference of the International Speech Communication Association, INTERSPEECH 2021, 30 Aug - 03 Sep 2021, Brno, pp. 461-465.

Mannem, R and Gaddam, N and Ghosh, PK (2020) Air-tissue boundary segmentation in real time magnetic resonance imaging video using 3-D convolutional neural network. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 25-29 October 2020, Shanghai; China, pp. 1396-1400.

Krishnamohan, V and Soman, A and Gupta, A and Ganapathy, S (2020) Audiovisual correspondence learning in humans and machines. In: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH, 25 October 2020, Shanghai; China, pp. 4462-4466.

Chetupalli, SR and Ganapathy, S (2020) Context dependent RNNLM for automatic transcription of conversations. In: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH, 25-29 October 2020, Shanghai; China, pp. 886-890.

Ramoji, S and Krishnan, P and Ganapathy, S (2020) Neural PLDA modeling for end-to-end speaker verification. In: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020, 25 October 2020, Shanghai; China, pp. 4333-4337.

Singh, A and Illa, A and Ghosh, PK (2020) Attention and encoder-decoder based models for transforming articulatory movements at different speaking rates. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 25-29 October 2020, Shanghai; China, pp. 2907-2911.

Degala, D and Achuth Rao, MV and Krishnamurthy, R and Gopikishore, P and Priyadharshini, V and Prakash, TK and Ghosh, PK (2020) Automatic glottis detection and segmentation in stroboscopic videos using convolutional networks. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 25-29 October 2020, Shanghai; China, pp. 4801-4805.

Sharma, N and Krishnan, P and Kumar, R and Ramoji, S and Chetupalli, SR and Nirmala, R and Ghosh, PK and Ganapathy, S (2020) Coswara - A database of breathing, cough, and voice sounds for COVID-19 diagnosis. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 25-29 October 2020, Shanghai; China, pp. 4811-4815.

Purushothaman, A and Sreeram, A and Kumar, R and Ganapathy, S (2020) Deep learning based dereverberation of temporal envelopes for robust speech recognition. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 25-29 October 2020, Shanghai; China, pp. 1688-1692.

Agrawal, P and Ganapathy, S (2020) Robust raw waveform speech recognition using relevance weighted representations. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 25-29 October 2020, Shanghai; China, pp. 1649-1653.

Illa, A and Ghosh, PK (2020) Speaker conditioned acoustic-to-articulatory inversion using x-vectors. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 25-29 October 2020, Shanghai; China, pp. 1376-1380.

Mannem, R and Hima Jyothi, R and Illa, A and Ghosh, PK (2020) Speech rate task-specific representation learning from acoustic-articulatory data. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 25-29 October 2020, Shanghai; China, pp. 2892-2896.

Naini, AR and Satyapriya, M and Ghosh, PK (2020) Whisper activity detection using CNN-LSTM based attention pooling network trained for a speaker identification task. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 25-29 October 2020, Shanghai; China, pp. 2922-2926.

Purohit, T and Ghosh, PK (2020) An investigation of the virtual lip trajectories during the production of bilabial stops and nasal at different speaking rates. In: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH, 25-29 October 2020, Shanghai; China, pp. 1401-1405.

Ramanathi, MK and Yarra, C and Ghosh, PK (2019) ASR inspired syllable stress detection for pronunciation evaluation without using a supervised classifier and syllable level features. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 924-928.

Mannem, R and Mallela, J and Illa, A and Ghosh, PK (2019) Acoustic and articulatory feature based speech rate estimation using a convolutional dense neural network. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 929-933.

Malhotra, K and Bansal, S and Ganapathy, S (2019) Active learning methods for low resource end-to-end speech recognition. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 2215-2219.

Padi, B and Mohan, A and Ganapathy, S (2019) Attention based hybrid I-vector BLSTM model for language recognition. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 1263-1267.

Suhas, BN and Patel, D and Rao, N and Belur, Y and Reddy, P and Atchayaram, N and Yadav, R and Gope, D and Ghosh, PK (2019) Comparison of speech tasks and recording devices for voice based automatic classification of healthy subjects and patients with amyotrophic lateral sclerosis. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 4564-4568.

Saha, A and Yarra, C and Ghosh, PK (2019) Low resource automatic intonation classification using gated recurrent unit (GRU) networks pre-trained with synthesized pitch patterns. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 959-963.

Dhiman, JK and Adiga, N and Seelamantula, CS (2019) On the suitability of the Riesz spectro-temporal envelope for WaveNet based speech synthesis. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 944-948.

Yarra, C and Srinivasan, A and Gottimukkala, S and Ghosh, PK (2019) SPIRE-fluent: A self-learning app for tutoring oral fluency to second language English learners. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 968-969.

Agrawal, P and Ganapathy, S (2019) Unsupervised raw waveform representation learning for ASR. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 3451-3455.

Naini, AR and Achuth Rao, MV and Ghosh, PK (2019) Whisper to neutral mapping using cosine similarity maximization in i-vector space for speaker verification. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 4340-4344.

Sudhakara, S and Ramanathi, MK and Yarra, C and Ghosh, PK (2019) An improved goodness of pronunciation (GoP) measure for pronunciation evaluation with DNN-HMM system considering HMM transition probabilities. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 954-958.

Illa, A and Ghosh, PK (2019) An investigation on speaker specific articulatory synthesis with speaker independent articulatory inversion. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 121-125.

Ryant, N and Church, K and Cieri, C and Cristia, A and Du, J and Ganapathy, S and Liberman, M (2019) The second DIHARD diarization challenge: Dataset, task, and baselines. In: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019, 15 - 19 September 2019, Graz, pp. 978-982.

Murthy, HA and Alku, P and Rao, P and Ghosh, PK (2018) Message from the technical program chairs. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2018.

Kumar, N and Das, RK and Jelil, S and Dhanush, BK and Kashyap, H and Murty, KSR and Ganapathy, S and Sinha, R and Prasanna, SRM (2017) IITG-indigo system for NIST 2016 SRE challenge. In: 18th Annual Conference of the International Speech Communication Association, 20 August 2017, Stockholm, pp. 2859-2863.

Ghosh, P and Narwekar, A (2017) PRAV: A phonetically rich audio visual corpus. In: 18th Annual Conference of the International Speech Communication Association, 20 - 24 August 2017, Stockholm, pp. 3747-3751.

Suresh, AK and Srinivasa Raghavan, KM and Ghosh, PK (2017) Phoneme state posteriorgram features for speech based automatic classification of speakers in cold and healthy condition. In: 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 20 - 24 August 2017, Stockholm, pp. 3462-3466.

Agrawal, P and Ganapathy, S (2017) Speech representation learning using unsupervised data-driven modulation filtering for robust ASR. In: 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 20 - 24 August 2017, Stockholm, pp. 2446-2450.

Karthik, GR and Ghosh, PK (2017) Subband selection for binaural speech source localization. In: 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 20 - 24 August 2017, Stockholm, pp. 1929-1933.

Vijayan, K and Dhiman, JK and Seelamantula, CS (2017) Time-frequency coherence for periodic-aperiodic decomposition of speech signals. In: 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 20 - 24 August 2017, Stockholm, pp. 329-333.

Achuth Rao, MV and Yadav, S and Ghosh, PK (2017) A dual source-filter model of snore audio for snorer group classification. In: 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 20 - 24 August 2017, Stockholm, pp. 3502-3506.

Fotedar, G and Ghosh, PK (2017) An information theoretic analysis of the temporal synchrony between head gestures and prosodic patterns in spontaneous speech. In: 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 20 - 24 August 2017, Stockholm, pp. 157-161.

Ananthapadmanabha, TV and Ramakrishnan, AG and Sharma, S (2017) An objective critical distance measure based on the relative level of spectral valley. In: 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 20 - 24 August 2017, Stockholm, pp. 641-644.

Nisha Meenakshi, G and Ghosh, PK (2017) A robust Voiced/Unvoiced phoneme classification from whispered speech using the 'color' of whispered phonemes and Deep Neural Network. In: 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 20 - 24 August 2017, Stockholm, pp. 503-507.

Dhiman, JK and Adiga, N and Seelamantula, CS (2017) A spectro-temporal demodulation technique for pitch estimation. In: 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 20 - 24 August 2017, Stockholm, pp. 2306-2310.

This list was generated on Sat Apr 20 19:32:43 2024 IST.