Papers in refereed conference proceedings

«Named entity recognition for audio de-identification»

Guillaume Baril, Patrick Cardinal, Alessandro Lameiras Koerich
International Joint Conference on Neural Networks (IJCNN) (Padua, Italy, July 18-23, 2022) Institute of Electrical and Electronics Engineers Inc. 2022.

«Towards robust speech-to-text adversarial attack»

Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich
47th IEEE International Conference on Acoustics, Speech, and Signal Processing (Singapore, Singapore, May 23-27, 2022) p. 2869-2873. Institute of Electrical and Electronics Engineers Inc. 2022.

«Class-conditional defense GAN against end-to-end speech attacks»

Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (Toronto, ON, Canada – Online, June 06-11, 2021) p. 2565-2569. Institute of Electrical and Electronics Engineers Inc. 2021.

«Cross attentional audio-visual fusion for dimensional emotion recognition»

R. Gnana Praveen, Eric Granger, Patrick Cardinal
16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021) (Jodhpur, India, Dec. 15-18, 2021) Institute of Electrical and Electronics Engineers Inc. 2021.

«RADARSAT-2 Synthetic-Aperture radar land cover segmentation using deep convolutional neural networks»

Mirmohammad Saadati, Marco Pedersoli, Patrick Cardinal, Peter Oliver
Pattern Recognition. ICPR International Workshops and Challenges, Virtual Event, January 10-15, 2021, Proceedings Part VIII (Milan, Italy, Jan. 10-15, 2021) p. 106-117. Springer. 2021.

«Deep weakly supervised domain adaptation for pain localization in videos»

Gnana R. Praveen, Eric Granger, Patrick Cardinal
15th IEEE International Conference on Automatic Face and Gesture Recognition (FG) (Buenos Aires, Argentina, Nov. 16-20, 2020) p. 473-480. IEEE Computer Society. 2020.

«Adversarially training for audio classifiers»

Raymel Alfonso Sallo, Mohammad Esmaeilpour, Patrick Cardinal
25th International Conference on Pattern Recognition (ICPR) (Milan, Italy, Jan. 10-15, 2021) p. 9569-9576. Piscataway, NJ, USA : IEEE. 2020.

«Emotion recognition with spatial attention and temporal softmax pooling»

Masih Aminbeidokhti, Marco Pedersoli, Patrick Cardinal, Eric Granger
Image Analysis and Recognition : 16th International Conference, ICIAR : Proceedings (Waterloo, ON, Canada, Aug. 27-29, 2019) p. 323-331. Cham, Switzerland : Springer International Publishing. 2019.

«Emotion recognition using fusion of audio and video features»

Juan D.S. Ortega, Patrick Cardinal, Alessandro L. Koerich
IEEE International Conference on Systems, Man and Cybernetics (SMC) (Bari, Italy, Oct. 06-09, 2019) p. 3847-3852. Institute of Electrical and Electronics Engineers Inc. 2019.

«Classification of nonverbal human produced audio events: A pilot study»

Rachel E. Bouserhal, Philippe Chabot, Milton Sarria-Paja, Patrick Cardinal, Jérémie Voix
19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018) (Hyderabad, India, Sept. 02-06, 2018) p. 1512-1516. International Speech Communication Association. 2018.

«MyOrtho – A vocal coach application with visual feed-back for monitoring and storing of patient progress in a home environment»

I. Verduyckt, P. Cardinal, A. Loubnani, A. Alpan
10th International Workshop Models and Analysis of Vocal Emissions for Biomedical Applications (Firenze, Italy, Dec. 13-15, 2017) p. 31-34. Firenze University Press. 2017.

«Automatic dialect detection in Arabic broadcast speech»

Ahmed Ali, Najim Dehak, Patrick Cardinal, Sameer Khurana, Sree Harsha Yella, James Glass, Peter Bell, Steve Renals
17th Annual Conference of the International Speech Communication Association (INTERSPEECH) (San Francisco, CA, USA, Sept. 08-16, 2016) p. 2934-2938. Baixas, France : International Speech and Communication Association. 2016.

«PHYSIOSTRESS: A multimodal corpus of data on acute stress and physiological activation»

Patrice Boucher, Pierre Dufour, Pierrich Plusquellec, Najim Dehak, Pierre Dumouchel, Patrick Cardinal
Workshop on Multimodal Corpora : Computer vision and language processing (MMC 2016) (Portoroz, Slovenia, May 23-28, 2016) p. 45-48. European Language Resources Association (ELRA). 2016.

«Native language detection using the i-vector framework»

Mohammed Senoussaoui, Patrick Cardinal, Najim Dehak, Alessandro L. Koerich
17th Annual Conference of the International Speech Communication Association (INTERSPEECH) (San Francisco, CA, USA, Sept. 08-16, 2016) p. 2398-2402. Baixas, France : International Speech and Communication Association. 2016.

«Audio quotation marks for natural language understanding»

Simon Boutin, Réal Tremblay, Patrick Cardinal, Doug Peters, Pierre Dumouchel
INTERSPEECH 2015. 16th Annual Conference of the International Speech Communication Association (Dresden, Germany, Sept. 6-10, 2015) p. 1349-1352. International Speech Communication Association. 2015.

«ETS System for AV+EC 2015 Challenge»

Patrick Cardinal, Najim Dehak, Alessandro Lameiras Koerich, Jahangir Alam, Patrice Boucher
Proceedings of the 5th International Workshop on Audio/Visual Emotion Challenge (Brisbane, Australia, Oct. 26-30, 2015) p. 17-23. ACM. 2015.

«Speaker adaptation using the I-vector technique for bottleneck features»

Patrick Cardinal, Najim Dehak, Yu Zhang, James Glass
INTERSPEECH 2015. 16th Annual Conference of the International Speech Communication Association (Dresden, Germany, Sept. 6-10, 2015) p. 2867-2871. International Speech Communication Association. 2015.

«A complete KALDI recipe for building Arabic speech recognition systems»

Ahmed Ali, Yifan Zhang, Patrick Cardinal, Najim Dehak, Stephan Vogel, James Glass
2014 IEEE Spoken Language Technology Workshop (SLT) (South Lake Tahoe, NV, USA, Dec. 7-10, 2014) p. 525-529. IEEE. 2014.

«Recent advances in ASR applied to an Arabic transcription system for Al-Jazeera»

Patrick Cardinal, Ahmed Ali, Najim Dehak, Yu Zhang, Tuka Al Hanai, Yifan Zhang, James R. Glass, Stephan Vogel
INTERSPEECH 2014. 15th Annual Conference of the International Speech Communication Association (Singapore, Singapore, Sept. 14-18, 2014) p. 2088-2092. International Speech Communication Association. 2014.

«The A* speech recognition system on parallel architectures»

Patrick Cardinal, Gilles Boulianne, Pierre Dumouchel
2012 11th International Conference on Information Science, Signal Processing and their Applications (ISSPA) (Montreal, QC, Canada, July 2-5, 2012) p. 108-113. Washington, DC : IEEE Computer Society. 2012.

«Using A* for the parallelization of speech recognition systems»

Patrick Cardinal, Gilles Boulianne, Pierre Dumouchel
2012 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (Kyoto, Japan, Mar. 25-30, 2012) p. 4433-4436. Piscataway, NJ : Institute of Electrical and Electronics Engineers Inc. 2012.

«Content-based advertisement detection»

Patrick Cardinal, Vishwa Gupta, Gilles Boulianne
INTERSPEECH 2010. 11th Annual Conference of the International Speech Communication Association (Chiba, Makuhari, Japan, Sept. 26-30, 2010) p. 2214-2217. International Speech Communication Association. 2010.

«CRIM's content-based audio copy detection system for TRECVID 2009»

Vishwa Gupta, Gilles Boulianne, Patrick Cardinal
2010 International Workshop on Content-Based Multimedia Indexing (CBMI) (Grenoble, France, June 23-25, 2010) IEEE. 2010.

«Content-based audio copy detection using nearest-neighbor mapping»

Vishwa Gupta, Gilles Boulianne, Patrick Cardinal
2010 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP) (Dallas, TX, USA, Mar. 14-19, 2010) p. 261-264. IEEE. 2010.

«Real-time correction of closed-captions»

Patrick Cardinal, Gilles Boulianne
INTERSPEECH 2009 – 10th Annual Conference of the International Speech Communication Association (Brighton, UK, Sept. 6-10, 2009) p. 1447-1450. International Speech and Communication Association. 2009.

«Using parallel architectures in speech recognition»

Patrick Cardinal, Pierre Dumouchel, Gilles Boulianne
INTERSPEECH 2009 – 10th Annual Conference of the International Speech Communication Association (Brighton, UK, Sept. 6-10, 2009) p. 3039-3042. International Speech and Communication Association. 2009.

«CRIM's content-based copy detection system for TRECVID»

Maguelonne Héritier, Vishwa Gupta, Langis Gagnon, Gilles Boulianne, Samuel Foucher, Patrick Cardinal
2009 TREC Video Retrieval Evaluation Notebook Papers (Gaithersburg, MD, USA, Nov. 16, 2009) National Institute of Standards and Technology. 2009.

«GPU accelerated acoustic likelihood computations»

Patrick Cardinal, Pierre Dumouchel, Gilles Boulianne, Michel Comeau
9th Annual Conference of the International Speech Communication Association (INTERSPEECH) (Brisbane, Australia, Sept. 22-26, 2008) p. 964-967. Bonn, Germany : International Speech Communication Association. 2008.

«Real-time correction of closed captions»

P. Cardinal, G. Boulianne, M. Comeau, M. Boisvert
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (Prague, Czech Republic, June 24-29, 2007) p. 113-116. Association for Computational Linguistics (ACL). 2007.

«Computer-assisted closed-captioning of live TV broadcasts in French»

G. Boulianne, J.-F. Beaumont, M. Boisvert, J. Brousseau, P. Cardinal, C. Chapdelaine, M. Comeau, P. Ouellet, F. Osterrath
INTERSPEECH 2006 : ICSLP ; Proceedings of the Ninth International Conference on Spoken Language Processing (Pittsburgh, PA, USA, Sept. 17-21, 2006) p. 273-276. International Speech and Communication Association. 2006.

«Segmentation of recordings based on partial transcriptions»

Patrick Cardinal, Gilles Boulianne, Michel Comeau
Proceedings of the 9th European Conference on Speech Communication and Technology (Interspeech'2005-Eurospeech) (Lisbon, Portugal, Sept. 4-8, 2005) p. 3345-3348. International Speech and Communication Association. 2005.

«Automatic segmentation of film dialogues into phonemes and graphemes»

Gilles Boulianne, Jean-François Beaumont, Patrick Cardinal, Michel Comeau, Pierre Ouellet, Pierre Dumouchel
Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech 2003) (Geneva, Switzerland, Sept. 1-4, 2003) p. 1241-1244. International Speech and Communication Association. 2003.

«Automated closed-captioning of live TV broadcast news in French»

Julie Brousseau, Jean-François Beaumont, Gilles Boulianne, Patrick Cardinal, Claude Chapdelaine, Michel Comeau, Frédéric Osterrath, Pierre Ouellet
Proceedings of the 8th European Conference on Speech Communication and Technology (Eurospeech 2003) (Geneva, Switzerland, Sept. 1-4, 2003) p. 1245-1248. International Speech and Communication Association. 2003.

«Disambiguation of finite-state transducers»

N. Smaili, P. Cardinal, G. Boulianne, P. Dumouchel
Proceedings of the 19th International Conference on Computational Linguistics (COLING2002) (Taipei, Taiwan, Aug. 26-30, 2002) Association for Computational Linguistics (ACL). 2002.

«Deep domain adaptation for ordinal regression of pain intensity estimation using weakly-labelled videos»

Rajasekar, Gnana Praveen and Granger, Eric and Cardinal, Patrick

arXiv 2020.

«From sound representation to model robustness»

Esmaeilpour, Mohammad and Cardinal, Patrick and Koerich, Alessandro Lameiras

arXiv 2020.

«Improving stability of LS-GANs for audio and speech signals»

Esmaeilpour, Mohammad and Sallo, Raymel Alfonso and St-Georges, Olivier and Cardinal, Patrick and Koerich, Alessandro Lameiras

arXiv 2020.

«Adversarially training for audio classifiers»

Sallo, Raymel Alfonso and Esmaeilpour, Mohammad and Cardinal, Patrick

arXiv 2020.

«Emotion Recognition with Spatial Attention and Temporal Softmax Pooling»

Aminbeidokhti, Masih and Pedersoli, Marco and Cardinal, Patrick and Granger, Eric

arXiv 2019.

«Bag-of-audio-words based on autoencoder codebook for continuous emotion prediction»

Senoussaoui, Mohammed and Cardinal, Patrick and Koerich, Alessandro Lameiras

arXiv 2019.

«Universal adversarial audio perturbations»

Abdoli, Sajjad and Hafemann, Luiz G. and Rony, Jérôme and Ayed, Ismail Ben and Cardinal, Patrick and Koerich, Alessandro L.

arXiv 2019.

«Deep weakly-supervised domain adaptation for pain localization in videos»

Gnana, Praveen R. and Granger, Eric and Cardinal, Patrick

arXiv 2019.

«Multimodal fusion with deep neural networks for audio-video emotion recognition»

Ortega, Juan D.S. and Senoussaoui, Mohammed and Granger, Eric and Pedersoli, Marco and Cardinal, Patrick and Koerich, Alessandro L.

arXiv 2019.

«End-to-end environmental sound classification using a 1D convolutional neural network»

Abdoli, Sajjad and Cardinal, Patrick and Koerich, Alessandro Lameiras

arXiv 2019.

«Emotion recognition using fusion of audio and video features»

Ortega, Juan D.S. and Cardinal, Patrick and Koerich, Alessandro L.

arXiv 2019.

«A robust approach for securing audio classification against adversarial attacks»

Esmaeilpour, Mohammad and Cardinal, Patrick and Koerich, Alessandro Lameiras

arXiv 2019.

«Unsupervised feature learning for environmental sound classification using weighted cycle-consistent generative adversarial network»

Esmaeilpour, Mohammad and Cardinal, Patrick and Koerich, Alessandro L.

arXiv 2019.

«Detection of adversarial attacks and characterization of adversarial subspace»

«Speaker sincerity detection based on covariance feature vectors and ensemble methods»
