A joint study by @poznanAI researchers and Samsung Electronics Polska engineers was presented at @FedCSIS 2024. The paper investigates the impact of augmenting spoken language corpora with domain-specific synthetic samples. arxiv.org/abs/2406.07090
The preprint of our paper "Two Approaches to Diachronic Normalization of Polish Texts", accepted to LaTeCH-CLfL 2024, is now available at arxiv.org/abs/2402.01300 #NLProc #DH
In exactly 20 minutes @marekkubis, @PSkorzewski, @tzietkiewicz and Marcin Sowański will speak about "Back Transcription as a Method for Evaluating Robustness of NLU Models to Speech Recognition Errors". Join us online or in person. We start at 11:00 am CET.
wmi.amu.edu.pl/zycie-naukowe…
We are taking part in the aUPaEU workshop in Turin, Italy, devoted to presenting the concept of the Agora. We are part of a team developing tools for collecting and searching information to support effective cooperation among scientists and HEIs in Europe. @WideningEU @poznanAI
Unlike conventional adversarial attacks, which aim to find samples that degrade the performance of the model under study, our method also takes into account samples that change the NLU outcome in other ways.
The robustness criteria that we formulate are then used to construct a model for detecting speech recognition errors that impact the NLU model in the most significant way.
The augmented dataset is used to evaluate natural language understanding models and the outcomes of the evaluation serve as a basis for defining the criteria of NLU model robustness.
The preprint of our paper "Back Transcription as a Method for Evaluating Robustness of Natural Language Understanding Models to Speech Recognition Errors", accepted to #EMNLP2023, is now available at arxiv.org/abs/2310.16609 #NLProc #VoiceAI #AI
The method that we propose relies on the use of back transcription, a procedure that combines a text-to-speech model with an automatic speech recognition system to prepare a dataset contaminated with speech recognition errors.
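A minimal sketch of the back transcription idea, not the implementation from the paper. The names `back_transcribe`, `changed_outcomes`, `toy_tts`, `toy_asr` and `toy_nlu` are hypothetical stand-ins introduced for illustration; a real setup would plug in actual TTS, ASR and NLU models.

```python
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]  # (reference text, back-transcribed text)


def back_transcribe(
    texts: Iterable[str],
    tts: Callable[[str], bytes],
    asr: Callable[[bytes], str],
) -> List[Pair]:
    """Synthesize each text to audio, then recognize it back to text."""
    return [(text, asr(tts(text))) for text in texts]


def changed_outcomes(pairs: Iterable[Pair], nlu: Callable[[str], str]) -> List[Pair]:
    """Keep the pairs for which the NLU result differs after back transcription."""
    return [(ref, hyp) for ref, hyp in pairs if nlu(ref) != nlu(hyp)]


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_tts(text: str) -> bytes:
    # Pretend the "audio" is just the encoded text.
    return text.encode("utf-8")


def toy_asr(audio: bytes) -> str:
    # Simulate one recognition error: "lights" is misheard as "likes".
    return audio.decode("utf-8").replace("lights", "likes")


def toy_nlu(text: str) -> str:
    # Trivial keyword-based intent classifier.
    return "lights_on" if "lights" in text else "unknown"


pairs = back_transcribe(["turn on the lights", "play music"], toy_tts, toy_asr)
impacted = changed_outcomes(pairs, toy_nlu)
```

Here `impacted` holds only the utterances whose NLU result changed after the simulated recognition error, whatever the direction of the change.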
Our paper "Back Transcription as a Method for Evaluating Robustness of Natural Language Understanding Models to Speech Recognition Errors" (@PSkorzewski, Marcin Sowański, @tzietkiewicz) just got accepted to the main track of #EMNLP2023!
And YRRSDS 2023 has started! Looking forward to all the roundtable discussions and keynotes from @verena_rieser, @malihealikhani & David Traum - stay tuned 👏👏