Working on a video question type for Alfie, in addition to multiple-choice, multiple-correct, and free-text.
I've tried several STT models, including Whisper (via Groq & OpenAI), AssemblyAI, and Google, but the accuracy is poor, particularly in slightly noisy environments.
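
One way to put a number on "poor accuracy" would be word error rate (WER) against a hand-typed reference transcript. Below is a minimal sketch, not what I'm actually running: `word_error_rate` and the sample strings are made up for illustration, and the model names in the dict are placeholders rather than real provider output.

```python
import string


def _tokens(text: str) -> list[str]:
    # Lowercase and drop ASCII punctuation so "Sat." and "sat" compare equal.
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = _tokens(reference)
    hyp = _tokens(hypothesis)
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Standard Levenshtein dynamic programme, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical data: a hand-typed reference and invented transcripts.
reference = "the mitochondria is the powerhouse of the cell"
transcripts = {
    "model_a": "the mitochondria is the powerhouse of the cell",
    "model_b": "the mitochondria is a power house of cell",
}
for name, text in transcripts.items():
    print(f"{name}: WER = {word_error_rate(reference, text):.2f}")
```

Running the same clips (quiet and slightly noisy) through each provider and comparing WER would show whether the gap is really noise-related or across the board.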