🎙️ PhD Opportunity in AI & Audio Signal Processing 🇫🇷
Exciting PhD position at Inria focusing on speech enhancement using distributed microphone arrays, combining acoustics and machine learning.
📍 Location: Strasbourg (Inria Centre – Université de Lorraine)
💰 Salary: €2,300 gross/month
📅 Deadline: July 10, 2026
⏳ Duration: 3 years
👨‍🏫 Supervisor: Antoine Deleforge
🔬 Project Overview
This PhD is part of the French-German ANR-DFG AWESOME project (2026–2029), aiming to dramatically improve speech quality by leveraging distributed (ad-hoc) microphone arrays in real-world environments. The research combines inverse acoustics and cutting-edge machine learning (including diffusion models) to handle dynamic, noisy, and reverberant conditions.
🎯 Key Research Directions:
• Microphone array self-localization & calibration
• Acoustic scene understanding using reflections
• Sound field interpolation for dynamic environments
• Multichannel speech enhancement & dereverberation
• Diffusion-based generative models for acoustics
💻 What You’ll Do:
• Develop algorithms using Python / PyTorch
• Conduct experiments and collect acoustic data
• Publish research and present at international conferences
👤 Ideal Candidate:
• Master’s in ML, signal processing, CS, acoustics, or applied math
• Strong Python skills (PyTorch is a plus)
• Background in deep learning & signal processing
• Interest in audio, acoustics, and research
🌟 Why Apply?
• Work at a leading European AI research institute
• Interdisciplinary project bridging ML & acoustics
• Generous benefits (7 weeks leave, flexible work, training)
• International collaboration and strong research exposure
🎧 If you're passionate about AI, sound, and real-world applications like AR/VR, smart devices, and hearing tech, this is a fantastic opportunity!
👉 Apply here: phdscanner.com/opportunities…
#PhD #MachineLearning #SignalProcessing #AudioAI #DeepLearning #Inria #ResearchJobs #AI #SpeechProcessing
Made VANTA, a neural target speaker extraction system I’ve been building to isolate one specific voice from the messiest audio recordings.
Live demo: vanta.komalpreet.me
Code: github.com/Komalpreet2809/Va…
Current audio separation tools are a blunt instrument. When dealing with audio files, users often struggle with:
• Overlapping voices
• Loud background chatter
• Complex room acoustics
• Standard noise cancellation tools that blindly suppress "noise"
• Systems that don't know who to focus on when multiple people are speaking
Most AI audio tools either act like a black box, aggressively muffle everything, or leave the target speaker sounding metallic and robotic. I wanted to build something different.
Vanta is an informed audio separator. Instead of guessing what to suppress, it uses a 5-second reference clip of your target speaker to learn their exact voice fingerprint. It then scans the messy mixture and extracts only that person, returning a crystal-clear track of their voice, plus a residue track of everything it removed.
What it can do:
• Ingest a 5-second reference voice fingerprint
• Isolate the target speaker from highly noisy mixtures
• Mask out interfering voices (even at similar volumes)
• Preserve the natural phase of the audio (no STFT/robotic artifacts)
• Generate a residue track of the removed noise/speakers
• Operate robustly across different simulated room environments
The main principle behind the project is: More signal. More informed extraction. Zero metallic artifacts. Less blind noise cancellation.
The tech stack:
• PyTorch for the core ML architecture and training
• Time-domain 1-D convolutions to avoid spectrogram artifacts
• Frozen ECAPA-TDNN (VoxCeleb) for robust voice fingerprinting
• Temporal Convolutional Networks (TCN) with speaker conditioning
• FastAPI for the backend API
• Next.js + Tailwind for the frontend shell
• Hugging Face Spaces & Vercel for deployment
One of the biggest goals is audio purity. Your isolated audio shouldn't sound like it's trapped in a tin can.
• Time-domain architecture: Operates directly on the raw audio waveform.
• SI-SDR optimization: Maximizes waveform fidelity independent of volume differences.
• Continuous conditioning: Voice fingerprint injected at every block to never lose the target speaker.
• Explainable separation: Outputs a separate residue track so you can hear exactly what was removed.
If you’re an ML engineer, audio researcher, developer, or someone who has felt the pain of noisy recordings and overlapping voices, I’d love your feedback, ideas, issues, PRs, or even just a star ⭐
#OpenSource #MachineLearning #DeepLearning #AudioAI #SpeechProcessing #PyTorch #FastAPI #NextJS #SpeechSeparation #TargetSpeakerExtraction
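For readers unfamiliar with the SI-SDR objective and the residue track mentioned in the post, here is a minimal sketch in plain Python. It follows the commonly used textbook definition of scale-invariant SDR and assumes the residue is simply the mixture minus the extracted voice; the function names `si_sdr` and `residue` are hypothetical and are not taken from the VANTA code, which may differ in both respects.

```python
import math


def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB between two equal-length waveforms.

    Textbook definition, written as an illustration only (not VANTA's code).
    """
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    # Remove the mean so a constant offset does not affect the score.
    est_mean = sum(estimate) / len(estimate)
    tgt_mean = sum(target) / len(target)
    est = [e - est_mean for e in estimate]
    tgt = [t - tgt_mean for t in target]
    # Project the estimate onto the target: the best rescaled copy of the target.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt)
    alpha = dot / (tgt_energy + eps)
    projection = [alpha * t for t in tgt]
    # Whatever the projection does not explain counts as distortion.
    distortion = [e - p for e, p in zip(est, projection)]
    signal_energy = sum(p * p for p in projection)
    distortion_energy = sum(d * d for d in distortion)
    return 10.0 * math.log10((signal_energy + eps) / (distortion_energy + eps))


def residue(mixture, estimate):
    """Assumed residue track: what is left after subtracting the extracted voice."""
    return [m - e for m, e in zip(mixture, estimate)]


if __name__ == "__main__":
    # Synthetic stand-ins for a target voice and an interfering speaker.
    target = [math.sin(0.1 * n) for n in range(200)]
    interferer = [0.3 * math.sin(0.37 * n + 1.0) for n in range(200)]
    mixture = [t + i for t, i in zip(target, interferer)]
    # A pretend extraction that still leaks 10% of the interferer.
    estimate = [t + 0.1 * i for t, i in zip(target, interferer)]
    print("mixture SI-SDR (dB): ", si_sdr(mixture, target))
    print("estimate SI-SDR (dB):", si_sdr(estimate, target))
```

Because the estimate is projected onto the target before the energies are compared, multiplying the estimate by any nonzero gain leaves the score essentially unchanged, which is why a loss built on it does not reward or punish loudness. Training code would typically minimize the negative of this value on batched tensors rather than Python lists.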
🚨 New Paper Alert! speech-emotion: a multilingual & multimodal toolkit for emotion recognition 🎙️💬 ✅ Combines audio + text → better performance than unimodal models ✅ Supports Spanish 🇪🇸 & English 🇬🇧 SoftwareX (Elsevier) sciencedirect.com/science/ar… #NLPoc #SpeechProcessing

🎓 PhD Thesis Defended (Apr 27, 2026) Green & Inclusive Speech Processing (bias, sustainability, Indian languages) 🏆 ACL’25 (Outstanding Paper), NAACL’24/’22, LREC’26 🙏 Grateful to @DrAnubhaGupta 🔬 Open to academic opportunities #SpeechProcessing #ResponsibleAI #IIITD
🚨 Opportunity: Junior Research Fellow (JRF) in Speech & Language Processing 🎙️🧠
We are looking for motivated candidates to join as a Junior Research Fellow (JRF) under an ANRF-funded project at IIIT Dharwad, in the area of Speech and Language Processing. This position is ideal for individuals interested in working on cutting-edge research involving AI/ML for speech and language domains.
🔹 Key Details:
Duration: 3 years (fully funded)
Stipend: ₹37,000 per month + 16% HRA (as per latest ANRF norms)
Location: IIIT Dharwad
Research areas: Speech processing, language models, AI/ML
Opportunity to work on impactful and publishable research
🎓 Academic Opportunities: Candidates interested in pursuing MTech or PhD under this project are strongly encouraged to apply. This position can be aligned with your higher studies.
🔹 Who should apply?
Strong background in machine learning / deep learning
Experience or interest in speech and language technologies
Motivated to pursue research and publications
📩 How to apply: Interested candidates can reach out with their CV and a brief statement of interest at nataraj@iiitdwd.ac.in
Please share this with anyone who might be interested.
#JRF #PhD #MTech #SpeechProcessing #NLP #AI #ResearchOpportunity #ANRF
🎉 Happy to share that "SpeechMapper: Speech-to-text Embedding Projector for LLMs" has been accepted to #ICASSP2026! This work was done during my internship at NAVER LABS Europe with Marcely Zanon Boito and Ioan Calapodescu. #MultimodalAI #SpeechLM #SpeechProcessing Thread 1/4
Three papers from our unit have been accepted to ICASSP 2026! See you in Barcelona for discussions! #ICASSP2026 #ICASSP #SignalProcessing #AI #SpeechProcessing
7 Nov 2025
3/3 Balancing reduces false negatives, vital for screening. Congrats to @XavierSanc2433, PhD student and first author, for his hard work. #MentalHealth #SpeechProcessing #EMD #IMF #MachineLearning
Huge congratulations to Dr. Shalini Sahay, Professor in the EC Department at SIRT, on publishing her book, "Optimization Analysis of Speech Processing Based Alzheimer's Disease." #BookPublished #AlzheimersResearch #SpeechProcessing #SIRT #ECEngineering
Tomorrow (Sept 25, 11:00–12:00 EST), our #ConversationalAI Reading Group hosts @Themos Stafylakis (Athens Univ. & Omilia). Talk: Advances in Speaker Recognition: Pruning, Deepfake Detection & Learning w/o Temporal Labels Info: poonehmousavi.github.io/rg.h… #AI #SpeechProcessing

1 Aug 2025
We are proud to share that the paper “IndicSynth: A Large-Scale Multilingual Synthetic Speech Dataset for Low-Resource Indian Languages” from @SBILabIIITD has received the Outstanding Paper Award at ACL 2025 (@aclmeeting), one of the most prestigious conferences in computational linguistics and natural language processing. IndicSynth introduces 4000 hours of synthetic speech from 989 target speakers, including 456 females and 533 males, across 12 Indian languages to facilitate multilingual audio deepfake detection and anti-spoofing research. This recognition is the result of the dedicated work of PhD scholar Divya Sharma, who was guided by Prof. @DrAnubhaGupta. Divya’s technical rigour, clarity of thought, and confident presentation at ACL were central to the success of this work. Her presentation at ACL and engagement during the Q&A demonstrated the calibre of a confident and capable NLP researcher. We also acknowledge the valuable contributions of undergraduate student Vijval Ekbote, whose support strengthened the project. Congratulations to the entire team at SBILab for this important recognition and for driving impactful research in Indian language technologies. #SBILab #IIITD #NLProc #ACL2025NLP #SpeechProcessing #MachineLearning #MultilingualAI #SyntheticSpeech #DeepfakeDetection #ACL2025 #ResearchExcellence
Our pick of the week by @FBKZhihangXie: "Adversarial Speech-Text Pre-Training for Speech Translation" by Chenxuan Liu, Liping Chen, Weitai Zhang, Xiaoxi Li, Peiwang Tang, Mingjia Yu, Sreyan Ghosh, and Zhongyi Ye (ICASSP 2025) #speech #speechprocessing #speechtech #translation
🚀 AdvST: Adversarial training aligns speech and text distributions without parallel data! Combines adversarial learning + hidden-state swapping to fix length mismatch & boost low-resource speech translation. ieeexplore.ieee.org/document…
TCS Research is pleased to be a Silver Sponsor of the Summer School on Speech Signal Processing (S4P). This program offers an in-depth exploration of speech technology and automatic speech recognition. Register here: bit.ly/3G7sfer #TCSResearch #S4P2025 #SpeechProcessing #AI
📢 The Jelinek Summer Workshop on Speech and Language Technology (JSALT 2025) starts today! 👉 More info: eloquenceai.eu/event/jelinek… #ELOQUENCEAI #SpeechProcessing #SpeechTechnology #Workshop
Excited to share our new paper in B-ENT Journal on how the brain responds to speech across different modalities (audiovisual, auditory, visual) using fNIRS! Explore the localization of cortical responses in normal-hearing adults #fNIRS #SpeechProcessing #Neuroimaging
Our pick of the week by @FBKZhihangXie: "Bridging Speech and Text Foundation Models with ReShape Attention" by @TakatomoKano, @chenwanch1, @shinjiw_at_cmu, et al. (#ICASSP2025) ieeexplore.ieee.org/document… #Speech #FoundationModel #SpeechProcessing
ReShape Attention bridges speech & text models without extra parameters. Achieves 8.5% BLEU in translation by leveraging acoustic cues, outperforming cascade/E2E methods. Efficient & scalable. Check the paper by Kano et al. (2025) at: ieeexplore.ieee.org/stamp/st….
"AI Decodes Brain’s Speech & Language Processing" 🧠🎙️🚀 ✅ Brain processes speech & language in a unified hierarchy 🔄 ✅ Whisper + ECoG reveal real-time neural mapping 🎙️📊 ✅ AI accurately predicts brain activity across regions 🤖🔬 ✅ Game-changer for neuroscience & AI! 🚀 🔽 Details in the thread below! #NeuroAI #SpeechProcessing #BrainMapping
4 Mar 2025
🧠🗣️ New research in eLife explores speech coordination & brain dynamics using intracranial recordings! Read more: doi.org/10.7554/eLife.99547.… #Neuroscience #SpeechProcessing #BrainDynamics

7 Feb 2025
IIT Dharwad is currently inviting applications for a range of project positions in the Speech Processing Lab led by Prof. Mahadeva Prasanna. This opportunity is ideal for those interested in progressing their careers in Speech and AI and acquiring practical experience in innovative projects. For further details on the available positions and how to apply, please visit: iitdh.ac.in/other-recruitmen… #IITDharwad #Hiring #AI #SpeechProcessing