Building AI for Northeast India's underserved languages. Khasi, Garo, Mizo, Pnar & more. Open models on HF. #NEindicLLM

Joined November 2025
Northeast AI Talks is back, and this session asks a question that doesn't get asked enough: is the language data we're building AI on actually ready? Not just "does it exist," but is it licensed correctly? Does it cover the right domains? Is the metadata there? Are privacy concerns addressed? For Indian languages, especially low-resource ones across Northeast India, this gap is real and it's holding back the entire language AI ecosystem. Session 2 brings in Saesha Parekh from @CivicDataLab to lay out a clear framework for what AI-readiness actually looks like and what it takes to get there. 📅 10th June 2026 | 4:00 PM to 5:00 PM | Virtual | Register: forms.gle/3nvvXvVsC8cwXk15A #NortheastAI #LanguageData #AIReadiness #LowResourceNLP #IndianLanguages #NLP #MWireLabs
Small steps toward speech AI that actually understands us. Recently, we finished recording Khasi voices in a studio: real speakers, real language. This is how regional AI gets built. Thank you to our speaker and the team for making it happen. 🙏 #Khasi #NortheastIndia #MWireLabs #SpeechTech #Meghalaya #khasiai
Both keynotes for NortheastGenAI 2026: this Friday, May 29. 🎙️ Bonaventure F. P. Dossou — McGill & Mila | Masakhane 🎙️ Dr. Prabhat Kumar Bharti — Bennett University Virtual · Open to all Register: northeastgenai.github.io #NortheastGenAI2026 #NortheastIndia #LowResourceNLP #AI
Northeast AI Talks. A free virtual series on AI, language & culture for Northeast India. Session 1 → 29th April, 11AM IST Dr. Chelmelyne Dhar — I Ktien Pnar: A Basic Sketch @DaniKRaju — Speech Retrieval for an UNWRITTEN LANGUAGE Free. Open to all. 🔗 mwirelabs.com/nat #NortheastAI #LowResourceNLP #NLP #AIforGood
Northeast India. 200 languages. Extraordinary biodiversity. Centuries of knowledge. Almost none of it in global AI research. Can AI change that? NortheastGenAI 2026 is an open experiment. Submissions open now. Deadline May 15. northeastgenai.github.io #northeastgenai #AI #NLP #NortheastIndia
We invite AI-assisted research across three tracks: T1 - Language, Culture & Heritage T2 - Society, History & Anthropology T3 - AI & Technology for NE India Exploratory work and negative results welcome. Free to submit.
AI once ignored Garo (spoken by ~1.2M people in Meghalaya). Not because it isn’t advanced, but because it had never seen the language. With the VAANI dataset from @artparkindia, we fine‑tuned Whisper. From unusable → deployable. Near‑human transcription, live‑ready. This is what inclusive AI looks like. More languages. More breakthroughs. More voices back on the map. #InclusiveAI #LowResourceLanguages #MWireLabs #Meghalaya
Garo, Kokborok, Meitei, Mizo, Naga, Nyishi, Assamese, Khasi, Pnar: all on a workshop table at #LoResLM #EACL2026 in Rabat. Millions of speakers. Nearly invisible in NLP research. Long road ahead, but being in the room matters. Grateful to the organizers! #NortheastIndia #MWireLabs
The table from the workshop proceedings - NE-BERT covering 9 NE languages under Language Modelling. Small step, but it's there. #NEBERT #LowResourceNLP
NE-OCR just dropped! A Unified OCR for Northeast India. • 94.99% mean ChA (peak 98.85% on Khasi) • Beats EasyOCR, Tesseract & others on 9/12 pairs • Lowest inference time: only 17.2 ms HF: huggingface.co/MWirelabs/ne-… Built in Shillong, Meghalaya #NEOCR #NortheastIndia
The Northeast India AI Research Fellowship! Five fellows. One mission: build AI infrastructure for languages spoken by millions, yet invisible to modern AI. Cohort 01 begins. Follow the progress: mwirelabs.com #AI #NLP #LowResourceLanguages #meghalaya #NortheastIndia
MWire Labs retweeted
Design and Develop in India. Deliver to the World. Deliver to Humanity.
MWire Labs retweeted
Yesterday, @mwirelabs was recognised by Manipur CM Yumnam Khemchand Singh at the launch of DIGI-SAPNE 2.0, a MeitY-backed startup acceleration program for Northeast India by AIC-SMUTBI. Building enterprise AI & use case-specific SLMs for NE India. #NortheastIndia #mwirelabs
Launching Northeast Langchive, MWire Labs' open access platform for scholarly books on Northeast India's indigenous languages! NOW OPEN: Call for Chapters "Digital Futures for Indigenous Languages" Deadline: Feb 10, 2026 No fees | Open access northeastlangchive.org #mwirelabs
MWire Labs retweeted
Software: KhasiBERT: Foundational Language Model for Khasi: KhasiBERT is the first open-source AI language model trained exclusively on Khasi-language corpora. Developed by MWire Labs, it supports civic NLP tasks such as translation, summarization, and… dlvr.it/TN56zh
23 Nov 2025
KREN-M is now live on Aikosh by IndiaAI. A compact bilingual LLM built for Meghalaya: code-mixing, translation, and civic QA across local languages. Aikosh: aikosh.indiaai.gov.in/home/m… HF: huggingface.co/MWirelabs/Kre… #Mwirelabs #northeastindiaai #indiaaiimpactsummit2026
18 Nov 2025
In a world racing toward trillion-parameter models, over 40 million people in Northeast India still have zero AI that speaks their language. Khasi, Garo, Mizo, Kokborok and more: rich histories, yet almost invisible to modern LLMs. That silence ends here! #NEindicLLM #mwirelabs