Building end-to-end machine learning products.

Joined June 2012
288 Photos and videos
Sambit Sekhar retweeted
Thanks to @silogenai (@AMDSiloAI) & Aku Rouhe for releasing Qwen1.5 7B Odia Instruct, advancing LLMs for the Odia language! Trained on Odia data and fine-tuned with 5 Odia instruct sets released by @OdiaGenAI & 1 English set. 👉 Model: huggingface.co/silogen/Qwen1…
Sambit Sekhar retweeted
🧠 GenAI Quiz: Test Your Knowledge & Win! 📅 Date: 3rd–5th July 2025 📍 Location: Bhubaneswar, India (Hybrid Mode) 🌐 odiagenai.org/workshop-2025 🚀 Get ready to challenge your brain and showcase your expertise at the GenAI Quiz, part of the Three-Day GenAI Workshop
Sambit Sekhar retweeted
💻 GenAI Hackathon: Build the Future with Generative AI! 📅 Date: 3rd–5th July 2025 📍 Location: Bhubaneswar, Odisha, India (Hybrid Mode) 🌐 Workshop Page: odiagenai.org/workshop-2025 📝 Register here: (to be announced)
Sambit Sekhar retweeted
🚀 GenAI Pitch Session: Share Your Big Idea in 5 Minutes! 📅 Date: 3rd–5th July 2025 📍 Location: Bhubaneswar, Odisha, India (Hybrid Mode) 🌐 Workshop Page: odiagenai.org/workshop-2025 📝 Register here: forms.gle/CrMMNqReJg5o5jmE7
Sambit Sekhar retweeted
We are excited to unveil the speaker lineup for the upcoming three-day Generative AI Workshop, jointly organized by @OdiaGenAI, AHRC @iitbbs, and the SCA, KIIT - @KIITUniversity. Workshop page: odiagenai.org/workshop-2025 Registration: forms.gle/bpt78iiVKe1BV3j18
Sambit Sekhar retweeted
We are delighted to announce that Prof. Amit Sheth will be the keynote speaker at the upcoming three-day Generative AI Workshop, jointly organized by @OdiaGenAI, the AI & HPC Research Center (AHRC) at @iitbbs, and the School of Computer Application at @KIITUniversity.
Sambit Sekhar retweeted
🌟 Glad to share our @indo_nlp workshop #coling2025 paper: OVQA: A Dataset for Visual Question Answering and Multimodal Research in Odia Language 📄 Paper: coling-2025-proceedings.s3.u… 🤖 HF Dataset: huggingface.co/datasets/odia… 🎉 Congratulations to all the authors! @sambit_ai

Sambit Sekhar retweeted
Excited to share that our open-source initiative @OdiaGenAI by our team @sambit_ai @swateekj @soumendrak_ @Babunisatya has been featured by @timesofindia & @timestechies! 🎉 Check out the article: timesofindia.indiatimes.com/… Thank you, @timestechies, for the coverage!
Sambit Sekhar retweeted
We're excited to share our innovative project addressing the critical challenge of deploying deep learning models in resource-constrained environments. 🌟 For more details, check the project page: odiagenai.org/fedcom
Sambit Sekhar retweeted
We are delighted to welcome our new researchers to the OdiaTreeBank Project – Odia Bhasa Bruksha (ଓଡିଆ ଭାଷା ବୃକ୍ଷ) ✨ Sourav Kumar Behera ✨ Srustiprava Satapathy ✨ Nirmal Naik ✨ Shashikanta Sahoo Website: odiagenai.org/odiatreebank
Sambit Sekhar retweeted
To address the lack of Indic LLM datasets for pre-training, we have released an Odia dataset of 300 million tokens for LLM pre-training. For more details, check out our blog post: 📄 Blog: odiagenai.org/blog/odiagenai… 🔗 HF Link: huggingface.co/datasets/Odia…
Sambit Sekhar retweeted
We had an outstanding quarter! From organizing an international workshop to launching a unique benchmarking project, all while keeping up with our regular deliveries, our mills are always running. Read our second newsletter for details: odiagenai.substack.com/p/new…
Sambit Sekhar retweeted
Had a great time talking at the inaugural ceremony of the @OdiaGenAI 3-day workshop, hosted with support from @KIITUniversity 🔥. Thank you @achyuta_samanta @Saranjit72 @Shantipriyapar3 @Babunisatya Sir. Some great talks are lined up; here is the schedule: odiagenai.org/workshop-2024 Also check out the cool work on the OdiaGenAI website: odiagenai.org/
Sambit Sekhar retweeted
We're excited to announce a poster session at our upcoming Generative AI workshop, focusing on Indic languages. Researchers in this field are invited to showcase their work and compete for the Best Poster award! For more details, visit our workshop page: odiagenai.org/workshop-2024
Sambit Sekhar retweeted
Registration is in full swing for our Generative AI workshop! Join us for an exciting series of events over three days. Link: odiagenai.org/workshop-2024 @Shantipriyapar3 @sambit_ai @KIITUniversity @soumendrak_ @odias_in_ai @guneetsk99 @ak_panda
Sambit Sekhar retweeted
In the new series of Small Language Models (SLMs) for Indic languages, @OdiaGenAI released Hindi-Gemma-2B-instruct, a 2-billion-parameter SFT model trained on 187k instruction sets in Hindi. Model: huggingface.co/OdiaGenAI-LLM… Dataset: huggingface.co/datasets/gune… Blog: odiagenai.org/blog/odiagenai…
Sambit Sekhar retweeted
Our Bangalore-based volunteers, @swateekj & @soumendrak_, organized a small meet & greet with one of our fellow researchers, @guneetsk99, who was visiting Bangalore for a while. We hope to hold regular meetups across cities in India and abroad. Join us at the next one!
Sambit Sekhar retweeted
One step forward: we upgraded our RAG-based AI Tutor Acharya, developed by the amazing @OdiaGenAI team for self-learning subjects in Hindi. It is based on our fine-tuned Mistral-7b Hindi LLM. @Shantipriyapar3 @sambit_ai @soumendrak_ @guneetsk99 youtube.com/watch?v=IlgcKK9X…
Sambit Sekhar retweeted
🔥 Releasing Indic Gemma 7B/2B instruction-tuned models on 9 Indian languages: Navarasa 🚀
We are thrilled to share 🌟 Navarasa, Gemma 7B & 2B instruction-tuned models in 9 Indian languages. Perhaps this is the first open Indic instruction-tuned model trained in 9 Indian languages, with English included as well.
🔥 Navarasa is a Gemma 7B & 2B SFT model built on the Gemma 7B & 2B base models. Last week we released the Telugu Gemma 7B/2B SFT model using curated Telugu datasets from Telugu LLM Labs, and we observed really good performance compared to Llama2-based models.
🌐 So we thought, why don't we scale up the Gemma 7B & 2B models to multiple Indian languages? We went ahead with testing tokenizers for the following 9 Indian languages and English:
1. Hindi
2. Telugu
3. Tamil
4. Malayalam
5. Kannada
6. Gujarati
7. Bengali
8. Punjabi
9. Odia
10. English
✨ We found the model to have the following capabilities (X represents any other Indian language):
1. Instruction and input in native X language; output in native X language.
2. Instruction and input in English, prompted to respond in native X language; output in native X language.
3. Instruction in native X language, input in English; output in native X language.
📊 Training details:
1. A single A100 machine, which took approx. 36 hours for the 7B model and 15 hours for the 2B model.
2. Platform: E2E Networks Limited
📝 We have shared details on datasets and examples of reasoning, translation, and question answering with context in our blog post.
🤝 The work would not have been possible without a huge community effort across different languages, and a huge shout-out to each contributor for their work over the past few months, showcasing the true power of OSS. The contributors for each language are:
1. Hindi: @SarvamAI
2. Telugu: Telugu LLM Labs
3. Tamil: @abhinand58
4. Kannada: @adarshxs and the team at Tensonic
5. Malayalam: Vishnu Prasad J
6. Odia: @OdiaGenAI
7. Gujarati: Adarsh Shirawalmath and the team at Tensonic
8. Punjabi: HydraIndicLM
9. Bengali: HydraIndicLM
👏 Special thanks to @unslothai for simplifying the training and inference processes!
🔜 As we release these models, the next step is to create romanized datasets, and we are working hard on evaluation datasets so that we can benchmark and improve on top of them.
🤝 This work is done in collaboration with @ramsri_goutham as part of the Telugu LLM Labs independent initiative.
Blog post: shorturl.at/jBQWY
Codebase: shorturl.at/elxBF
Sambit Sekhar retweeted
Excited to share Qwen_1.5_Odia_7B, the first pre-trained Odia LLM released from @OdiaGenAI. Go through the blog post for the details. Model: huggingface.co/OdiaGenAI-LLM… Dataset: huggingface.co/datasets/Odia… Blog: odiagenai.org/blog/introduci…