Remeinium

Tweets

Pinned Tweet

Remeinium @Remeinium

Mar 27

We measure every token because every token costs real capability. On a 122-million-character multilingual evaluation set, Abugida scripts get 2.6–4.4× less effective context than English in frontier LLMs. That structural penalty is infrastructure debt, not a data problem. #AI #WWHO

Remeinium @Remeinium

Mar 30

We invite Engineers, Developers, Researchers, and anyone willing to contribute to this cultural shift for the good of all humanity to join forces with us. The project is fully open-source, and your contribution will help elevate human civilization to its next big stage. #WWHO

Kusal Darshana @SuperposedK

Mar 30

Replying to @nikitabier

The #WWHO tokenization architecture we built at @Remeinium will continue this shift toward AI. It will break most of the linguistic barriers that current LLMs are limited by, allowing everyone to chat with LLMs in their native language while receiving the same quality of responses as they would in English. Costs will be mostly the same for everyone. Check it out on my profile, more updates are yet to come on this!

Remeinium retweeted

Kusal Darshana @SuperposedK

Mar 27

Replying to @OpenAI @deepseek_ai @AIatMeta @Meta

The fix is not “just add more vocabulary slots for Sinhala/Hindi or any Abugida script.” It is separating linguistic rules from statistical compression, before the model ever sees the tokens. The architecture exists. We are building it at @Remeinium so every language finally gets the same first-class treatment English already enjoys. What’s the biggest multilingual bottleneck you’ve hit in practice? #AI #LLM #Tokenization #NLP #WWHO
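To make "separating linguistic rules from statistical compression" concrete, here is a minimal sketch of a rule-based pre-segmentation step, using the Sinhala greeting ආයුබෝවන් as a sample. It is an assumption for illustration only, not the WWHO architecture: the hypothetical `split_clusters` helper simply attaches each combining mark to the preceding base letter, and it ignores conjuncts joined with ZWJ and other script-specific rules.

```python
import unicodedata


def split_clusters(text: str) -> list[str]:
    """Group each base character with the combining marks that follow it.

    Hypothetical helper: a rough stand-in for syllable-level segmentation,
    not a full grapheme-cluster algorithm.
    """
    clusters: list[str] = []
    for ch in text:
        # Unicode general categories Mn, Mc and Me are combining marks
        # (vowel signs, virama, and similar).
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# The Sinhala greeting ආයුබෝවන්, written as explicit code points.
word = "\u0d86\u0dba\u0dd4\u0db6\u0ddd\u0dc0\u0db1\u0dca"
clusters = split_clusters(word)
print(len(word), "code points ->", len(clusters), "clusters:", clusters)
```

A statistical merge step such as BPE could then operate on these units instead of on raw bytes or code points, so that a vowel sign is never split from its consonant.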

Remeinium @Remeinium

Mar 26

We are working on this at @Remeinium. WWHO, the novel tokenization architecture we have developed, will solve this issue effectively. We are excited to launch it as soon as possible; it is now in its final stage. Stay tuned for more updates. Happy #AI for all! #WWHO

Kusal Darshana @SuperposedK

Mar 26

Most people never see it, but frontier LLMs are quietly taxing over a billion users. Take a single, everyday Sinhala word: ආයුබෝවන්. OpenAI’s o200k_base tokenizer turns it into 8 tokens for 8 characters. That’s not compression. That’s fragmentation. #AI #LLM #token @OpenAI
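The character count in the post above can be checked with the standard library alone. The sketch below counts code points and UTF-8 bytes for the word; the figure of 8 tokens is taken from the post and is not recomputed here, and the name `TOKENS_REPORTED` is an assumption introduced for illustration.

```python
# The Sinhala greeting ආයුබෝවන්, written as explicit code points so the
# count does not depend on how the text was normalized when copied.
word = "\u0d86\u0dba\u0dd4\u0db6\u0ddd\u0dc0\u0db1\u0dca"

code_points = len(word)                 # Unicode code points
utf8_bytes = len(word.encode("utf-8"))  # each Sinhala code point is 3 bytes

# Token count as stated in the post for o200k_base (not computed here).
TOKENS_REPORTED = 8

tokens_per_code_point = TOKENS_REPORTED / code_points
print(f"{code_points} code points, {utf8_bytes} UTF-8 bytes, "
      f"{tokens_per_code_point:.2f} tokens per code point")
```

At one token per code point, the tokenizer is doing no merging at all for this word, which is the fragmentation the post describes.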

Remeinium @Remeinium

Feb 20

We are preparing for the launch of SGPE, the next big tokenization architecture. We hope it will make LLMs more understandable for any language. ~ Remeinium Research #AI #research #NLP #LLM #tokenization #SGPE

Remeinium retweeted

Kusal Darshana @SuperposedK

Feb 19

Can you imagine a day when everyone codes in their native language? For the goal of AGI, we should train LLMs to respond to anyone in any language with the same quality and depth of knowledge. Current LLMs perform ~2.5–4× worse for non-English users. Why? Tokenizers. They consume the context window inefficiently for complex languages, resulting in 2–4× less content in responses. To solve this, we are working on an architecture called #SGPE. If it succeeds, we can reduce this token tax by 40–60%, allowing LLMs to use the context window far more efficiently, especially for abugida scripts. Then we can move toward a universal tokenizer that serves everyone equally. #LLM #Tokenization #token #tax #AI #NLP @Remeinium

Andrej Karpathy

@karpathy

24 Jan 2023

The hottest new programming language is English
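The token-tax figures in the Feb 19 post can be turned into simple arithmetic. This sketch assumes that "reducing the token tax by 40–60%" means cutting the token count for the same text by that fraction, which is one possible reading rather than a confirmed definition. The function names and the 128,000-token window are hypothetical.

```python
def relative_cost_after(penalty: float, reduction: float) -> float:
    """Token cost relative to English after cutting the token count.

    penalty:   tokens used relative to English for the same content (e.g. 4.0)
    reduction: fraction of tokens removed by a better tokenizer (e.g. 0.5)
    """
    return penalty * (1.0 - reduction)


def effective_context(window_tokens: int, penalty: float) -> float:
    """English-equivalent capacity of a context window at a given penalty."""
    return window_tokens / penalty


WINDOW = 128_000  # hypothetical context window size, in tokens

for penalty in (2.0, 4.0):          # range quoted in the post
    for reduction in (0.4, 0.6):    # range quoted in the post
        after = relative_cost_after(penalty, reduction)
        print(f"penalty {penalty:.1f}x, reduction {reduction:.0%}: "
              f"{after:.2f}x cost, "
              f"{effective_context(WINDOW, after):,.0f} English-equivalent tokens")
```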

Remeinium @Remeinium

Feb 17

SGPE (Syllable-aware Grapheme Pair Embedding) is coming and it will have a huge impact on tokenization. #AI #LLM #Tokenization

Kusal Darshana @SuperposedK

Feb 17

x.com/i/article/202384443088…

Remeinium @Remeinium

Jan 4

We’re open-sourcing UgannA Siyabasa V2, our best Sinhala FastText embedding model. Built for research, products, and real-world NLP. Released under the Remeinium Open Model License. Sinhala deserves first-class language tech. Access it on Hugging Face. #AI #NLP #OpenSource

Remeinium @Remeinium

Jan 4

UgannA_SiyabasaV2
Model: huggingface.co/Remeinium/Uga…
Test live: huggingface.co/spaces/Remein…
API: esdocs.ai.remeinium.com

Remeinium/UgannA_SiyabasaV2 · Hugging Face

Remeinium @Remeinium

19 Dec 2025

The silence is over! #Codisor is coming before #AGI! We are excited to share that we are preparing to launch Remeinium Codisor to help every student and developer UNDERSTAND any codebase fast. Tired of just vibe coding? Here is your way to understand the unknown! 🧠 Happy UNDERSTANDING ✨
