Joined March 2026
Thought LLMs were just prompt → response. Then I dug deeper: • Why are production systems so much faster? • Why does memory become a bottleneck? → Continuous batching → Paged attention → Speculative decoding. The harder problems begin underneath: latency, memory, throughput, serving.
Thought research reproduction meant: Read paper → run code → verify results. Reality 🥲 Working through BioReason (DNA Qwen GRPO): broken imports, version mismatches, flash-attention/NVCC issues, Slurm debugging, config tuning… Fix one thing → another breaks.
Coding with AI still feels broken. Repo in mind. Error visible. Yet we stop… to type out the context again. Why do we still code like this? Been building Voker: a repo-aware voice coding agent. AI going in the wrong direction? Just speak again. No restart. No prompt. No broken flow.
Went to verify a paper's results thinking, "Yeah! Let's do something fun today 🫠". Reality: - random dependency issues - glibc confusion - disk quota errors. Now my hair is messy, my eyes are sleepy, and here I am questioning my life choices 😭.
RAG can retrieve “relevant” chunks and still miss the exact answer. Similarity ≠ correct context. I compared vector RAG vs PageIndex (cached indexing → retrieval only). Different retrieval → different answers. Try it: vidhan66-compare-rag.hf.spac…

Everyone’s talking about OpenClaw. I tried building a 2-agent setup with a validator skill. The skill was: - registered - visible. But it was never applied. In LLM frameworks: registration ≠ execution. Full version: linkedin.com/posts/vidhan-ba…
Most people use routing in LLM systems for intent: → query type → pipeline But the real use is when the system isn’t confident: → retry → ask follow-ups → expand retrieval Routing = handling failure states, not just intent.
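A rough sketch of routing on failure states rather than intent. Everything here is an assumption made for illustration: the `State` fields, the `CONFIDENT` and `MAX_RETRIES` values, and the order in which the fallbacks are tried are hypothetical, not taken from a real system.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
CONFIDENT = 0.75
MAX_RETRIES = 2


@dataclass
class State:
    confidence: float         # system's confidence in its draft answer, 0..1
    attempts: int             # retries made so far
    retrieval_expanded: bool  # True once a wider retrieval pass has run
    missing_info: bool        # True if only the user can supply what is missing


def route(state: State) -> str:
    """Pick the next step from the failure state, not from the query intent."""
    if state.confidence >= CONFIDENT:
        return "answer"
    if state.missing_info:
        return "ask_follow_up"      # the user has to fill the gap
    if not state.retrieval_expanded:
        return "expand_retrieval"   # try a wider search before retrying
    if state.attempts < MAX_RETRIES:
        return "retry"
    return "ask_follow_up"          # out of automatic options, go back to the user
```

Intent routing would still pick the pipeline; a function like this would only decide what happens when that pipeline comes back unsure.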
Most notes are useless. You read → don’t get it → go to YouTube. I’m building: Upload notes/PDF → AI generates a short whiteboard teaching video. Trying to validate if this is actually useful. 1 min form: forms.gle/NhRa1CELwKTUG4k57 Brutal honesty > fake validation.
Most LLM apps fail here: They answer incomplete queries. Better approach: Detect missing fields → ask targeted questions → then answer Not the other way around.
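One way the detect → ask → answer order could look, as a minimal sketch. The travel-style fields and question texts in `REQUIRED_FIELDS` are hypothetical, and `extracted` is assumed to be whatever an upstream step managed to pull out of the user's query.

```python
# Hypothetical required fields, each mapped to the question that fills it.
REQUIRED_FIELDS = {
    "origin": "Where are you travelling from?",
    "destination": "Where are you travelling to?",
    "date": "What date do you want to travel?",
}


def missing_fields(extracted: dict) -> list:
    """Return required field names that are absent or empty in the query."""
    return [name for name in REQUIRED_FIELDS if not extracted.get(name)]


def next_step(extracted: dict) -> dict:
    """Ask targeted questions first; answer only when nothing is missing."""
    missing = missing_fields(extracted)
    if missing:
        return {"action": "ask", "questions": [REQUIRED_FIELDS[m] for m in missing]}
    return {"action": "answer", "questions": []}
```

The questions are tied to specific fields, so the user is asked only for what is actually missing instead of getting a generic "please clarify".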
Why refusal logic is necessary in LLM systems — One thing I noticed while building LLM systems: The model almost always answers even when it shouldn’t. That’s the core issue behind hallucinations.
For scaling: - ANN-based indexing becomes important as the dataset grows (I haven’t implemented this yet). Takeaway: Hallucination control isn’t just a model problem. It’s a system design problem. If you don’t define when to refuse, your system will always answer, even when it’s wrong.
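A minimal sketch of defining "when to refuse" at the system level. It assumes refusal is decided from retrieval similarity scores, which is only one possible rule; `REFUSAL_THRESHOLD`, `REFUSAL_MESSAGE`, and the `generate` callable are hypothetical names introduced for this example.

```python
# Hypothetical cutoff: chunks scoring below this are treated as unsupported.
REFUSAL_THRESHOLD = 0.6
REFUSAL_MESSAGE = "I don't have enough information to answer that."


def answer_or_refuse(scored_chunks, generate):
    """Refuse unless at least one retrieved chunk clears the threshold.

    scored_chunks: list of (similarity, text) pairs from the retriever.
    generate: callable that takes the context string and returns an answer
              (a stand-in for the actual model call).
    """
    supported = [text for score, text in scored_chunks if score >= REFUSAL_THRESHOLD]
    if not supported:
        return REFUSAL_MESSAGE  # the system decides to refuse; the model is never asked
    return generate("\n".join(supported))
```

The refusal happens before the model is called, so it does not depend on the model volunteering that it doesn't know.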
Next, I’ll share how I handle ambiguity and make LLMs ask better follow-up questions instead of answering too early. If you are also into this field, I would love to know your approach and where I can improve mine.
How my first LLM app turned into a maintenance nightmare — At the start everything lived in one file: • prompts • retrieval logic • API calls • tools • database queries It worked… until the system started growing.
But prompts were still embedded in code. That made iteration painful. Every prompt change required touching application logic. The final improvement: Externalizing prompts into YAML configs. After that, prompt changes no longer required modifying the core system.
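A small sketch of what keeping prompts out of application logic might look like. To stay runnable with only the standard library, the `PROMPTS` dict below stands in for the parsed contents of a hypothetical `prompts.yaml`; the prompt names, their wording, and `render_prompt` are all assumptions made for illustration.

```python
from string import Template

# Stand-in for a parsed, hypothetical prompts.yaml. With PyYAML installed,
# yaml.safe_load() on a file holding these two keys returns a dict like this.
PROMPTS = {
    "summarize": "Summarize the following text in $n_sentences sentences:\n$text",
    "classify": "Classify the intent of this query: $query",
}


def render_prompt(name: str, **variables: str) -> str:
    """Look up a prompt template by name and fill in its variables."""
    if name not in PROMPTS:
        raise KeyError(f"unknown prompt: {name}")
    # substitute() raises KeyError if a placeholder is left unfilled.
    return Template(PROMPTS[name]).substitute(**variables)
```

Application code then calls something like `render_prompt("classify", query=user_query)` and never contains prompt text itself, so rewording a prompt means editing the config only.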
Big realization: LLM apps behave more like systems engineering problems than simple scripts. Modularity becomes important much earlier than expected.