Joined February 2026
Got my stipend today. Not much yet but working to make it bigger.
Non-technical client: We need real-time video processing.
Me: We should use NVIDIA…
Non-technical client: Let’s use TPUs to train and deploy our system. They’re free on Google Colab.
Me: Yeah, but you won’t be able to host it on-prem.
Non-technical client: No, we can. We’ll just buy the TPUs.
Me: Nice.
Deployed a computer vision model using ONNX recently. Thought it’d be straightforward. It wasn’t.
First issue: outputs didn’t match PyTorch. Not completely broken, but enough drift to make predictions unreliable. Took a while to realize some ops weren’t translating cleanly during export.
Then hit unsupported operators depending on the opset. Had to experiment with different opset versions and tweak the model a bit just to get a valid graph.
Performance was another surprise: ONNX Runtime on CPU was slower than expected. It only improved after switching execution providers and testing with TensorRT.
Also learned the hard way that dynamic shapes can silently mess things up if you’re not careful.
Ended up using opset tuning, graph simplification, and better runtime configuration to stabilize things.
Main takeaway: training the model is the easy part. Deployment is where things actually break.
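For anyone chasing the same PyTorch-vs-ONNX drift, here is a minimal sketch of the kind of parity check that catches it early. It is not the exact check used in this deployment: the tolerances, the helper names (`max_abs_diff`, `outputs_match`), and the logits are all made up for illustration. In a real setup the two lists would come from the PyTorch model and the ONNX Runtime session on the same input.

```python
def max_abs_diff(reference, candidate):
    """Largest elementwise absolute difference between two flat output vectors."""
    if len(reference) != len(candidate):
        # A length mismatch often points at a dynamic-shape problem in the export.
        raise ValueError("output lengths differ")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_match(reference, candidate, rtol=1e-3, atol=1e-5):
    """True when every element satisfies |r - c| <= atol + rtol * |r|.

    rtol/atol are assumed defaults, not recommended values; tune per model.
    """
    if len(reference) != len(candidate):
        return False
    return all(
        abs(r - c) <= atol + rtol * abs(r)
        for r, c in zip(reference, candidate)
    )


# Hypothetical logits for one input (illustrative numbers only).
torch_logits = [2.3101, -0.4523, 0.9817]    # reference: PyTorch
onnx_logits = [2.3102, -0.4521, 0.9816]     # exported graph, small numeric noise
drifted_logits = [2.2410, -0.3990, 1.0502]  # exported graph, real drift

if __name__ == "__main__":
    print("clean export ok:", outputs_match(torch_logits, onnx_logits))
    print("drifted export ok:", outputs_match(torch_logits, drifted_logits))
    print("worst diff:", max_abs_diff(torch_logits, drifted_logits))
```

Running a check like this on a handful of real inputs after every export (and again after changing opset or execution provider) makes the drift visible before it reaches predictions.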
Built a real-time intrusion detection pipeline using YOLOv8n:
• defined an ROI polygon (red zone)
• detect person bboxes per frame
• assigned IDs via simple tracking (centroid/IoU) → green if outside, red if inside
• added basic debounce to avoid alert spam
• frame skipping for latency
• still tuning for shadows, occlusion, and camera-angle edge cases
Check it out below.
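A rough sketch of the zone check, IoU matching score, and debounce described above, in plain Python with no detector attached. Assumptions that are not from the original pipeline: boxes are `(x1, y1, x2, y2)` in pixels, the bottom-center of the box stands in for the person's feet, and the `DebouncedAlert` class with its `min_frames` / `cooldown_frames` values is hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as [(x, y), ...]?"""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def foot_point(box):
    """Bottom-center of the box (assumed proxy for where the person stands)."""
    x1, _, x2, y2 = box
    return ((x1 + x2) / 2, y2)


def box_color(box, zone):
    """Red if the foot point is inside the zone, green otherwise."""
    return "red" if point_in_polygon(foot_point(box), zone) else "green"


class DebouncedAlert:
    """Fire once after `min_frames` consecutive in-zone frames, then stay
    quiet for `cooldown_frames` frames to avoid alert spam."""

    def __init__(self, min_frames=3, cooldown_frames=30):
        self.min_frames = min_frames
        self.cooldown_frames = cooldown_frames
        self.streak = 0
        self.cooldown = 0

    def update(self, inside):
        self.streak = self.streak + 1 if inside else 0
        if self.cooldown > 0:
            self.cooldown -= 1
        if self.streak >= self.min_frames and self.cooldown == 0:
            self.cooldown = self.cooldown_frames
            return True
        return False


if __name__ == "__main__":
    zone = [(100, 100), (300, 100), (300, 300), (100, 300)]  # hypothetical ROI
    print(box_color((150, 120, 210, 280), zone))
    print(box_color((400, 120, 460, 280), zone))
```

In a tracker, `iou` would score each new detection against the previous frame's boxes to carry IDs forward, with one `DebouncedAlert` kept per track ID.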
Another test case: people in a bank.
Most drone data is useless. Mine isn’t.
I built a drone-based indexing system that turns raw aerial footage into searchable intelligence.
• BLIP → generates dense image captions
• Custom scoring engine → ranks every keyword by relevance
• spaCy → auto-generates and validates tags
• Geospatial filtering → query data by location, not just text
Stack: Python (Flask), JS, full custom pipeline.
This isn’t a demo. It’s a step toward making unstructured visual data actually usable.
If you’re building in CV, geospatial, or defense, this is where things are going.
Check out the project: repo1-production-5fd4.up.rai…
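To make "query by location, not just text" concrete, here is one way a tag-plus-radius query could look. This is an illustrative sketch, not the project's actual code: the frame records, the field names (`id`, `lat`, `lon`, `tags`), and the `search` helper are all hypothetical, and the distance uses the standard haversine formula on a spherical Earth.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def search(frames, keyword, center, radius_km):
    """Frames tagged with `keyword` that lie within `radius_km` of `center`."""
    lat, lon = center
    return [
        f for f in frames
        if keyword in f["tags"]
        and haversine_km(lat, lon, f["lat"], f["lon"]) <= radius_km
    ]


# Hypothetical indexed frames (made-up coordinates and tags).
frames = [
    {"id": "a", "lat": 12.9720, "lon": 77.5950, "tags": {"truck", "road"}},
    {"id": "b", "lat": 13.0827, "lon": 80.2707, "tags": {"truck"}},
    {"id": "c", "lat": 12.9725, "lon": 77.5940, "tags": {"building"}},
]

if __name__ == "__main__":
    hits = search(frames, "truck", center=(12.9716, 77.5946), radius_km=1.0)
    print([f["id"] for f in hits])
```

A linear scan like this is fine for small collections; a larger index would more likely lean on a spatial index or a geo-enabled database.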
Computer vision engineers be like: “Model is 99% accurate.”
Meanwhile, YOLO confidently detects a toaster as a dog because it saw two circles and decided “close enough.”
Goodbye Firebase Studio, 2025–2027.
Classic Google arc: experiment → validate → absorb.
Now it folds into Google AI Studio Antigravity.
If you built on it, you felt it.
Why most Vision-Language Models hallucinate (and how researchers are fixing it)
Lately I’ve been going deep into Vision-Language Models (VLMs). Models like
• GPT‑4V
• GPT‑4o
• Gemini
• LLaVA
look insanely impressive. But there’s a big problem most people miss: VLMs hallucinate. A lot.
The biggest insight after studying VLMs: scaling the model does NOT automatically fix hallucinations.
What matters more:
• grounding
• perception modules
• verification loops
• structured reasoning
Curious: if you’re building with VLMs, what techniques are you using to reduce hallucinations? Grounding? Detectors? RAG for images? Drop your ideas below.