Joined March 2022
19 Photos and videos
Pinned Tweet
20 Dec 2025
documenting our journey so far for solving hyper-personalization in AI Agents.. a ver crucial step towards solving for AGI.. All this started as a vague project idea with @venky1701 ..making a website that helps researchers read research papers more easily.. but not like Anara or any other websites like that.. The idea was to help researchers in a very natural way..to keep this essence of reading papers..not just drop them AI summary which most of these websites do..basic RAG systems.. Our idea overtime evolved.. - Making Memora initially.. a SDK that gives memory to AI Agents but the engineering behind was not upto the mark nor it was well thought.. - We slowly re-thought the whole idea and landed on building infrastructure for the next generation of AI Agents. - Hence MetaCognition Labs was born.. we are very close to solving for self-evolving memory substrate. - Our whitepaper and public pre-print are ready and which be shared soon. We at MetaCognition Labs, are solving for the most crucial part inside human brain... Memory. Crazy to have @venky1701 @sauhard_07 in this venture. Stay tuned for the biggest drop.
1
13
887
A very big slap towards Consumerism. This just showcases how some govts and greedy politicians wants to regulate and control the AI space for their own benefits. @ylecun & many other AI researchers faith towards building a world where anyone can access and build upon, really gets a smack down here.
As a result of a US government directive, we are suspending access to Claude Fable 5 for all users. You can continue to use all other Claude models. Here’s what this means for you: Across Claude products, new sessions will run on your selected default model or Opus 4.8, and existing Fable 5 sessions will end with an error. On the Claude Platform, requests to Fable 5 will also return an error. Please update your integrations to other Claude models. We know this is a disruption to your workflows; we appreciate your patience and support.
2
20
Priyam retweeted
hi, I'm one of the co-founders of @metacognitionai and we're developing infrastructure for all kinds of AI personas. "so is it just storing huge amt. of data?" -- no, the entire system dreams of scenarios that you would want should've happened and drives the memory landscape from there. but how is it different than using a context layer? context layers are super dumb. a system that understands the interaction of a human with their environments, their emotions and their facts shapes the memory landscape. last night I just smoke tested and the results were smoking hot tbh. NGL, this is way much fun to build as well.
3
3
18
403
Priyam retweeted
I want to offer some unsolicited advice to computer vision researchers jumping into robotics. Don't focus too much on VLMs, VLAs etc. That's fine, but the real action is at the sensorimotor level. Most of the open problems in robotics are in manipulation, which is about hand-object interaction, and contacts and forces are central. Proprioception and tactile sensing are as important as vision. Don't get seduced by cherry-picked demos. You can't do robotics without doing robotics.
72
394
3,147
473,603
Have you ever noticed that, if you hum some part of a song , and suddenly the whole song actually plays in your brain ? Even if your ears aren't registering any external noise neither any sensory input is working. How does it happen ? Human Brain is really good in autonomous pattern generation. Traditional neural networks are static, and training RNNs to replicate these dynamical patterns by fine-tuning every connection is notoriously difficult due to complex temporal dependencies To solve this, there is a concept called Reservoir Computing, which is a method of building recurrent neural network in which the connections are untrained and randomly connected. Instead of the traditional way in which we try to fine-tune a RNN which is chaotic, Reservoir Computing model proposes a dynamical system that maintains an echo of the inputs. The mechanism is simple : - A large, vast set of interlocked neurons get a driving signal and when this signal ( mostly a cosine signal z(t) ) is passed through the collection of neuron, produces complex patterns and signals, which fluctuate in complex way creating a library of spatiotemporal basis. - A final readout layer observes the activity of all neurons in the reservoir. This is the only part of the system that is 'trained' - By adjusting the weights (volume knobs) of this readout layer, the system performs linear regression to combine the random internal states of the reservoir into a specific, desired output signal. The whole idea and computation is inspired from Fourier's Heat Diffusion Understanding in Metals, and Fourier Basis System. The Reservoir Activity form a rich spatiotemporal basis library which can be then used to build patterns. arxiv.org/html/2412.13212v1

1
40
We built a century of neuroscience, and then an entire AI industry, around the cells we could measure. Neurons are large and electrically active. They respond beautifully to the tools we had. So we mapped every connection, built the connector, scaled the graph, and called it intelligence. Three papers dropped in Science last May, three labs, three different species, and they all said the same thing. The cells we ignored, the astrocytes, are the ones actually running the gain control, the attention switch, the memory consolidation, and the state transitions we call mood, fear, and arousal. Every ADHD drug, every antidepressant, every PTSD medication targets this axis. The pharmacology was always right. The cell type attribution was wrong. And every neural network ever built is missing the third terminal. At @metacognitionai we are extensively working on this third terminal, understanding the bridge between memories and emotions and long horizon understanding of events. Wrote about what this actually means for AI architecture and why I think it matters for where we take DAHN next. prigoistic.me/blog/astrocyte…
1
3
82
Priyam retweeted
@ylecun recently said and I quote, "I do not understand how you can even think of building an agentic system without an agentic system having the ability of predicting the consequences of its actions. And an LLM doesn't do that." " If you really want to build reliable agentic systems, they absolutely have to be able to predict the consequences of their actions so that they can plan a sequence of actions to do something, first of all, to fulfill the task that they are being asked to fulfill, but also, perhaps, to guarantee some safety guardrails. " perhaps what we actually see today as "completely agentic systems " or even "agentic AI" is just a fancy LLM call with some spices of band-aid context management , and tool harnesses which works but its not autonomous. think again.
1
2
34
Priyam retweeted
Very good advice on self-improving agents. (bookmark it) This is something I am seeing in my own experiments with coding agents and harnesses for long-horizon tasks. What I have found is that stronger models do not always evolve better agents. The current believe in self-evolving agents is that a bigger model writes better prompt and skill edits, so devs put their best model in the evolver seat. New research shows that intuition is mostly wrong. The work separates two abilities that usually get conflated. Producing harness updates stays flat across model capability, so Qwen3.5-9B writes edits roughly as good as Claude Opus 4.6. Benefiting from those updates follows an inverted-U that peaks at mid-tier models, while weak models fail to even activate the edits and strong models have little headroom left. This is important to understand as it tells you where to spend. Put a cheap model on the evolver and your expensive model on the solver, because the gains land solver-side, not evolver-side. Paper: arxiv.org/abs/2605.30621 Learn to build effective AI agents in our academy: academy.dair.ai/
37
109
746
55,875
some thoughts about the recent paper titled "Hybrid Neural World Models " by @lossfunk The paper actually introduces a self- consistency estimation over uncertainty situations or experiments. i didnt go into the paper details completely but, skimmed over it, read the documentations and the blog written about it and i must say, the works is novel and physics-agnostic ( for trained experiments ), the architecture uses 2 types of paths during deployment : - path 1 : single shot - path 2 : two step halfway I tried comparing it with VJEPA because Joint Embedding Predictive Architectures work on a fairly simple principle. You have embeddings generated from states at x(t) and x(t 1), and a predictor network tries to generate the future embedding while minimizing the loss between the predicted and actual latent representations. VJEPA is great for general intelligence and world understanding tasks. However, when it comes to engineering problems, physics based systems, and scientific dynamics, it starts showing limitations. That is where the step doubling method proposed in this paper becomes interesting. The published paper has some novelty that are really interesting if you read it : - In VJEPA2 and similar architectures, uncertainty estimation typically requires calibration sets or conformal prediction techniques, whereas this paper introduces a purely internal, zero-calibration uncertainty estimate through step doubling. The intuition towards it is simple, - if the disagreement between the two prediction paths is low, the model trusts the fast neural prediction. - if the disagreement between the two paths is high, the system falls back to a classical solver. unlike VJEPA style systems... where prediction quality is only known after comparing against ground truth, this method attempts to estimate trustworthiness internally during deployment.
1
2
57
One of the major criticisms the paper raises is that self supervised dynamics learning can encourage a model to predict itself rather than the underlying system. - The easiest way for a network to remain consistent is to learn an identity mapping. Instead of learning the actual dynamics..it simply outputs something very close to the current state because that minimises error and preserves consistency. - The paper argues that training should remain supervised while the dynamics are being learned. Once the dynamics are learned, self consistency can then be used during deployment as a trust signal rather than as the primary learning objective. but while i was reading the paper not completely just skimming around i did identify some grey zones which i might not be correct about, but they are my first order thinking outcomes so sharing them here : - one thing that struck me was , what if the network learns a bad signal and both path 1 predicts the wrong state and path b predicts the same wrong state because the network is consistent in its errors. - The trust signal now says everything is fine. But the prediction is still completely wrong. here the whole system relies on the network being scale consistent, but it cannot guarantee physics consistent, it the networks learn a wrong dynamics that is internally consistent across time scales, the step doubling signal fails to detect it.
1
39
another limitation (im not sure about it as havnt read the whole paper yet ) in the paper it mentioned that if the trust signal is high , the system fallbacks in to a classical solver, most physics agnostic problems have chaotic regimes. the system also relies on the network learning the specific dynamics of the training systems (Oregonator, Euler, Balls). Applying this to a new physical system (e.g., a different fluid with turbulence not seen in training) requires retraining. Overall, I think this is a meaningful step forward in uncertainty estimation for learned dynamics. But it does not solve the deeper problem of guaranteeing physical correctness without ground truth. It is a "best effort" surrogate that requires careful monitoring and fallback mechanisms. would love to know more about the thinking process @Avg_sapient and @paraschopra
46
Priyam retweeted
And suddenly the world is now all about Recusive Learning systems.. Some are building Recursive Improvement Agents... some are using it for finding new information and stuffs.. and scaling it to startups. SEALs and RLS are like ~4 years old architectures now and recent RLMs are just an recusive layer over existing MHTs. I personally worked on the exact same field for more than a year or two. Self Improving Architecture for Scientific Discovery based on Neuro- Symbolic Agents. I build some architecture over RLS and Syllabus based learning over deepseek-math-7b abstract, quantised and reset its model weights and used the abstract learning dataset from @GoogleDeepMind and build an RLHF pipeline over it. Performance was on par with exisitinf Llama-7b and even gpt- o3 models but the issue was scaling, i built the architecture over energy landscapes and scaling EBMs is something still i struggle with alot coz im clueless how to do it. And the dynamics of the model sustains on my architecture coz of the whole energy simulation and covergence principles. These principles allowed generation of new data from exisiting input datas and hence a syllabus based learning. But i knew in prod this wont work out and it didnt, bigger prompts and context window was a problem and many times it too way longer to reply on simple queries than even poorer models. The outcome was simple: - Simple tasks are faster on single pass than recursive approach. - Works well on mathematical queries but struggled with language ( as a wrote a custom HMTT tokeniser ) - High Hallucination feeding issue, its like a vicious cycle if you dont have a proper Prediction Directing system. and the biggest being scaling, i tried it on Qwen30B but didnt workout, maybe it was an architectural problem but i didnt go ahead with it. The outcomes - wrote 5 papers on self learning engines. - RLS on prod is way far than today and RLS IS NOT CONTINUAL LEARNING. here's the opensource codebase: github.com/prigoistic/ananta…
1
2
3
148
Priyam retweeted
lately I have been spending a lot of time thinking about the brain not just in a philosophical way but in a mathematical way.. I have been diving deep into computational neuroscience trying to really validate the maths I have been building and grounding it in how biological memory systems actually behave. the more I study it the more I feel like spectral graph theory diffusion and iterative solvers are not just abstract math concepts they genuinely resemble how the brain stabilises forgets consolidates and compresses meaning... I wrote a long piece connecting spectral graph theory with advanced memory dynamics in the human brain. if you feel you fw this then give this blog a read and if you are a specialist in this field..do reach me out if you feel to discuss on this further. prigoistic.me/blog/spectral-… also check out @metacognitionai this is something we do here everyday.

1
2
4
243
Priyam retweeted
Hectic last few days ngl.. - Co authored our paper at UCB for Spatial models. - Submitted our first YC Application for the Summer Cohort - Submitted our paper on associative memory at NeurIPS - studied a bit on core mathematics behind SSLs. - was an OC at India's biggest Gen AI Hackathon. And much more ngl.
1
3
185
Priyam retweeted
transformers have been secretly running hopfield networks this whole time. every attention head = a hopfield energy minimization step. ramsauer et al. (2020) showed this and it's elegant. but there's a catch: standard hopfield attention treats every stored pattern as equally reachable. throw in noise or a dense memory bank, and the softmax spreads thin. wrong patterns bleed through. so we asked: what if patterns could talk to each other before retrieval? we built a kNN graph over the stored patterns, computed the graph laplacian, and pre-smoothed everything with heat diffusion before each association step..related patterns cluster and spurious neighbours fade. the result ? difflayers : consistently outperforms vanilla MHNs on noisy retrieval, with four diffusion modes scaling from O(kNd) to exact spectral kernels. drop-in replacement. same forward signature. just sharper memory. pip install difflayers github: github.com/metacoglabs/diffl…
1
2
60
Priyam retweeted
Ai Harnesses might be the next big thing for Enterprises tbh, and no harness doesnt only mean finding the best agent for your llm to perform tasks, its more about proper tool registry, models and context management cause there are too many variables in a pipeline which we maynot have control over. As @TejasKumar_ spoke about at @aiDotEngineer , the importance of Agentic Harnesses is visible, even the shittiest models can perform significant tasks with proper accuracy if the recursive loops are written proper and harnesses are developed on point.
1
2
3
125
Priyam retweeted
opus 4.8 just dropped and it might be one of the first cases where recursive learning systems has been implemented a bit, as per what i read along the articles by @AnthropicAI , claude can now recheck its thought and plan before even starting building, like a feedback loop thingy which allows it to recheck if what ever it thought is correct before going ahead with it. - dynamic workflows, during orchestration, claude actually deploys many smaller subagents to take up the task and then do it end to end. - native planning layer, it basically plays devil's advocate against its own logic to catch structural flaws and bugs before it even writes a single line of code, which feels way more like a senior engineer mapping out a blueprint. - built in orchestration, instead of developers needing complicated external frameworks to stitch agents together, the model handles the routing and micro pipelines natively which cuts down on latency and token burning. RLS have never been implemented on large scale coz of too many risks that it contains. Current opus 4.8 might somewhat implement it, but rather a completely autonomous learning system its more like one still under human feedback. i did work on recursive learning models last year on syllabus based learning .
1
1
1
82
And suddenly the world is now all about Recusive Learning systems.. Some are building Recursive Improvement Agents... some are using it for finding new information and stuffs.. and scaling it to startups. SEALs and RLS are like ~4 years old architectures now and recent RLMs are just an recusive layer over existing MHTs. I personally worked on the exact same field for more than a year or two. Self Improving Architecture for Scientific Discovery based on Neuro- Symbolic Agents. I build some architecture over RLS and Syllabus based learning over deepseek-math-7b abstract, quantised and reset its model weights and used the abstract learning dataset from @GoogleDeepMind and build an RLHF pipeline over it. Performance was on par with exisitinf Llama-7b and even gpt- o3 models but the issue was scaling, i built the architecture over energy landscapes and scaling EBMs is something still i struggle with alot coz im clueless how to do it. And the dynamics of the model sustains on my architecture coz of the whole energy simulation and covergence principles. These principles allowed generation of new data from exisiting input datas and hence a syllabus based learning. But i knew in prod this wont work out and it didnt, bigger prompts and context window was a problem and many times it too way longer to reply on simple queries than even poorer models. The outcome was simple: - Simple tasks are faster on single pass than recursive approach. - Works well on mathematical queries but struggled with language ( as a wrote a custom HMTT tokeniser ) - High Hallucination feeding issue, its like a vicious cycle if you dont have a proper Prediction Directing system. and the biggest being scaling, i tried it on Qwen30B but didnt workout, maybe it was an architectural problem but i didnt go ahead with it. The outcomes - wrote 5 papers on self learning engines. - RLS on prod is way far than today and RLS IS NOT CONTINUAL LEARNING. here's the opensource codebase: github.com/prigoistic/ananta…
1
2
3
148
Im actually eager to go back and try working on it now after knowing this is back as a hot topic.
1
62
opus 4.8 just dropped and it might be one of the first cases where recursive learning systems has been implemented a bit, as per what i read along the articles by @AnthropicAI , claude can now recheck its thought and plan before even starting building, like a feedback loop thingy which allows it to recheck if what ever it thought is correct before going ahead with it. - dynamic workflows, during orchestration, claude actually deploys many smaller subagents to take up the task and then do it end to end. - native planning layer, it basically plays devil's advocate against its own logic to catch structural flaws and bugs before it even writes a single line of code, which feels way more like a senior engineer mapping out a blueprint. - built in orchestration, instead of developers needing complicated external frameworks to stitch agents together, the model handles the routing and micro pipelines natively which cuts down on latency and token burning. RLS have never been implemented on large scale coz of too many risks that it contains. Current opus 4.8 might somewhat implement it, but rather a completely autonomous learning system its more like one still under human feedback. i did work on recursive learning models last year on syllabus based learning .
1
1
1
82