I’ve always wanted to write an open-notebook research blog to (i) show the chain of thought behind how we formed hypotheses, designed experiments, and articulated findings, and (ii) lay out all the intermediate results that did not make it into the final paper, including negative ones that we believe others will find interesting.
mars-tin.github.io/blogs/pos…
So @fredahshi and I wrote about grounding, a topic that sparks a lot of debate, both in VLM engineering and in linguistics and philosophy. Our latest work shows, from a mechanistic interpretability perspective, that symbol grounding can naturally emerge from mid-layer aggregate heads, without requiring fine-grained supervision or any special architectural inductive bias.
Feel free to check it out. We tried to find a balance between open-notebook transparency and readability. It looks best on desktop since I gave up on engineering that CSS file.😅