Data Scientist, Maryland State Delegate District 9A (Howard and Montgomery Counties). By Authority of Friends to Elect Chao Wu, Treasurer: Xia Chen.

Joined June 2015
878 Photos and videos
Looking forward
Delegate Chao Wu retweeted
Make a plan to vote today: abigailspanberger.com/vote Then get your friends, family members, neighbors and coworkers to make a plan to vote, too. Because if we do, we will elect @SpanbergerForVA as your next governor and put Virginia on the path to a brighter future.
GPT, not going down at all.
We need to be really careful about a few tech companies using government power to become monopolies, then using those monopolies to grab more money and power. It is destroying the country and its people.
Delegate Chao Wu retweeted
12 Oct 2025
The Art of #Statistics — How to Learn from Data: amzn.to/48c8xpG
Adding a physical-layer boundary (framework) and a Kalman filter to describe the state-space transitions will shrink the search space by several orders of magnitude and make VLM systems more robust. Sharing thoughts based on my control theory background.
So the key concern is: Using large language models to initialize vision-language(-action) models is a tempting trap: it lets us appear to make progress without truly achieving it. Most benchmarks have overwhelmingly focused on reasoning and digital domains, without fundamentally addressing perception, especially mid- and low-level vision. (Credit: Partly inspired by separate conversations with @xiangyue96 and @YutongBAI1002)

As humans, we clearly exhibit pre-linguistic roots in our intuitive physical and psychological understanding, e.g., basic principles like solidity, continuity, and gravity.

After we built GroundHog (arxiv.org/abs/2402.16846) in 2024, I took a moment to reflect on the core issues with VLMs. I can no longer convince myself that simply stacking CLIP and DINO with a few projection layers is the ultimate solution to "tokenize" vision. Vision–language models need a much stronger vision foundation, perhaps a fundamental restart from a vision-centric perspective. That’s why I stepped away from VLM development for a year to explore alternatives.

A paper @TairanHe99 shared in this thread (led by the brilliant @TongPetersb) was especially thought-provoking. But to truly start over, I began looking into 3D foundation models and video diffusion models, setting aside, for now, the possibility of joint vision–language diffusion models. This led me to take the risk of developing 4D-LRM (arxiv.org/abs/2506.18890), aiming to learn 4D priors at scale with absolutely no language prior.

This is only a first step. At some point, I plan to return to VLM engineering. But next time, I hope I have resources to start with a world model first and then unlock the language component on top of it.
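For readers unfamiliar with the Kalman filter mentioned in the comment above, here is a minimal sketch of the idea in Python. It is not taken from any VLM system: the scalar random-walk state model, the noise variances `q` and `r`, and the `bounds` clamp (a crude stand-in for a "physical boundary") are all assumptions chosen for illustration.

```python
def kalman_1d(measurements, q=1e-3, r=0.25, x0=0.0, p0=1.0, bounds=None):
    """Scalar Kalman filter with a random-walk state model.

    q and r are assumed process and measurement noise variances.
    bounds is an optional (lo, hi) pair used to clamp the estimate;
    the clamp is an illustrative "physical boundary", not part of
    the standard Kalman filter.
    """
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is assumed to persist; uncertainty grows by q.
        p = p + q
        # Update: blend prediction and measurement using the Kalman gain.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        # Optional physical constraint on the estimate.
        if bounds is not None:
            lo, hi = bounds
            x = min(max(x, lo), hi)
        estimates.append(x)
    return estimates


# Hypothetical noisy readings of a quantity whose true value is near 1.0.
readings = [1.3, 0.8, 1.1, 0.9, 1.2, 0.7, 1.0, 1.05]
smoothed = kalman_1d(readings, bounds=(0.0, 2.0))
```

The predict and update steps are the state-space transition and measurement correction; the clamp shows, in the simplest possible form, how a known physical limit can rule out estimates before any learned model has to consider them.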
In our doctor’s office. Beauty is everywhere.
Delegate Chao Wu retweeted
The AI Agent Staircase
Happy First School Day.
Beautiful summer end
The big eyes
The AI world is bifurcating and converging. Now Gemini will not generate videos containing “Donald Trump” and MiniMax will not generate videos containing “Xi Jinping”.
This is crazy.
Speed cameras on the I-83 Jones Falls Expressway have issued more than $18.5 million in fines in the past three years, but about 80% of the revenue has gone to the camera vendor, Verra Mobility — not the city, according to the Baltimore City Department of Finance. bit.ly/3Tu4jVu
GPT.
Delegate Chao Wu retweeted
High-value skills in 2025 (and beyond).
Prayers.
Today, we’re holding Minnesota State Rep. Hortman, Minnesota State Sen. Hoffman, and their loved ones in our thoughts. We stand in solidarity with our colleagues in Minnesota—and remain committed to rejecting all forms of hate.
Delegate Chao Wu retweeted
Public Speaking Secrets
Enjoy the flowers and the spring.
GPT