piaoyang

Tweets

piaoyang @pycui64

Jun 9

who would have thought we'd get a Sophon (an artificial particle designed to block human technological advancement, in the Three-Body Problem series) earlier than the Trisolarans

NomoreID

@Hangsiin

Jun 9

When Fable 5 is used for frontier LLM development, it does not notify the user and instead limits the model’s capabilities through methods such as prompt modification, steering vectors, and PEFT. Anthropic estimated that this would affect approximately 0.03% of traffic.

piaoyang retweeted

Kevin Hou

@kevinhou22

Jun 2

Why did we split the IDE from the Agent Manager? We launched Antigravity 2.0 at Google I/O a couple weeks ago. We made some bold choices and I wanted to share my thoughts...

[1/4] Adapt or Die

I’ve been working in the AI dev tool space for the last three years. Three years is an eternity in this space and I can confidently say that I’ve seen tools, startups, and workflows come and go. Many of the features our team helped pioneer such as autocomplete and RAG-based chat have fizzled away.

Every few months there is a paradigm shift that redefines the AI developer tool landscape. Miss it, and you slowly fade into the backdrop. Time it well and you stay in the game. Invent it and you get catapulted into the spotlight.

It’s a fast-paced, unforgiving industry. I love it. The team loves it.

What do I mean by paradigm shifts? In the last three years, we’ve gone from:

Autocomplete → Chat → Agents → Multi-Agents

Each phase forced us to redefine ourselves: Codeium → Windsurf → Antigravity.

The reason why I tell you this is because as painful as it is, we have to continuously evolve our products. The model and the product are increasingly synergistic. The product is only as good as the model and the model gets better with the product. AI technology moves too quickly to keep the status quo.

For example, we deleted chat in Windsurf when we launched in November 2024 (x [dot] com/kevinhou22/status/1892246469303820745). It was a controversial move at the time since users only knew about chat and had never used an agent. But we trusted that the “future is agentic” and I’m glad we did not back down.

My goal, and our team’s goal, is to give our users the best dev tool on the market to let them build beyond their wildest dreams. To give our users the best product, the product must evolve.

[2/4] Current Shift: Agents → Multi-Agents

At Windsurf, we introduced the first agentic IDE. That was in November 2024. We’ve come a long way since then. Models have gotten stronger, users have gotten more tenacious, and expectations are higher than ever.

It’s no longer enough to run just one agent. You are now an orchestrator of many agents: multiple conversations or agents spawning their own subagents. To bring this multi-agent paradigm to our users at Antigravity, we introduced Antigravity 1.0.

Highly capable models made it possible for agents to do more complex work and work for longer. The benefit of working at a lab is seeing around the corner and molding the model to fit the product. We worked with the Gemini team for months before we felt that Antigravity was ready for prime time.

On November 18, 2025, we released Antigravity 1.0 alongside Gemini 3 Pro. It had two surfaces:

1) AGY IDE
2) Agent Manager

Two surfaces, bundled into one app. Some users loved both. Some only used one. Some just needed Gemini to get better.

[3/4] Now: Better Models Doubling Down

Over the last 6 months, we’ve been working with the research team to help improve Gemini’s coding and agentic capabilities. We’ve also been dramatically improving our internal version of Antigravity, and as Sundar announced (youtube [dot] com/live/wYSncx9zLIU?si=8nXDk_WhvoCvx12k&t=1504), we're processing over 3 trillion tokens a day.

Additional breakthroughs in model research enabled Gemini 3.5 Flash, a lightweight model at the Pareto frontier (performance vs. cost & speed).

Based on internal usage, we knew there were two camps: IDE and Agent Manager (uppercase).

Antigravity IDE is a great product. Good because it’s familiar. Good because you can manually edit your code. Great because of its agent.

We started exporting our agent to other surfaces to build an ecosystem of tools backed by the same powerful Antigravity agent: IDE, Agent Manager, SDK, CLI. That ecosystem is what we announced on stage last week at Google I/O.

But Agent Manager was pulling away. Agent Manager usage was increasing in dramatic fashion and we knew we wanted to double down on this experience. Both technical and non-technical folks are relying on Antigravity every day to not only write code, but also write docs, do competitive research, design prototypes, conduct user simulations, summarize 1:1s, learn new concepts, file expenses, the list goes on and on…

So we made a bold choice to split the two surfaces into two apps. We now have two applications:

1) Antigravity: unapologetically agent-first
2) Antigravity IDE: the editor you know and love

You can choose which product you want to use. You can switch between the products. Same agent, different surfaces.

Now, I’ll admit we botched some details about the migration and we’re working day and night to make it right. We have plenty of exciting features in the pipeline and I’m excited for you all to get your hands on them.

[4/4] Competition = Users Will Win

Times are changing rapidly. AI dev tools are quickly becoming agent-first. Users are managing tens if not hundreds of agents. Antigravity 2.0 is our way of giving this power to our users.

You can take a look at the competitive landscape and see this for yourself.

IDEs trying to evolve to a world where work is the product and not the code files.
Labs releasing agent-first products, exploiting the model <> product synergies.
Users spending more and more time and tokens in agent managers (lowercase).

Antigravity Agent Manager (uppercase) might’ve been first, but ideas are cheap these days. The leader will not necessarily be the winner, and products and models will be constantly evolving to meet growing expectations. Ultimately, users will win.

I want Google Antigravity to be the place where developers build. I also want Antigravity to be the place for knowledge work to get done. Code is becoming an implementation detail. People from all walks of life are using, and will use, code to solve their problems, without necessarily knowing they are “coding”.

Antigravity is at the frontier and will continue to innovate for you. Thanks for helping shape the future of the product. Keep the feedback coming. More soon.

piaoyang @pycui64

May 19

Try Gemini 3.5 Flash, it's a good model. It has also been super fun helping make the OS kernel and (free)Doom demo. agents ftw 🚀

Google DeepMind

@GoogleDeepMind

May 19

Introducing Gemini 3.5: our newest family of models combining frontier intelligence with real-world action. The first release is 3.5 Flash, our strongest model yet for agents and coding 🧵

piaoyang @pycui64

Mar 10

it has been 10 years since the legendary alphago match. it may sound like hindsight but one should be able to foresee what we have today back then knowing the power of deep learning. and remember we didn't even have transformer then. onwards for the next 10 years

piaoyang retweeted

SpaceX

@SpaceX

Feb 2

SpaceX has acquired xAI, forming one of the most ambitious, vertically integrated innovation engines on (and off) Earth → spacex.com/updates#xai-joins…

piaoyang @pycui64

19 Dec 2025

github.com/DGoettlich/histor… great that someone made this happen!

GitHub - DGoettlich/history-llms: Information hub for our project training the largest possible...

Information hub for our project training the largest possible historical LLMs. - DGoettlich/history-llms

github.com

piaoyang @pycui64

5 May 2024

We are accustomed to the fact that LLMs know the modern world, and in particular the technology behind themselves (transformers, etc). ChatGPT can casually explain to you how it works. However, a model's knowledge doesn't have to align with its own existence. We could totally imagine training an LLM on pre-deep-learning, pre-computer, or pre-industrial-revolution data, while it is still highly intelligent. You could thus simulate an ancient human and see how it perceives and thinks about the modern world, and how it is puzzled by its own existence. Seems fun!

piaoyang retweeted

xAI

@xai

17 Nov 2025

Introducing Grok 4.1, a frontier model that sets a new standard for conversational intelligence, emotional understanding, and real-world helpfulness. Grok 4.1 is available for free on grok.com, grok.x.com and our mobile apps. x.ai/news/grok-4-1

Grok

Grok is an AI assistant built by xAI. Chat, create images, write code, and get real-time answers from the web and X.

grok.com

piaoyang @pycui64

20 Sep 2025

fast and furious

xAI

@xai

19 Sep 2025

Introducing Grok 4 Fast, a multimodal reasoning model with a 2M context window that sets a new standard for cost-efficient intelligence. Available for free on grok.com, grok.x.com, iOS and Android apps, and OpenRouter. x.ai/news/grok-4-fast

piaoyang @pycui64

19 Sep 2025

x.ai/news/grok-4-fast Best search model in the world! Super proud of the team's achievement

Grok 4 Fast

Pushing the Frontier of Cost-Efficient Intelligence

x.ai

piaoyang @pycui64

28 Aug 2025

try Grok Code Fast 1 -- "it's a good model"™

xAI

@xai

28 Aug 2025

Introducing Grok Code Fast 1, a speedy and economical reasoning model that excels at agentic coding. Now available for free on GitHub Copilot, Cursor, Cline, Kilo Code, Roo Code, opencode, and Windsurf. x.ai/news/grok-code-fast-1

piaoyang @pycui64

13 Aug 2025

has anyone in gdm tried starting genie with a photo of a computer running genie and see if you can control the computer to play genie inside genie?

piaoyang @pycui64

21 Jul 2025

Congrats to all the contestants of IMO 2025, and special kudos to contestants who got scores of 36 or higher. This may be another "Lee Sedol moment": these talented students may be the last humans to ever beat a computer in a major math competition.

piaoyang retweeted

OpenRouter

@OpenRouter

15 Jul 2025

Grok 4 and Kimi K2 competing on top of the Trending models charts

piaoyang @pycui64

12 Jul 2025

The scaling continues

Toby Pohlen

@TobyPhln

11 Jul 2025

Our official Grok 4 blog post x.ai/news/grok-4

piaoyang @pycui64

10 Jul 2025

About 10 yrs ago when I first joined Google, I thought about how common it is for multiple companies to build the same stuff over and over individually (be it ads bidding, recommendation systems, or distributed data pipelines). Surely it's good for competition, but it still feels a bit wasteful, especially as we programmers have all been told not to "reinvent the wheel". It's only after I joined @xai that I realized this also applies to engineering skills and experience. Why do so many engineers have to learn the same raw skills and project experience, when we can use the knowledge and compute to train a single, distributable being to excel at it? Isn't it a bigger waste? (it's fun to learn personally though, I admit) so yes, I'm very much looking forward to the upcoming code model, soon :) the future may be uncertain, but it's surely exciting 🤘

piaoyang retweeted

ARC Prize

@arcprize

10 Jul 2025

Grok 4 (Thinking) achieves new SOTA on ARC-AGI-2 with 15.9% This nearly doubles the previous commercial SOTA and tops the current Kaggle competition SOTA

piaoyang @pycui64

10 Jul 2025

let's go 🚀

xAI

@xai

10 Jul 2025

The Grok 4 livestream will begin soon. Stay tuned.

piaoyang retweeted

Elon Musk

@elonmusk

7 Jul 2025

Grok 4 release livestream on Wednesday at 8pm PT @xAI

piaoyang retweeted

Qian Huang

@qhwang3

23 Feb 2025

It’s been quite an unbelievable ride since I paused my PhD at Stanford and joined @xai almost a year ago. The journey (building the tool use/agent stack from scratch to demoing DeepSearch as a little research project to converting it into a product launched to millions of people) has been incredible, and it couldn’t have been done without the “real engineer” @pycui64 and many many others. I believe this opportunity offered by xAI was impossible anywhere else and I am very grateful. I hope DeepSearch is helping you like it’s been helping me: finding out the best way to watch lava in Hawaii tonight, figuring out what X thinks about DeepSearch, settling a random argument with @ericzelikman 😛 We are quickly improving it in all facets, so please share your experiences and feedback with us! And join us if you want to make agents useful!
