@cursor_ai, created same.energy, started @SupermavenAI and @Tabnine, formerly @OpenAI

Joined June 2020
55 Photos and videos
Pinned Tweet
16 Apr 2024
Why we won't be replaced by AI There’s been a lot of talk about AIs replacing programmers recently. The National Post: “AI is coming after the tech bros and their easy money”. Emad Mostaque, CEO of Stability AI: “There Will Be No Programmers in Five Years”. Some people are changing their career decisions as a result. Investors see the potential too: companies have raised hundreds of millions of dollars promising to replace human developers. Given the salaries that skilled developers can earn, there’s a lot of money in finding a way to cut us out of the picture. But it’s not going to happen. We’re not going to be replaced by the machines. Now, I’m not going to fall into the trap of past AI detractors by predicting deep learning is going to “hit a wall” and being proven wrong 6 months later. I think we’ll continue to see progress as AI systems get more intelligent and new applications are discovered, and I think AI will create a lot of economic value. (That’s why I’m working on an AI company.) And a lot of arguments for why humans will stay relevant boil down to “AI doesn’t work today, so it won’t work in the future”. That’s not reassuring because the whole reason people are worried about this is the rapid rate of progress. But the predictions that this progress will lead to humans being replaced are incorrect - here’s why. An AI can only learn a task when we have a way for the AI to perform the task and then be judged on whether it did the task correctly. The two main tasks that are currently used to train AI are: 1. Next-word prediction: the AI is given a document of text from the internet and asked to predict the next word in the document. 2. Short response generation (RLHF): the AI is given a query or prompt (eg. a user asking for coding help) and asked to generate a response (eg. a solution to the user’s problem). The response is scored by humans on its helpfulness and accuracy. Training AIs on these tasks has led to extremely useful products like ChatGPT, and ChatGPT is great at helping developers in day-to-day work, so it’s reasonable that people think systems like ChatGPT might soon eclipse us. But software development is not just about writing individual functions and short responses. A key part of software development is making technical decisions, like software architecture or project prioritization, that will pay off over the long-term on the scale of 6 months to 5 years. To train AI to do this, we would need a way for it to make technical decisions and a way to judge those decisions for their long-term soundness. We don’t know how to do that, and we especially don’t know how to do that at the scale required to train a large AI model. By definition these sorts of decisions are hard to judge, with no objective criteria for success, only difficult tradeoffs. The performance of AI is always going to be limited by the tasks where it can have a feedback signal, and this makes long-term planning difficult for AI because there is no naturally occurring training data (unlike next-word prediction) and there is no way to quickly assess whether a decision was good or bad (unlike short response generation/RLHF). Training a good next-word prediction model requires a loop of hundreds of thousands of training steps, which is possible because next-word prediction is quick to evaluate, but running the same number of training steps on decisions where you have to wait 1 year to see if the decision was good or bad would take hundreds of thousands of years. We humans are good at this because our minds have been shaped by millions of years of evolution, but AI doesn’t have that time. Some people think we can avoid difficulties like this by having the AI judge its own responses. While this can work to bring a weaker model up to the level of a stronger model, it can’t work to have a strong model improve itself, because if it worked it would imply the ability for a model to continually self-improve in isolation without interacting with the external world. For learning to occur, there has to be a feedback signal coming from external data. The future of AI is continued progress on tasks where we have a good way to train AI to solve them, and continued lack of progress on the tasks where we don’t. Instead of worrying about AI replacing software developers, the right move is to integrate AI tools into your workflow and use AI for what it’s good at while reserving high-level, long-term decisions to be decided by humans, not AI.
32
28
205
45,799
Jacob Jackson retweeted
C2.5 is the same pretrain as C2, but powered by a much better and stronger midtrain (nearly an OOM more FLOPS)! The base model matters a ton for RL, so we're very excited for the power of Colossus 2 to push this way further
Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.
8
3
199
12,318
Our new model is out. It stacks up nicely against the frontier!
Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.
4
105
4,388
Jacob Jackson retweeted
We’re introducing Cursor 3. It is simpler, more powerful, and built for a world where all code is written by agents, while keeping the depth of a development environment.
735
907
9,357
2,589,126
Excited to share our work on training Composer with on-policy implicit feedback!
Earlier this week, we published our technical report on Composer 2. We're sharing additional research on how we train new checkpoints. With real-time RL, we can ship improved versions of the model every five hours.
5
2
77
9,337
Composer 2 technical report - excited to share details about how we trained the model!
Replying to @cursor_ai
Read the full report: cursor.com/resources/Compose…
2
1
61
5,241
Jacob Jackson retweeted
We're releasing a technical report describing how Composer 2 was trained.
169
478
5,064
1,262,780
Jacob Jackson retweeted
This is my first post on the Cursor blog: an interactive survey on the state of the art for n-gram indexes.
Cursor can now search millions of files and find results in milliseconds. This dramatically speeds up how fast agents complete tasks. We're sharing how we built Instant Grep, including the algorithms and tradeoffs behind the design.
18
19
320
34,636
Composer 2 is frontier-level on coding benchmarks. You should try it!
Composer 2 is now available in Cursor.
2
28
1,950
Jacob Jackson retweeted
It was kind of amazing how many RL challenges in this run were bootstrapped by earlier Composers. Interesting times.
Composer 2 is now available in Cursor.
5
6
127
9,367
Jacob Jackson retweeted
a ton of work went into building composer 2 and it's a good model! try it and let us know what you think - x.com/cursor_ai/status/20346…

Composer 2 is now available in Cursor.
3
2
35
1,891
Jacob Jackson retweeted
a lot went into this model. it was fun! i hope people enjoy it.
Composer 2 is now available in Cursor.
24
11
244
21,227
Jacob Jackson retweeted
Very excited for the world to try this model! People like it a lot internally at Cursor - feels frontier-level smart and extremely fast
Composer 2 is now available in Cursor.
6
5
101
6,322
Jacob Jackson retweeted
New post: how we do evals at @cursor_ai. Takeaways: 1. Online metrics from real Cursor requests provide construct validity 2. CursorBench: a dynamic offline suite distilled from online learnings 3. Multi-axes evals -- correctness, efficiency, agent interaction behavior
We're sharing a new method for scoring models on agentic coding tasks. Here's how models in Cursor compare on intelligence and efficiency:
5
18
147
39,256
Jacob Jackson retweeted
After adopting Cursor, businesses merge ~40% more PRs each week. New economics research from the University of Chicago.
Who uses AI agents? How do agents impact output? How might agents change work patterns? New working paper studies usage impacts of coding agents (1/n)
48
65
921
225,903
Jacob Jackson retweeted
5 Nov 2025
Semantic search improves our agent's accuracy across all frontier models, especially in large codebases where grep alone falls short. Learn more about our results and how we trained an embedding model for retrieving code.
73
105
1,486
880,895
Jacob Jackson retweeted
I joined @cursor_ai a few months ago to help build a fast model for agentic coding. Very excited the first version has shipped and can't wait to hear what you all think! x.com/cursor_ai/status/19835…

29 Oct 2025
Introducing Cursor 2.0. Our first coding model and the best way to code with agents.
12
9
213
36,201
Jacob Jackson retweeted
Excited to release our first model. We've been working on it for a while, and it came out of the oven pretty well! I've been enjoying daily driving it and I think you might too.
29 Oct 2025
Introducing Cursor 2.0. Our first coding model and the best way to code with agents.
1
1
35
3,096
Jacob Jackson retweeted
29 Oct 2025
Some exciting new to share - I joined Cursor! We just shipped a model 🐆 It's really good - try it out! cursor.com/blog/composer I left OpenAI after 3 years there and moved to Cursor a few weeks ago. After working on RL for my whole career, it was incredible to see RL come alive and be deployed in a product that millions of people use everyday in ChatGPT, and then honestly kind of surreal to see it surpass me at the types of tasks I regarded as markers of intelligence in o1/o3. To me, the next frontier is to bring the whole world of economically useful tasks in-distribution for RL At Cursor, we’re building a team focused on RL foundations and agentic coding - reach out if you’re interested in working with us!
29 Oct 2025
Introducing Cursor 2.0. Our first coding model and the best way to code with agents.
48
18
795
219,878
Jacob Jackson retweeted
It has been a joy working on composer with the team and watching all the pieces come together over the past few months I hope people find the model useful
29 Oct 2025
Composer is a new model we built at Cursor. We used RL to train a big MoE model to be really good at real-world coding, and also very fast. cursor.com/blog/composer Excited for the potential of building specialized models to help in critical domains.
6
2
131
20,701
Jacob Jackson retweeted
29 Oct 2025
We did a thing! cursor.com/blog/composer
29 Oct 2025
Introducing Cursor 2.0. Our first coding model and the best way to code with agents.
2
3
116
16,347