Xiangan He

Xiangan He

163 Photos and videos

Tweets

Xiangan He

@xBalbinus

These are 3 questions I ask before adding any new tool to my stack: - What permissions is it actually requesting, and do those make sense for what it does? - Does it touch any existing systems I rely on, and what happens to my data if that connection gets compromised? - Can I get the same result from a platform I'm already using, or do I actually need a new trust relationship here? Most tools get added the same way: something looks useful, you click through the permissions screen without reading it, and it's in your stack. That process is fine until it isn't, and "until it isn't" usually means a breach you didn't see coming. The diligence takes 10 minutes. The cleanup takes months. Stay safe out there.

187

Xiangan He

Xiangan He

@xBalbinus

Jun 10

A Vercel employee installed a third-party tool called Context AI for what was probably a routine engineering task. That one decision led to a highly sophisticated attacker compromising Vercel's Google Workspace, accessing environment variables across their infrastructure, and triggering one of the more alarming security postmortems I've read this year. The attacker used the access that employee had already granted the tool. That’s all it took. This is the part of the AI tooling conversation that almost nobody is having. We're in fast fashion land right now with apps. The volume of new tools being released is enormous, and the cultural expectation is to move fast, try things, and iterate. What that culture doesn't make room for is actually reading what you're granting access to when an app requests permissions to your Google Drive, your GitHub, or your email. Most users only have one question when they pick up a new tool: can it do what I need? Important question. But the question that matters more is what does this thing touch in my existing systems, and what happens if someone else gets to it before I do. That's the question the Vercel employee skipped. It's also one that the majority of founders skip every week.

292

Xiangan He

Xiangan He

@xBalbinus

Jun 9

Hiring a head of growth is one of the highest leverage moves I've made at XORS. I used to think distribution was something you figured out yourself. Grind the outreach, work the network, show up to things you already knew about. What I didn't account for is how much you don't know what you don't know. Our head of growth has made introductions I never would’ve made, gotten us into rooms I never would’ve found, and brought us to events that weren't even on my radar. Amazing doesn’t even begin to describe it. If you're around for ETH NYC or a16z tech week in New York, hit me up on X. Would love to connect with founders and builders in the space.

359

Xiangan He

Xiangan He

@xBalbinus

Jun 8

Would love to have folks here! Our first studio IRL event.

Eshita

@eshita

Jun 8

Hosting coffee for founders, operators, and builders in NYC this Thursday. DM if you'd like to join!

1,920

Xiangan He

Xiangan He

@xBalbinus

Jun 8

I recently saw this post from a founder that they were going to build an app every single day for a year. 365 apps. 365 days. Imagine saying something like that even just 12 months ago? Insane. But my first reaction wasn't "impressive." It was: how are you going to maintain 365 apps? We're in a weird era. Models are faster, vibe coding has lowered the bar to ship, and the culture is just: get it out, iterate in public, move fast. Because shipping is the easy part now, but maintaining is still hard. It still requires forethought, testing, documentation, and someone who actually cares when something breaks. The result is a growing graveyard of apps that technically launched but can't survive contact with a real user. Speed isn't the liability. Shipping without the intent to maintain is. PMF isn't found by shipping 365 things. It's found by shipping something worth maintaining. I’m not suggesting the shotgun approach doesn’t work. But the spread on your shotgun needs to be just a bit tighter. My opinion.

317

Xiangan He

Xiangan He

@xBalbinus

Jun 7

The engineers I respect most have surprisingly vanilla workflows. No elaborate AI orchestration or month-long sprint to find the perfect stack. Too much setup replaces raw skill with delegation. And delegating to a tool you haven't benchmarked is just organized guessing. My approach: try one thing, see if it beats the baseline, and take the next best step. It’s simple, iterative, and honest about what's working. The most sophisticated workflow isn't the best workflow. The one that actually makes you sharper is.

960

Xiangan He

Xiangan He

@xBalbinus

Jun 6

I don't have a driver's license. Most people find that surprising. Maybe even a downright bad life decision. But the way I think about it, you completely eliminate a class of problems by not opting into them in the first place. No car means no insurance, no maintenance, no parking tickets. Generally speaking, the more stuff you “own” the more it ends up owning you. That’s the way I see it, anyhow.

561

Xiangan He

Xiangan He

@xBalbinus

Jun 5

This might be the most underrated aspect of Claude. One of our clients started using it to iterate on product specs before handing them to us. They'd describe how their business operated, what data they needed, what the system should do, and let Claude help them structure it into something an engineering team could actually build from. By the time it landed with us, the spec was tighter than anything we'd seen from a non-technical team. How awesome is that? Domain knowledge is the variable Claude can't invent. When someone who deeply understands their operations and shares it with Claude, that’s incredibly helpful to us. At the end of the day, everything relies on the quality of our prompts. As obvious as that seems, it’s easy to forget.

280

Xiangan He

Xiangan He

@xBalbinus

Jun 4

4 dead giveaways of AI-generated code: - Dead/stale comments referencing old commits or bugs that no longer exist - API calls hard-coded to fetch exactly 100 results with no pagination logic - "God files" running 2,000 lines instead of separated components - Tests nowhere to be found: 3,000 lines of changes, barely any coverage Most engineers don't catch these at review because they look plausible at a glance, hard-coded round numbers feel like reasonable defaults, and missing tests feel like a time constraint someone meant to fix. But each one compounds. Stale comments get re-ingested by the next AI session and treated as ground truth. God files also become unmaintainable fast. That’s why the cleanup process of AI code is so important.

397

Xiangan He

Xiangan He

@xBalbinus

Jun 3

A lot of non-technical founders think Claude Code is what you get when you point the Claude API at a task. I’d disagree. Claude Code is a heavily engineered product, built on top of things the raw SDK doesn't give you: - Context compaction - Session maintenance - Hidden system prompts - A defined file path for session context that lives entirely outside the chat history None of that ships out of the box. Every piece of it was built deliberately by Anthropic's engineering team. The reason this matters is that I keep running into non-technical founders who look at what Claude Code can do and then ask why the AI feature in their own app doesn't perform at the same level. What they’re experiencing lives in the engineering layer around the model, not inside the model itself. This is one of the most underappreciated advantages of working with an actual engineering team. We can sit across from a founder, explain exactly why the behavior they're seeing is happening, and walk them toward a realistic path forward. A raw API integration can't do that.

381

Xiangan He

Xiangan He

@xBalbinus

Jun 2

This might seem harsh, but I’d argue otherwise. A founder came to me last month asking if they should bring on an offshore team to hit a 14-day ship deadline before their next funding round. My answer was no. The framing was: these developers are cheap, they're fast, and it's just to get something functional for the raise. But quality is king, and you get what you pay for. Speed without execution quality is a liability you're deferring. You pay for it right when it matters most: in a demo, in due diligence, the first week a real user touches it. The value of a dev shop is execution you can trust under pressure. If the execution isn't there, you don't actually have a timeline. You have a countdown.

752

Xiangan He

Xiangan He

@xBalbinus

Jun 1

Everyone evaluates the idea. Very few evaluate the people behind it. I get it. Ideas are easy to compare, and traction is easy to measure. You can put a number on MRR, but you can't easily put a number on judgment, or grit, or how someone operates when things go sideways. But here's what I've learned running XORS: a dev shop is a people company first. So is a product shop. The idea is just the vehicle, while the team is the engine. The best clients we've worked with wanted to understand how we think, communicate, and whether we'd tell them the truth when it wasn't what they wanted to hear. That's the stuff that actually determines outcomes. Bet on the people. The idea will evolve anyway.

292

Xiangan He

Xiangan He

@xBalbinus

Jun 1

Awesome product release!!

Ahmed Panju @ahmed_xors

May 30

Meant to post on LinkedIn all week. Then the week ended. Again. And Again. So I built Quillbird: tell it who to reach, it learns your voice, drafts posts infographics & auto publishes on your schedule. quillbird.io

110

Xiangan He

Xiangan He

@xBalbinus

May 30

Saw this and had to react… This is one of the most insane agent decisions I've seen posted publicly. Give an agent full access to a clone of production? Still kind of suspicious since production data might contain sensitive user information… but sometimes that's how you find out what it's actually capable of. Give an agent full access to actual production? That's a single bad tool call away from a company-ending incident. The lesson from "it made the agent 10x more useful" is that your sandbox wasn't realistic enough. Fix the sandbox. Don't remove the guardrails. x.com/snowmaker/status/20596…

Jared Friedman

@snowmaker

May 27

One night I quietly gave our AI agent full access to YC's production database. It made the agent 10x more useful. That's what convinced me that trust-by-default is the only way to get the most out of agents.

419

Xiangan He

Xiangan He

@xBalbinus

May 29

The cert industry is broken. It charges $1,749 to fail an exam, then $1,749 again to retake it. The "prep tools" you're supposed to buy on top of that are flashcard apps that haven't changed since 2010. That's the business model. Steep pricing, closed-source content, and zero adaptive tooling. We built Whetstone to fix it. It’s AI-native, Socratic, and scenario-based. It teaches you to think instead of pattern-match. We demoed it with engineers at Lazer Technologies and got 10 people to pass the Claude Code cert using it. Today we're open-sourcing the whole thing. teacher.up.railway.app/ If you've ever paid to retake an exam because the prep tool didn't actually teach you anything, this is for you. Grab some time on my calendar below if you want to get your org to pass the exam👇

542

Xiangan He

Xiangan He

@xBalbinus

May 29

calendly.com/xbalbinus/30min

Xiangan He

Xiangan He

@xBalbinus

May 29

The TanStack hack should scare you. It certainly scares me. On May 11th, 84 malicious versions were published across 42 TanStack packages, including tanstack/react-router, 12.7M weekly downloads. The attacker didn't brute force anything. They literally just chained three vulnerabilities to hijack TanStack's own release pipeline and publish malicious packages as TanStack itself, with Valid SLSA Level 3 provenance. Sigstore verified it as well because, technically, the pipeline was legitimate. The payload harvested AWS credentials, SSH keys, GitHub tokens, Kubernetes service accounts, and Claude Code session history. Then self-propagated to 170 packages across unrelated namespaces. Oh, and if you revoked the stolen token, a dead-man's switch ran rm -rf ~/ on your home directory. This is exactly why the XORS starter kit is intentionally lean. Next.js, Tailwind, ShadCN, Elysia, Biome, TypeScript. That's it. Every dependency you add is a new trust relationship. You're not just installing a package, you're inheriting the security posture of every maintainer, contributor, and workflow that touches it. Leaner dependency graph = smaller attack surface. The best security decision is often just asking yourself: do you actually need this?

360

Xiangan He

Xiangan He

@xBalbinus

May 26

Quick update on the CCA-F prep tool we built: Lazer Technologies (one of the top dev shops in the world) ran their team through it. 10 engineers took the exam. 10 passed. A new module dropped mid-prep called Agentic AI Tools that wasn't on any practice test. We patched it in, the team adjusted, and the results held. Also, I personally have taken the test and passed the exam (see screenshots in comments below).

0:14

499

Xiangan He

Xiangan He

@xBalbinus

May 26

101

Xiangan He

Xiangan He

@xBalbinus

May 22

Building the product is the easy part… Getting in front of people who might actually pay for it… that's where things get uncomfortable. And because it's uncomfortable, a lot of builders just don't do it. They keep polishing, adding features, and call it “focus.” But here's the thing: a decent product with real momentum will beat a perfect product that nobody knows about. So if you're building something right now and you haven't talked to the people who might use it, that's the bottleneck. Not the code.

362