Principal Architect - MWG, Head of Engineering - 🥑 Avocado Guild, theasianparent, Gumi Asia

Joined April 2008
One of the best action movies I’ve seen in a while.
Oh boy! Just went with a bunch of #ManofTommorrow cast & crew to see #TheFurious. I didn't think Kenji Tanagaki could outdo himself after the spectacular Walled In, but man The Furious really showed him as one of the best action filmmakers working. We all loved it!
OpenAI already has a Mythos-class model. GPT Pro.
GPT-5.6 creates an interesting situation. If it's weak, Anthropic strengthens its lead in the intelligence conversation. If it's strong, it runs into the same reality as Fable or Mythos.
Looks terrible for $500k
Couple spends over S$500,000 to renovate Hougang 5-room flat into 'luxury home' bit.ly/4xI6xDI
Finally got the new Siri activated. Complete 180 from old Siri. Some stuff that I've tested:
1. Open up Instagram, go through stories, and Siri can tell you about a story, its location, and what's in it.
2. Siri can read what's on your chess.com app and suggest moves.
3. It can make recommendations based on past communications in your Messages and Mail history.
4. As I'm going through PDFs, X posts, or anything on screen, Siri can read it and explain it.
5. Whatever you can do with ChatGPT works here.
6. Vibe-create shortcuts that you find important.
7. No issues with understanding what I say. Gets it right 99% of the time.
8. New workflow: take notes on what you read on-screen, with an explainer.
Integration with other apps such as Teams, WhatsApp, and Outlook doesn't seem to be enabled yet. Possibly it will arrive in a later beta, or the third-party developers need to update their apps to conform to the new Siri.
Alvin De Cruz retweeted
u should actually feel lucky china isn't winning the frontier AI race, else we all would've had to put up with routine authoritarian practices such as topic-based output censorship, government-enforced access restrictions, and intelligence centralization in the hands of a few.
Bigger man than me
SIA air stewardess, 35, becomes fishmonger to turn around failing seafood business ex-boyfriend left behind bit.ly/4xm9aux
Alvin De Cruz retweeted
Paper: arxiv.org/abs/2510.08338
I wrote "90% accuracy." That's not quite right. The real number is correlation attainment: the AI panel hits ~90% of human test-retest reliability, i.e. ~90% of the way to how consistent real people are with their own answers on a retest. Different claim, and honestly a more impressive one.
Two more things I should've been precise about:
- it measures survey purchase intent
- it only works best in categories the model already knows well
PyMC Labs Colgate-Palmolive, 2025
This isn’t a serious company.
NEW: Anthropic is walking back Claude Fable 5's policy to covertly degrade performance for competing AI researchers, after facing fierce backlash. “We’re changing Fable 5’s safeguards for frontier LLM development to make them visible,” Anthropic tells WIRED. “We made the wrong tradeoff and we apologize for not getting the balance right.”
Apple doing things with Siri AI that not many can do now.
The trick is to start small and use as many deterministic processes as you can to save on token usage. E.g., do you really need an AI to scan through your whole JIRA board, or is a simple JQL query that pulls the relevant tickets based on your criteria good enough, and faster too? Is Opus really needed for this when Sonnet suffices for such a task? Do heartbeats even need Sonnet, or can Haiku suffice when you’re only making API queries that respond with status flags?
Here’s your monthly reminder that you shouldn’t be prompting coding agents anymore. You should be designing loops that prompt your agents.
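A minimal sketch of the "deterministic first, cheapest model second" idea from the reply above, assuming a loop that classifies each task before any model is called. The `Task` type, the task kinds, the `build_jql` and `pick_tier` helpers, and the example project key are all hypothetical, and the tier strings are just labels borrowed from the post, not API model identifiers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    kind: str      # hypothetical categories: "heartbeat", "triage", anything else
    payload: str   # raw input for the task, e.g. a status flag or a ticket body


def build_jql(project: str, statuses: List[str], max_age_days: int) -> str:
    """Build a JQL filter in plain code instead of asking a model to scan the board."""
    quoted = ", ".join(f'"{s}"' for s in statuses)
    return (
        f'project = "{project}" AND status in ({quoted}) '
        f"AND updated >= -{max_age_days}d ORDER BY updated DESC"
    )


def pick_tier(task: Task) -> Optional[str]:
    """Return the cheapest tier assumed adequate, or None when no model is needed."""
    if task.kind == "heartbeat":
        # A healthy status flag can be handled with a string comparison alone.
        if task.payload.strip().upper() in {"OK", "UP", "HEALTHY"}:
            return None
        return "haiku"
    if task.kind == "triage":
        return "sonnet"
    # Anything unclassified falls through to the most capable tier.
    return "opus"


if __name__ == "__main__":
    print(build_jql("OPS", ["Open", "In Progress"], 7))
    for task in (Task("heartbeat", "ok"), Task("heartbeat", "DEGRADED"), Task("triage", "login fails")):
        print(task.kind, "->", pick_tier(task))
```

The routing rules here are deliberately crude; the point is only that the decision about whether to spend tokens, and on which tier, is made by ordinary code before the agent loop prompts anything.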
I, too, never understood why people still claim that AI models can’t code.
I have said this before, but to those of us using AI systems to get lots of work done reliably and quickly, the people who post online about how AIs still hallucinate constantly, about how they can’t write code, etc., seem equivalent to people trying to convince you that the car you drive to work every day doesn’t exist. You tell them things like “but I drive a car. I paid money for it. I buy gasoline for it. I could not possibly be working twenty miles away from home if I didn’t have the car?” and they reply that you are imagining having a car, or that you’re lying because you work for a car company. It is as though these people live in a completely different reality.
The next round of layoffs is coming up, isn’t it?
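Setting aside my experience of getting burned on the metaverse a few years ago, $META is the only US internet company stock I won't touch. Compared with Google, I think this stock is seriously overvalued. After watching this video, I think I'm going to short it.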
On the other end of the spectrum, raising grant money in a certain country (not SG) in Southeast Asia always involved paying a consultant a certain sum of money and giving some cash to a few members of the decision-making committee just to push things your way.
I was once pitching in a board room at a top 3 VC firm for a $15M Series A. 12 people in the meeting. One of the GPs fully fell asleep. Out cold for 30 minutes. Nobody acknowledged it. Everyone just kept going. I kept presenting my Series A slides to an unconscious man in a Herman Miller chair and somehow that was considered normal. That's venture capital. You might fly across the country to perform for people who may or may not be conscious. It's a dance. And sometimes you lead and sometimes you follow and sometimes your partner is unconscious. If you're raising right now, just know: every founder has a story like this. The process is weird. The power dynamic is weird. You're not crazy for thinking it's weird. No one talks about it because they want to continue raising. But I'm happy to stick my neck out there. It is weird.
Just started running this with a workflow that checks logs across all my AWS services for anomalies and critical errors, matches them against their respective codebases and database records, and writes tickets to address them along the way in a separate workflow.
Alvin De Cruz retweeted
Jun 2
Workflows are the biggest upgrade to Claude Code’s capabilities since skills and subagents. I dove deep into it with @sidbid to figure out best practices, examples and more.  I’m particularly excited about the non-technical tasks it enables for Claude Code.
Alvin De Cruz retweeted
been working on Harness, a macOS terminal. built it all from the metal up. no libghostty, no tmux, no cmux - all one custom build. It’s a terminal built for speed and agentic workflows.
Alvin De Cruz retweeted
Several sleepless nights later, M3 is finally here. Coding frontier. 1M context. Native multimodal input. The first open-weights model to bring all three together. Hope you all like it 😎 p.s. M3 has already submitted a lot of PRs into MiniMax Code.
Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities
- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
- Natively Multimodal from Step Zero
API: platform.minimax.io
Token Plan: platform.minimax.io/subscrib…
🚀New! MiniMax Code: code.minimax.io
Weights & Tech Report in ~10 Days
Alvin De Cruz retweeted
May 31
beyond what ai tools can generate for us, i've been thinking a lot about the limitations of the interfaces we're using. today we flatten instructions, project context, and references into the same chat box, hoping the model figures out what to do with each one. what if you could explicitly organize and rank them instead?